ADAPTIVE FRAUD DETECTION
Tom Fawcett and Foster Provost
Presented by: Eric DeWind
Outline
■ Problem Description
  – Cellular cloning fraud problem
  – Why it is important
  – Current strategies
■ Construction of the Fraud Detector
  – Framework
  – Rule learning, monitor construction, evidence combination
■ Experiments and Evaluation
  – Data used in this study
  – Data preprocessing
  – Comparative results
■ Conclusion
■ Exam Questions
The Problem
■ How to detect suspicious changes in user behavior to identify and prevent cellular fraud
  – Non-legitimate users ("bandits") gain illicit access to the account of a legitimate user (the "victim")
■ The solution is useful in other contexts
  – Identifying and preventing credit card fraud, toll fraud, and computer intrusion
Cellular Fraud – Cloning
■ Cloning fraud
  – A kind of superimposition ("parasitic") fraud: fraudulent usage is superimposed upon (added to) the legitimate usage of an account
  – Causes inconvenience to customers and great expense to cellular service providers
Cellular Communications and Cloning Fraud
■ Mobile Identification Number (MIN) and Electronic Serial Number (ESN)
  – Together identify a specific account
  – Periodically transmitted unencrypted whenever the phone is on
■ Bandits capture the MIN and ESN to fake a customer's account
  – A bandit can then make virtually unlimited, untraceable calls at someone else's expense
Interest in Reducing Cloning Fraud
■ Fraud is detrimental in several ways:
  – Fraudulent usage congests cell sites
  – Fraud incurs land-line usage charges
  – The crediting process is costly to the carrier and inconvenient to the customer
Strategies for Dealing with Cloning Fraud
■ Pre-call methods
  – Identify and block fraudulent calls as they are made
  – Validate the phone or its user when a call is placed
■ Post-call methods
  – Identify fraud that has already occurred on an account so that further fraudulent usage can be blocked
  – Periodically analyze call data on each account to determine whether fraud has occurred
Pre-call Methods
■ Personal Identification Number (PIN)
  – PIN cracking is possible with more sophisticated equipment
■ RF Fingerprinting
  – Identifies phones by their unique transmission characteristics
■ Authentication
  – A reliable and secure private-key encryption method
  – Requires special hardware capability
  – An estimated 30 million non-authenticatable phones were in use in the US alone (as of 1997)
Post-call Methods
■ Collision Detection
  – Analyzes call data for temporally overlapping calls
■ Velocity Checking
  – Analyzes the locations and times of consecutive calls
■ User Profiling
Another Post-call Method (Main Focus of This Paper)
■ User Profiling
  – Analyzes calling behavior to detect usage anomalies suggestive of fraud
  – Works well with low-usage customers
  – A good complement to collision and velocity checking because it covers cases the others might miss
Sample Defrauded Account

Date     Time      Day  Duration    Origin         Destination    Fraud
1/01/95  10:05:01  Mon  13 minutes  Brooklyn, NY   Stamford, CT
1/05/95  14:53:27  Fri   5 minutes  Brooklyn, NY   Greenwich, CT
1/08/95  09:42:01  Mon   3 minutes  Bronx, NY      Manhattan, NY
1/08/95  15:01:24  Mon   9 minutes  Brooklyn, NY
1/09/95  15:06:09  Tue   5 minutes  Manhattan, NY  Stamford, CT
1/09/95  16:28:50  Tue  53 seconds  Brooklyn, NY
1/10/95  01:45:36  Wed  35 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  01:46:29  Wed  34 seconds  Boston, MA     Yonkers, NY    Bandit
1/10/95  01:50:54  Wed  39 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  11:23:28  Wed  24 seconds  Brooklyn, NY   Congers, NY
1/11/95  22:00:28  Thu  37 seconds  Boston, MA                    Bandit
1/11/95  22:04:01  Thu  37 seconds  Boston, MA                    Bandit
The Need to Be Adaptive
■ Patterns of fraud are dynamic: bandits constantly change their strategies in response to new detection techniques
■ Levels of fraud can change dramatically from month to month
■ The cost of missing fraud or dealing with false alarms changes with inter-carrier contracts
AUTOMATIC CONSTRUCTION OF PROFILING FRAUD DETECTORS
One Approach
■ Build a fraud detection system by classifying individual calls as fraudulent or legitimate
■ However, two problems make simple classification techniques infeasible
Problems with Simple Classification
■ Context
  – A call that would be unusual for one customer may be typical for another
■ Granularity
  – At the level of the individual call, the variation in calling behavior is large, even for a particular user
In Summary: Learning the Problem
1) Which phone call features are important?
2) How should profiles be created?
3) When should alarms be raised?
Proposed Detector Constructor Framework (DC-1)
(framework diagram)
DC-1 Processing an Account-Day Example
(diagram)
DC-1 Fraud Detection Stages
■ Stage 1: Rule Learning
■ Stage 2: Profile Monitoring
■ Stage 3: Combining Evidence
Rule Learning – the 1st Stage
■ Rule Generation
  – Rules are generated locally, based on differences between fraudulent and normal behavior for each account
■ Rule Selection
  – The locally generated rules are then combined in a rule selection step
Rule Generation
■ DC-1 uses the RL program to generate rules whose certainty factors are above a user-defined threshold
■ For each account, RL generates a "local" set of rules describing the fraud on that account
■ Example (a sketch of estimating such a certainty factor follows):
  (Time-of-Day = Night) AND (Location = Bronx) ==> FRAUD
  Certainty factor = 0.89
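To make the certainty-factor idea concrete, here is a minimal sketch in Python. The call representation and the Laplace-corrected precision estimate are my assumptions for illustration; the RL program's actual certainty-factor computation may differ.

```python
# Hedged sketch: estimating a rule's certainty factor on one account.
def certainty_factor(calls, conditions):
    """Fraction of calls matching `conditions` that are fraudulent,
    with a Laplace correction (an assumed stand-in for RL's measure)."""
    covered = [c for c in calls if all(c.get(a) == v for a, v in conditions)]
    fraud = sum(1 for c in covered if c["fraud"])
    return (fraud + 1) / (len(covered) + 2)

# Hypothetical encoding of the example rule above:
rule = [("time_of_day", "NIGHT"), ("location", "BRONX")]
# Keep the rule only if certainty_factor(account_calls, rule) exceeds
# the user-defined threshold.
```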
Rule Selection
■ The rule generation step typically yields tens of thousands of rules
■ If a rule is found in (covers) many accounts, it is probably worth using
  – T_accts := the minimum number of accounts a rule must cover
■ An account is considered until it is deemed covered
  – T_rules := the number of rules required to cover an account
■ The selection algorithm identifies a small set of general rules that covers the accounts (see the greedy sketch below)
■ The resulting set of rules is used to construct specific monitors
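The selection step is a covering procedure driven by the two thresholds above. A minimal greedy sketch, under my own assumptions about the data layout (rules_by_account maps each account to its set of locally generated rules; the paper does not spell out tie-breaking):

```python
# Greedy rule selection: repeatedly pick the rule that appears in the
# most not-yet-covered accounts, until no rule covers >= t_accts of them.
def select_rules(rules_by_account, t_accts, t_rules):
    selected = set()
    needed = {acct: t_rules for acct in rules_by_account}  # rules still needed
    while needed:
        counts = {}
        for acct in needed:
            for rule in rules_by_account[acct]:
                if rule not in selected:
                    counts[rule] = counts.get(rule, 0) + 1
        if not counts:
            break
        best, best_count = max(counts.items(), key=lambda kv: kv[1])
        if best_count < t_accts:
            break  # no remaining rule is general enough
        selected.add(best)
        for acct in list(needed):
            if best in rules_by_account[acct]:
                needed[acct] -= 1
                if needed[acct] <= 0:
                    del needed[acct]  # account is now covered
    return selected
```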
Profiling Monitors – the 2nd Stage
Monitors have two distinct steps:
■ Profiling step
  – The monitor is applied to an account's normal usage to measure the account's normal activity
  – The resulting statistics are saved with the account
  – Monitors are instantiated from monitor templates
■ Use step
  – The monitor processes a single account-day
  – References the normalcy measure from profiling
  – Generates a numeric value describing how abnormal the current account-day is
Most Common Monitor Templates
■ Threshold
■ Standard Deviation
Threshold Monitors
(diagram)
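The slide's figure is not reproduced here; as a stand-in, a minimal sketch of what a threshold monitor template might instantiate. Using the profiling-period maximum as the threshold is my assumption, not necessarily DC-1's exact rule:

```python
# A threshold monitor built from one selected rule: binary output.
class ThresholdMonitor:
    def __init__(self, profiled_daily_values):
        # Assumed profiling step: threshold = max daily value (e.g. minutes
        # of airtime matching the rule) seen in the 30-day profiling period.
        self.threshold = max(profiled_daily_values)

    def use(self, todays_value):
        # Use step: 1 if today's usage exceeds the profiled normal level.
        return 1 if todays_value > self.threshold else 0
```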
Standard Deviation Monitors
(diagram)
Example: a Standard Deviation Monitor
■ Rule
  – (Time-of-Day = Night) AND (Location = Bronx) ==> FRAUD
■ Profiling step
  – Suppose the subscriber called from the Bronx an average of 5 minutes per night, with a standard deviation of 2 minutes. At the end of the profiling step, the monitor stores the values (5, 2) with that account.
■ Use step
  – If the monitor processes a day containing 3 minutes of airtime from the Bronx at night, it emits a zero; if it sees 15 minutes, it emits (15 - 5)/2 = 5. This value denotes that the account is five standard deviations above its average (profiled) usage level. (A code sketch follows.)
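This worked example translates almost directly into code. A minimal sketch (the class and method names are mine; the clip-at-zero behavior follows the "emit a zero" description above):

```python
import statistics

class StdDevMonitor:
    def profile(self, daily_values):
        # Profiling step: store (mean, std) of the account's normal usage.
        self.mean = statistics.mean(daily_values)
        self.std = statistics.stdev(daily_values)

    def use(self, todays_value):
        # Use step: emit how many standard deviations above the profiled
        # mean today's usage is; zero at or below the mean.
        if self.std == 0 or todays_value <= self.mean:
            return 0.0
        return (todays_value - self.mean) / self.std

m = StdDevMonitor()
m.mean, m.std = 5.0, 2.0   # the profiled values (5, 2) from the slide
print(m.use(3))            # 0.0 -- below the mean
print(m.use(15))           # 5.0 -- (15 - 5) / 2
```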
Comparing the Same Standard Deviation Monitor on Two Accounts
(diagram)
Combining Evidence from the Monitors – the 3rd Stage
■ Weights the monitor outputs and learns a threshold on the sum to produce high-confidence alarms (see the sketch below)
■ DC-1 uses a Linear Threshold Unit (LTU)
  – Simple and fast
  – Enables a good first-order judgment
■ A feature selection process chooses a small set of useful monitors for the final detector
  – Some rules do not perform well when used in monitors, and some overlap
  – A forward selection process chooses the set of useful monitors
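A minimal sketch of the evidence-combining stage. The LTU form (weighted sum compared to a learned threshold) is from the paper; the perceptron-style training loop is my assumption, not necessarily DC-1's exact procedure:

```python
import numpy as np

def ltu_alarm(monitor_outputs, weights, threshold):
    # Alarm when the weighted sum of monitor outputs exceeds the threshold.
    return float(np.dot(weights, monitor_outputs)) > threshold

def train_ltu(X, y, epochs=50, lr=0.1):
    """X: (n_account_days, n_monitors) monitor outputs; y: 1 = fraud day."""
    w, threshold = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1.0 if np.dot(w, xi) > threshold else 0.0
            w += lr * (yi - pred) * xi       # perceptron-style update
            threshold -= lr * (yi - pred)    # raise/lower the threshold
    return w, threshold
```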
Final Output of DC-1
■ A detector that profiles each user's behavior based on several indicators
■ An alarm is raised when there is sufficient evidence of fraudulent activity
DATA USED IN THE STUDY
Data Information
■ Four months of phone call records from the New York City area
■ Each call is described by 31 original attributes
■ Some derived attributes are added
  – Time-of-Day (MORNING, AFTERNOON, TWILIGHT, EVENING, NIGHT)
  – To-Payphone
■ Calls are labeled as fraudulent using block crediting
Data Cleaning
■ Eliminated calls that were credited outside of the range of fraudulent call times
  – Such calls were clearly legitimate but erroneously marked fraudulent
■ Discarded days with 1–4 minutes of fraudulent usage
  – These may have been credited for other reasons, such as a wrong number
■ Normalized call times to Greenwich Mean Time for chronological sorting
Data Description
■ After monitor creation, the data are separated into account-days
■ Data selected for rule learning, account profiling, training, and testing:
  – Rule learning: 879 accounts, 500,000 calls
  – Account profiling, training, and testing: 3,600 accounts with at least 30 fraud-free days of usage before any fraudulent usage
    ■ The initial 30 days of each account were used for profiling
    ■ The remaining days were used to generate 96,000 account-days
    ■ Training and testing accounts are distinct: 10,000 account-days for training, 5,000 for testing
    ■ 20% fraud days and 80% non-fraud days
EXPERIMENTS AND EVALUATION
Output of DC-1 Components
■ Rule learning: 3,630 rules
  – Each covering at least two accounts
■ Rule selection: 99 rules
■ Two monitor templates, yielding 198 monitors
■ Final feature selection: 11 monitors
The Importance of Error Cost
■ Classification accuracy is not sufficient to evaluate performance
■ The costs of misclassification should be factored in
■ Estimated error costs (see the sketch below):
  – False positive (false alarm): $5
  – False negative (letting a fraudulent account-day go undetected): $0.40 per minute of fraudulent airtime
■ Factoring in error costs requires a second training pass by the LTU (Linear Threshold Unit)
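Under the cost model above, a detector's score on the test set can be computed as below. The per-day tuple layout is my assumption; the dollar figures are the paper's:

```python
FALSE_ALARM_COST = 5.00       # dollars per false alarm
FRAUD_COST_PER_MIN = 0.40     # dollars per undetected fraudulent minute

def detector_cost(account_days):
    """account_days: iterable of (alarm_raised, is_fraud_day, fraud_minutes)."""
    cost = 0.0
    for alarm, is_fraud, fraud_minutes in account_days:
        if alarm and not is_fraud:
            cost += FALSE_ALARM_COST                    # false positive
        elif is_fraud and not alarm:
            cost += FRAUD_COST_PER_MIN * fraud_minutes  # false negative
    return cost
```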
Alternative Detection Methods
■ Collisions + Velocities
  – Errors are almost entirely due to false negatives
■ High Usage
  – Detects a sudden large jump in account usage
■ Best individual DC-1 monitor
  – (Time-of-day = Evening) ==> Fraud
■ SOTA (State of the Art)
  – Incorporates 13 hand-crafted profiling methods
  – The best detectors identified in a previous study
■ SOTA plus DC-1
  – Combines the SOTA and DC-1 monitors
DC-1 vs. Alternatives

Detector                  Accuracy (%)  Cost ($)        Accuracy at Cost
Alarm on all              20            20000           20
Alarm on none             80            18111 +/- 961   80
Collisions + Velocities   82 +/- 0.3    17578 +/- 749   82 +/- 0.4
High Usage                88 +/- 0.7     6938 +/- 470   85 +/- 1.7
Best DC-1 monitor         89 +/- 0.5     7940 +/- 313   85 +/- 0.8
State of the art (SOTA)   90 +/- 0.4     6557 +/- 541   88 +/- 0.9
DC-1 detector             92 +/- 0.5     5403 +/- 507   91 +/- 0.8
SOTA plus DC-1            92 +/- 0.4     5078 +/- 319   91 +/- 0.8
Shifting Fraud Distributions
■ A fraud detection system should adapt to shifting fraud distributions. To illustrate this point:
  – One non-adaptive DC-1 detector was trained on a fixed distribution (80% non-fraud) and tested against a range of 75–99% non-fraud
  – Another DC-1 detector was allowed to adapt (re-train its LTU threshold) for each fraud distribution
  – The second detector was more cost-effective than the first
DC-1 Component Contributions (1)
■ High Usage detector
  – Profiles with respect to undifferentiated account usage
  – Comparison with DC-1 demonstrates the benefit of rule learning
■ Best individual DC-1 monitor
  – Demonstrates the benefit of combining evidence from multiple monitors
DC-1 Component Contributions (2)
■ Call classifier detectors
  – Represent rule learning without the benefit of account context
  – Demonstrate the value of DC-1's rule generation step, which preserves account context
■ Shifting fraud distributions
  – Show the benefit of making evidence combination sensitive to the fraud distribution
Conclusion
■ DC-1 uses a rule learning program to uncover indicators of fraudulent behavior from a large database of customer transactions
■ These indicators are then used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies
■ Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms
Conclusion (continued)
■ Adaptability to dynamic patterns of fraud can be achieved by generating fraud detection systems automatically from data, using data mining techniques
■ DC-1 can adapt to the changing conditions typical of fraud detection environments
■ Experiments indicate that DC-1 performs better than other methods for detecting fraud
Exam Questions
Question 1
What are the two major fraud detection categories, how do they differ, and under which does DC-1 fall?
■ Pre-call methods
  – Validate the phone or its user when a call is placed
■ Post-call methods (DC-1 falls here)
  – Analyze call data on each account to determine whether cloning fraud has occurred
Question 2
Why do fraud detection methods need to be adaptive?
■ Bandits change their behavior: patterns of fraud are dynamic
■ Levels of fraud vary from month to month
■ The cost of missing fraud or handling false alarms changes between inter-carrier contracts
Question 3
What are the two steps of profiling monitors, and what are the two main monitor templates?
■ Profiling step: measure an account's normal activity and save the statistics
■ Use step: process usage for an account-day to produce a numeric output describing how abnormal activity was on that account-day
■ The two main templates are Threshold and Standard Deviation monitors
Questions?