
Adaptive Fraud Detection




1 Adaptive Fraud Detection
by Tom Fawcett and Foster Provost (from Data Mining and Knowledge Discovery). Presented by Yunfei Zhao (updated from last year's presentation by Adam Boyer).

2 Outline
Problem Description: the cellular cloning fraud problem, why it is important, current strategies
Construction of the Fraud Detector: framework; rule learning, monitor construction, evidence combination
Experiments and Evaluation: data used in this study, data preprocessing, comparative results
Conclusion
Exam Questions
The outline of this presentation is as follows. We first describe the problem of cellular cloning fraud and some existing strategies for detecting it. We then describe the framework in detail, using examples from the implemented system. Finally, we present experimental results and an evaluation of DC-1. We wrap up with conclusions on DC-1's performance and the exam questions.

3 The Problem

4 Cellular Fraud - Cloning
Cloning fraud is a kind of superimposition fraud: fraudulent usage is superimposed upon (added to) the legitimate usage of an account. It causes inconvenience to customers and great expense to cellular service providers. Other examples of superimposition fraud: credit card fraud, calling card fraud, and some types of computer intrusion.
In the United States, according to data from 1997, cellular fraud cost the telecommunications industry hundreds of millions of dollars per year (Walters and Wilkinson 1994; Steward 1997). One kind of cellular fraud, cloning, was particularly expensive and was spreading fast in major cities throughout the United States. So what is cloning?

5 Cellular communications and Cloning Fraud
Mobile Identification Number (MIN) and Electronic Serial Number (ESN)
Identify a specific account
Periodically transmitted unencrypted whenever the phone is on
Cloning occurs when a customer's MIN and ESN are programmed into a cellular phone not belonging to the customer.
Whenever a cellular phone is on, it periodically transmits two unique identification numbers: its Mobile Identification Number (MIN) and its Electronic Serial Number (ESN). These two numbers together specify the customer's account. They are broadcast unencrypted over the airwaves, and they can be received, decoded, and stored using special equipment that is relatively inexpensive. Cloning occurs when a customer's MIN and ESN are programmed into a cellular telephone not belonging to the customer. With the stolen MIN and ESN, a cloned phone user (whom we call a bandit) can make virtually unlimited calls free of charge.

6 Interest in reducing Cloning Fraud
Fraud is detrimental in several ways:
Fraudulent usage congests cell sites
Fraud incurs land-line usage charges
Cellular carriers must pay costs to other carriers for usage outside the home territory
The crediting process is costly to the carrier and inconvenient to the customer
Cloning fraud is clearly very harmful; it is detrimental in all of these ways.

7 Strategies for dealing with cloning fraud
Pre-call Methods
Identify and block fraudulent calls as they are made
Validate the phone or its user when a call is placed
Post-call Methods
Identify fraud that has already occurred on an account so that further fraudulent usage can be blocked
Periodically analyze call data on each account to determine whether fraud has occurred
There are two classes of methods for dealing with cloning fraud: pre-call methods, which try to identify and block fraudulent calls as they are made, and post-call methods, which try to identify fraud that has already occurred on an account so that further fraudulent usage can be blocked.

8 Pre-call Methods
Personal Identification Number (PIN): PIN cracking is possible with more sophisticated equipment.
RF Fingerprinting: a method of identifying phones by their unique transmission characteristics.
Authentication: a reliable and secure private-key encryption method, but it requires special hardware capability. An estimated 30 million non-authenticatable phones were in use in the US alone (in 1997).
As long as cloning fraud continues to be a problem, the industry will have to rely on post-call fraud detection methods.

9 Post-call Methods
Collision Detection: analyze call data for temporally overlapping calls
Velocity Checking: analyze the locations and times of consecutive calls
Disadvantage of both methods: their usefulness depends upon a moderate level of legitimate activity
Since a MIN-ESN pair is licensed to only one legitimate user, any simultaneous usage is probably fraudulent. A closely related method, velocity checking (Davis and Goyal 1993), involves analyzing the locations and times of consecutive calls to determine whether a single user could have placed them while traveling at reasonable speeds. For example, if a call is made in Los Angeles 20 minutes after a call is made on the same account in New York, two different people are likely using the account. If a specific customer rarely uses his or her cell phone, fraud is much less likely to be detected with these methods. For this reason, profiling is a good complement to collision and velocity checking because it covers cases the others might miss.
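As an illustration of the idea, here is a minimal sketch of velocity checking in Python. It is not the detection logic from the paper: the haversine_miles and velocity_alarms names, the 500 mph speed bound, and the (time, latitude, longitude) call representation are all assumptions made for the example.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 3956 * 2 * asin(sqrt(a))  # 3956 mi ~ Earth's radius

def velocity_alarms(calls, max_mph=500):
    """calls: (epoch_seconds, lat, lon) tuples for one account, sorted by time.
    Returns pairs of consecutive call indices implying implausible travel."""
    alarms = []
    for i in range(1, len(calls)):
        (t1, la1, lo1), (t2, la2, lo2) = calls[i - 1], calls[i]
        hours = max((t2 - t1) / 3600.0, 1e-9)  # overlapping calls -> huge speed
        if haversine_miles(la1, lo1, la2, lo2) / hours > max_mph:
            alarms.append((i - 1, i))
    return alarms
```

Collision detection can be seen as the limiting case: temporally overlapping calls from distant cell sites imply an impossible speed.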

10 Another Post-call Method (main focus of this paper)
User Profiling: analyze calling behavior to detect usage anomalies suggestive of fraud.
Works well with low-usage customers.
A good complement to collision and velocity checking because it covers cases the others might miss.

11 Sample Defrauded Account

Date     Time      Day  Duration    Origin        Destination    Fraud
1/01/95  10:05:01  Mon  13 minutes  Brooklyn, NY  Stamford, CT
1/05/95  14:53:27  Fri  5 minutes                 Greenwich, CT
1/08/95  09:42:01       3 minutes   Bronx, NY     Manhattan, NY
         15:01:24       9 minutes
1/09/95  15:06:09  Tue
         16:28:50       53 seconds
1/10/95  01:45:36  Wed  35 seconds  Boston, MA    Chelsea, MA    Bandit
         01:46:29       34 seconds                Yonkers, NY
         01:50:54       39 seconds
         11:23:28       24 seconds                Congers, NY
1/11/95  22:00:28  Thu  37 seconds
         22:04:01

This chart shows chronological call data from an example account. The column at the far right indicates whether the call is fraudulent. We can see from this call data that:
1. The legitimate user calls from the metro New York City area, usually during working hours, and typically makes calls lasting a few minutes.
2. The bandit's calls originate from a different area (Boston, Massachusetts, about 200 miles away), are made in the evenings, and last less than a minute.

12 The Need to be Adaptive
Patterns of fraud are dynamic: bandits constantly change their strategies in response to new detection techniques
Levels of fraud can change dramatically from month to month
The costs of missing fraud and of dealing with false alarms change with inter-carrier contracts
A key requirement for our detection system, then, is that it be adaptive.

13 Automatic Construction of Profiling Fraud Detectors
Ideally, a fraud detection system should be able to learn such rules automatically and use them to catch fraud. The following part addresses the automatic design of user profiling methods.

14 One Approach
Build a fraud detection system by classifying calls as being fraudulent or legitimate. However, two problems make simple classification techniques infeasible.
Ask the class whether they can think of reasons straightforward classification could not be used here.

15 Problems with simple classification
Context
A call that would be unusual for one customer may be typical for another. (For example, a call placed from Brooklyn is not unusual for a subscriber who lives there, but might be very strange for a Boston subscriber.)
Granularity
At the level of the individual call, the variation in calling behavior is large, even for a particular user.
In general, two problems make simple classification approaches infeasible. First, if we put all the data together and generate rules from them, we lose context information. Context information comprises behavior such as what the phone is used for, what areas it is used in, what areas and numbers it normally calls, what times of day, and so on. This information is not given, so our solution is to derive it from historical data specific to each account. Second, when looking at individual calls, the variation in calling behavior is large even for a single customer, and legitimate subscribers occasionally make calls that look suspicious. There is no known method to classify individual calls effectively, so it is necessary to discover indicators corresponding to changes in behavior that are indicative of fraud, rather than absolute indicators of fraud, and to aggregate customer behavior to smooth out the variation. In the experiments we aggregate customer behavior into account-days.

16 The Learning Problem
Which call features are important?
How should profiles be created?
When should alarms be raised?
In sum, the learning problem comprises three questions, each of which corresponds to a component of our framework.

17 Detector Constructor Framework
Under the framework, a system first learns rules that serve as indicators of fraudulent behavior. It then uses these rules, along with a set of templates, to create profiling monitors (M1 through Mn). These monitors profile the typical behavior of each account with respect to a rule and, in use, describe how far each account is from its typical behavior. Finally, the system learns to weight the monitor outputs so as to maximize the effectiveness of the resulting fraud detector. In general, then, the framework is composed of three components.

18 Use of a detector (DC-1): value normalization and weighting

[Figure: a single account-day of calls flows into the profiling monitors, whose outputs are combined into an alarm decision.]

Account-Day
Day  Time   Duration  Origin         Destination
Tue  1:42   10 min    Bronx, NY      Miami, FL
     10:05  3 min     Scarsdale, NY  Bayonne, NJ
     11:23  24 sec                   Congers, NY
     14:53  5 min     Tarrytown, NY  Greenwich, CT
     15:06            Manhattan, NY  Westport, CT
     16:28  53 sec
     23:40  17 min

Profiling monitors: "# calls from BRONX at night exceeds daily threshold", "Airtime from BRONX at night", "SUNDAY airtime exceeds daily threshold" (example outputs: 1, 27). Evidence combining: if the sum S >= q, a FRAUD ALARM is issued.

The monitors are provided with a single day's calls from a given account, and each monitor generates a number indicating how unusual that account-day looks for the account. The numeric outputs from the monitors are treated as evidence and are combined by the detector. When the detector has enough evidence of fraudulent activity on an account, based on the indications of the monitors, it generates an alarm.

19 Rule Learning – the 1st stage
Rule Generation: rules are generated locally, based on differences between fraudulent and normal behavior for each account
Rule Selection: the local rules are then combined in a rule selection step
Now let's discuss the first stage of DC-1, rule learning. In light of the need to maintain context information, rule learning is performed in two steps: rules are first generated locally, based on differences between fraudulent and normal behavior for each account, and then combined in a rule selection step. This is one major difference between DC-1 and generic classification techniques.

20 Rule Generation DC-1 uses the RL program to generate rules with certainty factors above user-defined threshold For each Account, RL generates a “local” set of rules describing the fraud on that account. Example: (Time-of-Day = Night) AND (Location = Bronx)  FRAUD Certainty Factor = 0.89 The input of the RL program is the call data are organized by account, with each call record labeled as fraudulent or legitimate. RL performs a general-to-specific search of the space of conjunctive rules. Please notice this rule is true within certain account , For other accounts this may not be true . This rule denotes that a call placed at night from The Bronx (a Borough of New York City) is likely to be fraudulent. The Certainty factor = 0.89 means that, within this account, a call matching this rule has an 89% probability of being fraudulent.

21 Rule Selection Rule generation step typically yields tens of thousands of rules If a rule is found in ( covers ) many accounts then it is probably worth using Selection algorithm identifies a small set of general rules that cover the accounts Resulting set of rules is used to construct specific monitors For example if there is accounts and for each account RL generate 50 rules then there would be rules after rule generation.

22 Profiling Monitors – the 2nd stage
A monitor has two distinct steps:
Profiling step: the monitor is applied to an account's non-fraud usage to measure the account's normal activity. Statistics are saved with the account.
Use step: the monitor processes a single account-day, references the normality measure from profiling, and generates a numeric value describing how abnormal the current account-day is.

23 Most Common Monitor Templates
Threshold
Standard Deviation
Profiling monitors are created by the monitor constructor, which employs a set of templates instantiated by rule conditions. The two most common templates are Threshold and Standard Deviation.

24 Threshold Monitors
Such a monitor yields a binary feature corresponding to whether the user's behavior was above threshold for the given day.
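A minimal sketch of how such a template might be instantiated. The profiling statistic used by DC-1 is not specified here, so taking the threshold from the largest daily count seen during profiling is an assumption; ThresholdMonitor and rule_matches are illustrative names.

```python
class ThresholdMonitor:
    """Instantiated from one learned rule; rule_matches(call) -> bool."""
    def __init__(self, rule_matches):
        self.rule_matches = rule_matches
        self.threshold = 0

    def profile(self, typical_days):
        """typical_days: lists of calls from the account's fraud-free usage.
        Assumed here: the threshold is the largest daily match count seen."""
        self.threshold = max(
            sum(self.rule_matches(c) for c in day) for day in typical_days)

    def use(self, account_day):
        """Binary output: did today's count exceed the profiled threshold?"""
        count = sum(self.rule_matches(c) for c in account_day)
        return 1 if count > self.threshold else 0
```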

25 Standard Deviation Monitors
Standard deviation monitors are sensitive both to the expected amount of activity on an account and to the expected daily variation of that activity. In other words, if two accounts have the same mean but different standard deviations, the account with the smaller standard deviation (less variation during the profiling step) will produce higher values from the monitor for the same usage.

26 Example for Standard Deviation
Rule: (TIME-OF-DAY = NIGHT) AND (LOCATION = BRONX) ==> FRAUD
Profiling step: the subscriber called from the Bronx an average of 5 minutes per night, with a standard deviation of 2 minutes. At the end of the profiling step, the monitor stores the values (5, 2) with that account.
Use step: if the monitor processes a day containing 3 minutes of airtime from the Bronx at night, it emits a zero; if it sees 15 minutes, it emits (15 - 5)/2 = 5. This value denotes that the account is five standard deviations above its average (profiled) usage level.
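The arithmetic above is easy to mirror in code. This sketch reproduces the slide's numbers; StdDevMonitor is an illustrative name, and flooring negative deviations at zero follows the slide's "emit a zero" example rather than any stated specification.

```python
from statistics import mean, stdev

class StdDevMonitor:
    def profile(self, daily_minutes):
        """daily_minutes: rule-matching airtime on each fraud-free day."""
        self.mu, self.sigma = mean(daily_minutes), stdev(daily_minutes)

    def use(self, minutes_today):
        """Standard deviations above profiled usage, floored at zero."""
        return max(0.0, (minutes_today - self.mu) / self.sigma)

m = StdDevMonitor()
m.mu, m.sigma = 5.0, 2.0   # the (5, 2) stored with the account
assert m.use(3) == 0.0     # below average -> emit zero
assert m.use(15) == 5.0    # (15 - 5) / 2 = 5 standard deviations
```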

27 Combining Evidence from the Monitors – the 3rd stage
Train a classifier with attributes (monitor outputs) and a class label (fraudulent or legitimate).
The classifier weights the monitor outputs and learns a threshold on the sum to produce high-confidence alarms.
DC-1 uses a Linear Threshold Unit (LTU): simple and fast.
Feature selection chooses a small set of useful monitors for the final detector.
The third stage of detector construction learns how to combine evidence from the monitors generated by the previous stage. For this stage, the outputs of the monitors are used as features for a standard learning program. Training is done on account data, and monitors evaluate one entire account-day at a time. In training, the monitors' outputs are presented along with the desired output (the account-day's correct class: fraud or non-fraud). Evidence combination weights the monitor outputs and learns a threshold on the sum so that alarms can be issued with high confidence.
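For concreteness, here is a minimal perceptron-style LTU in Python. The paper identifies the LTU only as simple and fast; the training loop below (classic perceptron updates) and the function names are assumptions for illustration.

```python
def train_ltu(days, labels, epochs=100, lr=0.1):
    """days: monitor-output vectors, one per account-day.
    labels: 1 for fraud, 0 for legitimate. Returns (weights, bias)."""
    w, b = [0.0] * len(days[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(days, labels):
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if s >= 0 else 0
            if pred != y:  # classic perceptron: update only on mistakes
                w = [wi + lr * (y - pred) * xi for wi, xi in zip(w, x)]
                b += lr * (y - pred)
    return w, b

def alarm(w, b, x, theta=0.0):
    """Raise a fraud alarm when the weighted evidence sum crosses theta."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b >= theta
```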

28 Data used in the study
Here we describe the data used in our experiments.

29 Data Information
Four months of call records from the New York City area.
Each call is described by 31 original attributes.
Some derived attributes are added, such as Time-Of-Day and To-Payphone.
Each call is given a class label of fraudulent or legitimate.
The call data used for this study are records of cellular calls placed over four months by users in the New York City area, an area with high levels of fraud.

30 Data Cleaning Eliminated credited calls made to numbers that are not in the created block The destination number must be only called by the legitimate user. Days with 1-4 minutes of fraudulent usage were discarded. Call times were normalized to Greenwich Mean Time for chronological sorting Cellular call data contains errors and noise from various sources. Carrier rep. and customer establish a range of dates during which fraud occurred and calls within this date range are considered fraudulent. The legitimate calls in this range are noise.

31 Data Selection Once the monitors are created and accounts profiled, the system transforms raw call data into a series of account-days using the monitor outputs as features Rule learning and selection 879 accounts comprising over 500,000 calls Profiling, training and testing 3600 accounts that have at least 30 fraud-free days of usage before any fraudulent usage. Initial 30 days of each account were used for profiling. Remaining days were used to generate 96,000 account-days. Distinct training and testing accounts ,10,000 account-days for training; for testing 20% fraud days and 80% non-fraud days The call data were separated carefully into several partitions for rule learning, account profiling, and detector training and testing.

32 Experiments and Evaluation

33 Output of DC-1 components
Rule learning: 3,630 rules, each covering at least two accounts
Rule selection: 99 rules
Monitor construction: 2 templates yielding 198 monitors
Final feature selection: 11 monitors

34 The Importance Of Error Cost
Classification accuracy is not sufficient to evaluate performance; evaluation should take misclassification costs into account.
Estimated error costs:
False positive (false alarm): $5
False negative (letting a fraudulent account-day go undetected): $0.40 per minute of fraudulent airtime
Factoring in error costs requires a second training pass by the LTU.
To evaluate DC-1's performance, classification accuracy is not enough, since different errors have different costs; a realistic evaluation must take misclassification costs into account. The LTU minimizes error, not cost, so the LTU's threshold is adjusted to yield minimum error cost on the training set (see the sketch below).
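A sketch of that second pass, using the error costs quoted above; best_threshold and the brute-force sweep over candidate thresholds are assumptions for illustration, not the paper's procedure.

```python
FP_COST = 5.00           # dollars per false alarm
FN_COST_PER_MIN = 0.40   # dollars per minute of undetected fraudulent airtime

def best_threshold(scores, labels, fraud_minutes, candidates):
    """scores: LTU sums per account-day; labels: 1 fraud / 0 legitimate;
    fraud_minutes: fraudulent airtime that day (0 for legitimate days).
    Returns the candidate threshold with minimum total error cost."""
    def cost(theta):
        total = 0.0
        for s, y, mins in zip(scores, labels, fraud_minutes):
            if s >= theta and y == 0:
                total += FP_COST                 # false alarm
            elif s < theta and y == 1:
                total += FN_COST_PER_MIN * mins  # fraud goes undetected
        return total
    return min(candidates, key=cost)
```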

35 Alternative Detection Methods
Collisions + Velocities: errors almost entirely due to false positives
High Usage: detects a sudden large jump in account usage
Best Individual DC-1 Monitor: (Time-of-day = Evening) ==> Fraud
SOTA (State Of The Art): incorporates 13 hand-crafted profiling methods, the best detectors identified in a previous study

36 DC-1 vs. Alternatives

Detector                 Accuracy (%)  Cost ($)       Accuracy at Cost (%)
Alarm on all             20            20,000
Alarm on none            80            +/- 961
Collisions + Velocities  82 +/- 0.3    +/- 749        82 +/- 0.4
High Usage               88 +/- 0.7    6,938 +/- 470  85 +/- 1.7
Best DC-1 monitor        89 +/- 0.5    7,940 +/- 313  85 +/- 0.8
State of the art (SOTA)  90 +/- 0.4    6,557 +/- 541  88 +/- 0.9
DC-1 detector            92 +/- 0.5    5,403 +/- 507  91 +/- 0.8
SOTA plus DC-1           92 +/- 0.4    5,078 +/- 319

Alarm on All represents the policy of alarming on every account every day; the opposite strategy, Alarm on None, represents the policy of allowing fraud to go completely unchecked. These can be considered cost bounds for the experiment. Interesting things to note: DC-1 beats the state-of-the-art (SOTA) detector on all three measurements, and combining SOTA with DC-1 slightly increases the accuracy while decreasing the cost substantially.

37 Shifting Fraud Distributions
A fraud detection system should adapt to shifting fraud distributions. To illustrate this point:
One non-adaptive DC-1 detector was trained on a fixed distribution (80% non-fraud) and tested against a range of 75-99% non-fraud.
Another DC-1 detector was allowed to adapt (re-train its LTU threshold) for each fraud distribution.
The second detector was more cost-effective than the first. Only an adaptive system will keep from losing its edge.
To test the importance of adaptability, two different DC-1 detectors were constructed (a sketch of the adaptive step follows below).
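A sketch of the adaptive step, reusing best_threshold from the cost discussion above; the detector object with score and theta attributes is hypothetical, introduced only to show that the learned weights stay fixed while the threshold is re-tuned.

```python
def adapt(detector, recent_days, recent_labels, recent_fraud_minutes):
    """Re-tune only the alarm threshold on data drawn from the new
    distribution; the learned monitor weights stay fixed."""
    scores = [detector.score(x) for x in recent_days]
    detector.theta = best_threshold(scores, recent_labels,
                                    recent_fraud_minutes,
                                    candidates=sorted(set(scores)))
    return detector
```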

38 The results are shown in Figure 7
The results are shown in Figure 7. The X-axis is the percentage of non-fraud account-days, and the Y-axis is the cost per account-day. This figure shows that the second detector, which is allowed to adjust itself to each new distribution, is consistently more cost-effective than the fixed detector. This difference increases as the testing distribution becomes more skewed from the distribution upon which the fixed detector was trained.

39 DC-1 Component Contributions(1)
High Usage Detector: profiles with respect to undifferentiated account usage; comparison with DC-1 demonstrates the benefit of using rule learning.
Best Individual DC-1 Monitor: demonstrates the benefit of combining evidence from multiple monitors.

40 DC-1 Component Contributions(2)
Call Classifier Detectors: represent rule learning without the benefit of account context; demonstrate the value of DC-1's rule generation step, which preserves account context.
Shifting Fraud Distributions: show the benefit of making evidence combination sensitive to the fraud distribution.

41 Conclusion
DC-1 uses a rule-learning program to uncover indicators of fraudulent behavior from a large database of customer transactions. Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies. Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms.

42 Conclusion
Adaptability to dynamic patterns of fraud can be achieved by generating fraud detection systems automatically from data, using data mining techniques
DC-1 can adapt to the changing conditions typical of fraud detection environments
Experiments indicate that DC-1 performs better than other methods for detecting fraud

43 Exam Questions

44 Question 1
What are the two major fraud detection categories, how do they differ, and under which does DC-1 fall?
Pre-call methods: validate the phone or its user when a call is placed.
Post-call methods (DC-1 falls here): analyze call data on each account to determine whether cloning fraud has occurred.

45 Question 2
What are the three stages of DC-1?
Rule learning and selection
Profiling monitors
Combining evidence from the monitors

46 Question 3
Profiling monitors have two distinct stages associated with them. Describe them.
Profiling step: the monitor is applied to a segment of an account's typical (non-fraud) usage in order to measure the account's normal activity.
Use step: the monitor processes a single account-day at a time, references the normality measure from the profiling step, and generates a numeric value describing how abnormal the current account-day is.

47 The End Questions?

