Review of Fraud Classification Using Principal Components Analysis of RIDITs
By Louise A. Francis, Francis Analytics and Actuarial Data Mining, Inc.



Objectives
- Address the question: why use a new method, PRIDIT?
- Introduce other methods used in similar circumstances
- Explain how PRIDIT adds to the methods available
- Explain the limitations of PRIDIT/RIDIT

A Key Problem in Fraud Modeling
- Most data mining methods need a target (dependent) variable
- $Y = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
- Fraud (Yes/No or Fraud Score) = f(predictor variables)
- Need a sample of data where claims have been determined to be fraudulent or legitimate

Dependent variable hard to get
- In a large sample of automobile insurance claims, perhaps 1/3 may have an element of abuse or fraud
- Scarce resources are not expended on such large volumes of claims to determine their legitimacy
- Only a small percentage are referred to SIU investigators or other investigations
- There are time lags in determining the outcome of investigations

Unsupervised learning
- Another approach that does not require a dependent variable
- Two key kinds:
  - Cluster analysis
  - Principal components/factor analysis
    - PRIDIT uses this approach
    - It is applied to ordered categorical variables

Cluster Analysis
- Records are grouped in categories that have similar values on the variables
- Examples:
  - Marketing: people with similar values on demographic variables (e.g., age, gender, income) may be grouped together for marketing
  - Text analysis: use words that tend to occur together to classify documents
- Note: no dependent variable is used in the analysis

Clustering
- Common methods: k-means, hierarchical
- No dependent variable – records are grouped into classes with similar values on the variables
- Start with a measure of similarity or dissimilarity
- Maximize dissimilarity between members of different clusters
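
As an illustration of the k-means method named above, here is a minimal sketch using NumPy. The function name `kmeans`, its parameters, and the six made-up records are assumptions for illustration only; they are not from the presentation or the AIB data.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct records chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance from every record to every cluster center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of the records assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    # Final assignment against the updated centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Made-up records: two well-separated groups in two dimensions.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers = kmeans(points, k=2)
```

No dependent variable appears anywhere in the sketch: the grouping is driven only by the distances between records.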

Dissimilarity (Distance) Measure – Continuous Variables
- Euclidean Distance
- Manhattan Distance
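
The formulas from this slide are not in the transcript. The two measures have standard definitions, sketched here with NumPy; the function names and the two example records are hypothetical.

```python
import numpy as np

def euclidean_distance(x, y):
    # Square root of the sum of squared differences across variables.
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan_distance(x, y):
    # Sum of absolute differences across variables.
    return float(np.sum(np.abs(x - y)))

# Two made-up records measured on three continuous variables.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

d_euclid = euclidean_distance(a, b)     # sqrt(3^2 + 4^2 + 0^2)
d_manhattan = manhattan_distance(a, b)  # 3 + 4 + 0
```

In practice the variables are usually put on a common scale first, since both measures are sensitive to units.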

Binary Variables
- Simple Matching
- Rogers and Tanimoto
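
The formulas for these binary measures are also missing from the transcript. The sketch below assumes the usual textbook definitions, written as dissimilarities: simple matching is the share of variables on which two records disagree, and Rogers and Tanimoto counts each disagreement twice. The function names and example records are hypothetical.

```python
import numpy as np

def simple_matching_dissimilarity(x, y):
    x, y = np.asarray(x, dtype=bool), np.asarray(y, dtype=bool)
    mismatches = np.sum(x != y)
    # Share of variables on which the two records disagree.
    return float(mismatches / x.size)

def rogers_tanimoto_dissimilarity(x, y):
    x, y = np.asarray(x, dtype=bool), np.asarray(y, dtype=bool)
    mismatches = np.sum(x != y)
    matches = np.sum(x == y)
    # Disagreements get double weight relative to agreements.
    return float(2 * mismatches / (matches + 2 * mismatches))

# Two made-up claims described by six yes/no indicators.
claim_a = [1, 0, 1, 1, 0, 0]
claim_b = [1, 1, 0, 1, 0, 0]

d_sm = simple_matching_dissimilarity(claim_a, claim_b)  # 2 mismatches out of 6
d_rt = rogers_tanimoto_dissimilarity(claim_a, claim_b)  # 4 / (4 + 4)
```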

Example: Fraud Data
- Data from a 1993 closed claim study conducted by the Automobile Insurers Bureau of Massachusetts
- Claim files often have variables which may be useful in assessing suspicion of fraud, but a dependent variable is often not available
- Variables used for clustering:
  - Legal representation
  - Prior claim
  - SIU investigation
  - At fault
  - Police report
  - Number of providers

Statistics for Clusters
- Based on descriptive statistics, Cluster 2 appears to have a higher likelihood of fraudulent claims (more about this later)

Principal Components Analysis
- A form of dimension (variable) reduction
- Suppose we want to combine all the information related to the “financial” dimension of fraud:
  - Medical provider bill (indicative of padding claim)
  - Hospital bill
  - Number of providers
  - Economic losses
  - Claimed wages
  - Incurred losses

Principal Components
- These variables are correlated but not perfectly correlated
- We replace many variables with a weighted sum of the variables

Correlation Matrix for Variables

Finding Factor or Component
- The correlation matrix is used to find the factor that explains the most variance (captures most of the correlation) for the set of variables
- The component or factor extracted will be a weighted average of the variables
- More than one component or factor may result from applying the method

Evaluating Importance of Variables
- Use factor loadings
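
A minimal sketch of how a first component and its loadings can be obtained from a correlation matrix by eigendecomposition. The 3 x 3 matrix is hypothetical (every pair of variables correlated at 0.5), not the correlation matrix from the presentation, and the loadings are computed under the common convention of eigenvector times the square root of the eigenvalue.

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
corr = np.array([[1.0, 0.5, 0.5],
                 [0.5, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(corr)
first_value = eigvals[-1]
first_vector = eigvecs[:, -1]

# The sign of an eigenvector is arbitrary; make the weights positive overall.
if first_vector.sum() < 0:
    first_vector = -first_vector

# Share of total variance captured by the first component.
share_explained = first_value / eigvals.sum()

# Loadings: correlation of each variable with the first component.
loadings = first_vector * np.sqrt(first_value)
```

For this equal-correlation matrix the first eigenvalue is 2, so the first component captures two thirds of the variance and all three variables load equally on it. Unequal loadings on real data are what indicate which variables matter most.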

Problem: Categorical Variables
- It is not clear how best to perform Principal Components/Factor Analysis on categorical variables
- The categories may be coded as a series of binary dummy variables
- If the categories are ordered, you may lose important information
- This is the problem that PRIDIT addresses

RIDIT
- Variables are ordered so that the lowest value is associated with the highest probability of fraud
- Use the cumulative distribution of claims at each value, i, to create the RIDIT statistic for claim t, value i
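
The RIDIT formula itself is not in the transcript. The sketch below assumes the variant usually paired with PRIDIT: the score for a category is the share of claims in lower categories minus the share in higher categories. That choice of formula, the function name, and the 30%/70% split are assumptions, not figures from the presentation.

```python
import numpy as np

def ridit_scores(proportions):
    """Score each ordered category from the category proportions.

    Assumed form: (share of claims below the category)
                  minus (share of claims above the category).
    """
    p = np.asarray(proportions, dtype=float)
    cumulative = np.cumsum(p)
    below = cumulative - p      # share of claims in lower categories
    above = 1.0 - cumulative    # share of claims in higher categories
    return below - above

# Hypothetical binary variable: 30% of claims in the first (more suspicious)
# category, 70% in the second.
proportions = np.array([0.3, 0.7])
scores = ridit_scores(proportions)   # -0.7 for the first, 0.3 for the second

# Under this form the scores average to zero across claims.
mean_score = float(np.sum(proportions * scores))
```

Scores fall between -1 and 1, and a rarer category at the suspicious end of the ordering receives a score further from zero.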

Example: RIDIT for Legal Representation

PRIDIT
- Use RIDIT statistics in Principal Components Analysis
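
A rough end-to-end sketch of the idea: convert each ordered categorical variable to RIDIT scores, then take the first principal component of their correlation matrix as the claim score. The RIDIT form (share below minus share above), the use of the correlation matrix, the function names, and the simulated data are all assumptions for illustration. This is not the presentation's implementation or the AIB data.

```python
import numpy as np

def ridit_transform(column):
    """Replace each category code with its RIDIT score (assumed form)."""
    values, inverse, counts = np.unique(
        column, return_inverse=True, return_counts=True)
    p = counts / counts.sum()
    cumulative = np.cumsum(p)
    scores = (cumulative - p) - (1.0 - cumulative)
    return scores[inverse]

def pridit_scores(data):
    """First principal component of the RIDIT-scored variables."""
    ridits = np.column_stack(
        [ridit_transform(data[:, j]) for j in range(data.shape[1])])
    corr = np.corrcoef(ridits, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # ascending eigenvalues
    weights = eigvecs[:, -1]
    if weights.sum() < 0:                     # eigenvector sign is arbitrary
        weights = -weights
    # Standardize so the scores match a correlation-matrix PCA.
    standardized = (ridits - ridits.mean(axis=0)) / ridits.std(axis=0)
    return standardized @ weights, weights

# Simulated data: a hidden yes/no flag drives four binary indicators.
rng = np.random.default_rng(0)
latent = rng.random(200) < 0.3
prob = np.where(latent[:, None], 0.8, 0.2)
data = (rng.random((200, 4)) < prob).astype(int)

scores, weights = pridit_scores(data)
```

The resulting score can be used to sort claims. Which end of the scale is the suspicious one depends on how the categories are coded and on the sign chosen for the weights.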

Scoring
- Assign a score to each claim
- The score can be used to sort claims
- More effort is expended on claims more likely to be fraudulent or abusive
- In the case of the AIB data, we can use additional information to test how well PRIDIT did, using the PRIDIT score
- A suspicion score was assigned to each claim by an expert

PRIDIT vs. Suspicion Score

Clustering and Suspicion Score

Result
- There appears to be a strong relationship between the PRIDIT score and the suspicion that a claim is fraudulent or abusive
- The clusters resulting from the cluster procedure also appeared to be effective in separating legitimate from fraudulent or abusive claims

Comparison: PRIDIT and Clustering
- PRIDIT gives a score, which may be very useful for claims sorting. Clustering assigns claims to classes: they are either in or out of the assigned class.
- Clustering ignores information about the order of values for categorical variables
- Clustering can accommodate both categorical and continuous variables

Comparison
- Unordered categorical variables with many values (e.g., injury type):
  - Clustering has a procedure for measuring dissimilarity for these variables and can use them in clustering
  - If the values of the variables have no meaningful order, PRIDIT will not help in creating variables to use in Principal Components Analysis
