If They Cheat, Can We Catch Them With Predictive Modeling?
Richard A. Derrig, PhD, CFE
President, Opal Consulting, LLC
Senior Consultant, Insurance Fraud Bureau of Massachusetts
CAS Predictive Modeling, October 11-12, 2007
Insurance Fraud: The Problem
- ISO/IRC 2001 Study: Auto and Workers Compensation Fraud called a Big Problem by 27% of Insurers.
- CAIF: Estimation (too large)
- Mass IFB: 1,500 referrals annually for Auto, WC, and Other P-L lines.
Fraud Definition: PRINCIPLES
- Clear and willful act
- Proscribed by law
- Obtaining money or value
- Under false pretenses
Abuse, Unethical: fails one or more Principles.
HOW MUCH CLAIM FRAUD? (CRIMINAL or CIVIL?)
10% Fraud
REAL PROBLEM: CLAIM FRAUD
Classify all claims:
- Identify valid classes: pay the claim, no hassle (the Visa example)
- Identify (possible) fraud: investigation needed
- Identify "gray" classes: minimize with "learning" algorithms
Company Automation - Data Mining
- Data Mining/Predictive Modeling Automates Record Reviews
- No Data Mining without Good Clean Data (90% of the solution)
- Insurance Policy and Claim Data; Business and Demographic Data
- Data Warehouse/Data Mart
- Data Manipulation: Simple First; Complex Algorithms When Needed
DATA
Computers advance
FRAUD IDENTIFICATION
- Experience and Judgment
- Artificial Intelligence Systems
- Regression & Tree Models
- Fuzzy Clusters
- Neural Networks
- Expert Systems
- Genetic Algorithms
- All of the Above
MATHEMATICAL MODELS
- Databases: Vector Spaces, Topological Spaces, Stochastic Processes
- Scoring: Mappings to R, Linear Functionals
DM Databases → Scoring Functions → Graded Output:
- Non-Suspicious Claims: Routine Claims
- Suspicious Claims: Complicated Claims
DM Databases → Scoring Functions → Graded Output:
- Non-Suspicious Risks: Routine Underwriting
- Suspicious Risks: Non-Routine Underwriting
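The two slides above share one pattern: a scoring function reduces each record to a number, and thresholds turn the number into a graded output. A minimal sketch of that pattern in Python; the thresholds and class labels are illustrative assumptions, not values from the presentation:

```python
def grade(score, low=0.3, high=0.7):
    """Map a score in [0, 1] to a graded output class. The thresholds are
    illustrative; in practice they are tuned to investigative capacity
    and the cost of a false referral."""
    if score < low:
        return "non-suspicious: routine handling"
    if score < high:
        return "gray: adjuster review"
    return "suspicious: refer to SIU"

for s in (0.12, 0.55, 0.91):
    print(s, "->", grade(s))
```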
POTENTIAL VALUE OF A PREDICTIVE MODELING SCORING SYSTEM
- Screening to Detect Fraud Early
- Auditing of Closed Claims to Measure Fraud, Both Kinds
- Sorting to Select Efficiently among Special Investigative Unit Referrals
- Providing Evidence to Support a Denial
- Protecting against Bad Faith
PREDICTIVE MODELING: SOME PUBLIC TECHNIQUES
- Fuzzy Logic and Controllers
- Regression Scoring Systems
- Unsupervised Techniques: Kohonen and PRIDIT
- EM Algorithm (Medical Bills)
- Tree-based Methods
FUZZY SETS COMPARED WITH PROBABILITY
Probability:
- Measures randomness;
- Measures whether or not an event occurs; and
- Randomness dissipates over time or with further knowledge.
Fuzziness:
- Measures vagueness in language;
- Measures the extent to which an event occurs; and
- Vagueness does not dissipate with time or further knowledge.
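To make the contrast concrete, here is an illustrative fuzzy membership function; the "warm" set, its thresholds, and the code are our illustration, not the presentation's. A 60°F day is warm to degree 0.4, and more information never changes that, whereas a 40% probability of warmth would resolve to 0 or 1 once the event is observed:

```python
def warm_membership(temp_f):
    """Degree of membership in the vague set "warm": the *extent* to
    which the temperature is warm, not the chance that it is."""
    if temp_f <= 50:
        return 0.0
    if temp_f >= 75:
        return 1.0
    return (temp_f - 50) / 25   # linear ramp between 50F and 75F

print(warm_membership(60))      # 0.4: partially warm, and it stays 0.4
```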
Fuzzy Clustering & Detection
- k-Means Clustering
- Fuzzy Logic: True, False, Uncertain
- Fuzzy Numbers: Membership Value
- Fuzzy Clusters: Partial Membership (see the sketch below)
- App 1: Suspicion of Fraud
- App 2: Town (Zip) Rating Classes
REF: Derrig-Ostaszewski, 1995
FUZZY SETS: TOWN RATING CLASSIFICATION
When is one Town near another for Auto Insurance Rating?
- Geographic Proximity (Traditional)
- Overall Cost Index (Massachusetts)
- Geo-Smoothing (Experimental)
Geographically close Towns do not have the same Expected Losses. Clusters by Cost produce Border Problems: nearby Towns fall in different Rating Territories. Fuzzy Clusters acknowledge the Borders. Are all coverage clusters correct for each Insurance Coverage? Fuzzy Clustering on Five Auto Coverage Indices is better and demonstrates a weakness in Overall Crisp Clustering.
Fuzzy Clustering of Fraud Study Claims by Assessment Data (membership value cut at 0.2)
AIB FRAUD INDICATORS
Accident Characteristics (19), e.g.:
- No report by police officer at scene
- No witnesses to accident
Claimant Characteristics (11), e.g.:
- Retained an attorney very quickly
- Had a history of previous claims
Insured Driver Characteristics (8), e.g.:
- Had a history of previous claims
- Gave address as hotel or P.O. Box
Supervised Models: Regression
- Fraud Indicators serve as independent dummy variables
- Expert Evaluation Categories serve as dependent target
- Regression Scoring Systems (see the sketch below)
REF1: Weisberg-Derrig, 1998
REF2: Viaene et al., 2002
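A minimal sketch of that setup: binary red-flag indicators as dummy regressors, an expert suspicion label as the target. The data here are synthetic stand-ins, and logistic regression is one representative choice; Weisberg-Derrig (1998) built regression scoring systems and Viaene et al. (2002) compared several classifiers:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
# Synthetic 0/1 red-flag dummies (e.g., "no police report at scene",
# "attorney retained very quickly"); real features come from claim files.
X = rng.integers(0, 2, size=(n, 6))
# Synthetic expert assessment: suspicion rises with the number of red flags.
y = (X.sum(axis=1) + rng.normal(0, 1, n) > 3).astype(int)

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]   # suspicion score per claim
print(model.coef_.round(2))             # one fitted weight per indicator
```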
Unsupervised Models: Kohonen Self-Organizing Feature Maps
- Fraud Indicators serve as independent "features"
- Expert Evaluation Categories can serve as dependent target in a second phase
- Self-Organizing Feature Maps: T. Kohonen, 1982-1990 (Cybernetics)
- Reference vectors map to the OUTPUT format in a topologically faithful way. Example: map onto a 40x40 two-dimensional square.
- An iterative process adjusts all reference vectors in a "neighborhood" of the nearest one; the neighborhood size shrinks over iterations. (See the sketch below.)
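A minimal SOM written from the description above: find the best-matching unit, then move every reference vector in a shrinking neighborhood toward the pattern. This is our illustration, not the Brockett et al. implementation, and a 10x10 grid stands in for the deck's 40x40:

```python
import numpy as np

def train_som(X, grid=(10, 10), n_iter=2000, lr0=0.5, seed=0):
    """Minimal Kohonen SOM for 0/1 indicator vectors. Returns the grid of
    reference vectors, shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.random((rows, cols, X.shape[1]))
    # grid coordinates, used by the neighborhood function
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        x = X[rng.integers(len(X))]
        # best-matching unit: the reference vector nearest the pattern
        bmu = np.unravel_index(np.argmin(((W - x) ** 2).sum(-1)), (rows, cols))
        frac = t / n_iter
        sigma = sigma0 * (1 - frac) + 1e-3   # neighborhood shrinks over time
        lr = lr0 * (1 - frac)                # so does the learning rate
        # every unit in the (Gaussian) neighborhood moves toward the pattern
        h = np.exp(-((coords - np.array(bmu)) ** 2).sum(-1) / (2 * sigma ** 2))
        W += lr * h[..., None] * (x - W)
    return W

# Train on synthetic indicator vectors; claims mapping to nearby units
# share indicator patterns ("guilt by association", next slides).
X = np.random.default_rng(1).integers(0, 2, (200, 12)).astype(float)
W = train_som(X)
```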
MAPPING: PATTERNS-TO-UNITS (figure)
KOHONEN FEATURE MAP: SUSPICION LEVELS (figure)
FEATURE MAP: SIMILARITY OF A CLAIM (figure)
DATA MODELING EXAMPLE: CLUSTERING
- Data on 16,000 Medicaid providers analyzed by an unsupervised neural net
- The neural network clustered Medicaid providers based on 100+ features
- Investigators validated a small set of known fraudulent providers
- A visualization tool displays the clustering, showing known fraud and abuse
- A subset of 100 providers with similar patterns was investigated: hit rate > 70%
- Cube size proportional to annual Medicaid revenues
© 1999 Intelligent Technologies Corporation
SELF-ORGANIZING MAP
- Binary features
- Given c = {features}, c → m_c (the claim's best-matching unit)
- d(m_c, 0) = suspicion level
- "Guilt by Association"
PRIDIT: Unique Embedded Score
- Data: features have no natural metric scale
- Model: stochastic process has no parametric form
- Classification: inverse image of a one-dimensional scoring function and decision rule
- Feature value: identify which features are "important"
PRIDIT METHOD OVERVIEW
1. DATA: N claims, T features, K_t responses per feature, monotone in "fraud".
2. RIDIT: score each possible response as the proportion below minus the proportion above, so scores are centered at zero.
3. RESPONSE WEIGHTS: first principal component of the claims x features matrix with RIDITs in the cells (SPSS, SAS, or S-Plus software).
4. SCORE: sum of weights x claim RIDIT scores.
5. PARTITION: above and below zero.
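A minimal numpy sketch of those five steps for binary indicators. This is our illustration: Brockett et al. (2002) derive the weights from the same principal-component calculation, but details such as sign conventions and multi-level responses are simplified here:

```python
import numpy as np

def pridit(X):
    """PRIDIT sketch for binary 0/1 indicators (1 = "Yes", the more
    fraud-indicative response). Returns (scores, weights, classes)."""
    X = np.asarray(X, dtype=float)
    p = X.mean(axis=0)                  # proportion of "Yes" per feature
    # Step 2: RIDIT-score each response; for a binary feature this is
    # B("Yes") = -(1 - p) and B("No") = p, centered at zero.
    F = np.where(X == 1, -(1 - p), p)   # claims x features, RIDITs in cells
    # Step 3: response weights = first principal component of F
    # (here via SVD, standing in for SPSS/SAS/S-Plus).
    _, _, Vt = np.linalg.svd(F, full_matrices=False)
    w = Vt[0]
    if w.sum() < 0:                     # a PC's sign is arbitrary; choose the
        w = -w                          # convention with positive total weight
    # Step 4: score = weighted sum of each claim's RIDIT scores.
    scores = F @ w
    # Step 5: partition above/below zero (near zero = little information).
    classes = np.where(scores < 0, 1, 2)
    return scores, w, classes

# Five synthetic claims over three binary indicators; negative scores
# land in class 1, the more suspicious side, as in Table 3 below.
scores, w, classes = pridit([[1, 1, 1], [1, 1, 0], [0, 0, 0],
                             [0, 1, 0], [0, 0, 1]])
print(scores.round(3), classes)
```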
PRIDIT METHOD RESULTS
1. DATA: N claims clustered in varying degrees of "fraud".
2. FEATURES: each feature has an importance weight.
3. CLUSTERS: size can change with experience.
4. SCORE: can be checked on a new database.
5. DECISIONS: scores near zero = too little information.
TABLE 1: Computation of RIDIT Scores

Variable | Label | Proportion of "Yes" | B_t1 ("Yes") | B_t2 ("No")
Large # of visits to chiropractor | TRT1 | 44% | -.56 | .44
Chiropractor provided 3 or more modalities on most visits | TRT2 | 12% | -.88 | .12
Large # of visits to a physical therapist | TRT3 | 8% | -.92 | .08
MRI or CT scan but no inpatient hospital charges | TRT4 | 20% | -.80 | .20
Use of "high volume" medical provider | TRT5 | 31% | -.69 | .31
Significant gaps in course of treatment | TRT6 | 9% | -.91 | .09
Treatment was unusually prolonged (> 6 months) | TRT7 | 24% | -.76 | .24
Indep. medical examiner questioned extent of treatment | TRT8 | 11% | -.89 | .11
Medical audit raised questions about charges | TRT9 | 4% | -.96 | .04
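A worked check of Step 2 using the first row, reading "Yes" as the more fraud-indicative response: 44% of claims answer "Yes" to TRT1, so a "Yes" has proportion 0 below it and 0.56 above it, giving B_t1 = 0 - 0.56 = -0.56, while a "No" has 0.44 below and 0 above, giving B_t2 = +0.44. Every row follows the same binary pattern: B("Yes") = -(1 - p) and B("No") = p, where p is the proportion answering "Yes".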
TABLE 2: Weights for Treatment Variables

Variable | PRIDIT Weight W(∞) | Regression Weight
TRT1 | .30 | .32***
TRT2 | .19 | .19***
TRT3 | .53 | .22***
TRT4 | .38 | .07
TRT5 | .02 | .08*
TRT6 | .70 | -.01
TRT7 | .82 | .03
TRT8 | .37 | .18***
TRT9 | -.13 | .24**

Regression significance shown at the 1% (***), 5% (**), or 10% (*) level.
TABLE 3: PRIDIT Transformed Indicators, Scores and Classes

Claim | TRT1 | TRT2 | TRT3 | TRT4 | TRT5 | TRT6 | TRT7 | TRT8 | TRT9 | Score | Class
1 | 0.44 | 0.12 | 0.08 | 0.20 | 0.31 | 0.09 | 0.24 | 0.11 | 0.04 | .07 | 2
2 | 0.44 | 0.12 | 0.08 | 0.20 | -0.69 | 0.09 | 0.24 | 0.11 | 0.04 | .07 | 2
3 | 0.44 | -0.88 | -0.92 | 0.20 | 0.31 | -0.91 | -0.76 | 0.11 | 0.04 | -.25 | 1
4 | -0.56 | 0.12 | 0.08 | 0.20 | 0.31 | 0.09 | 0.24 | 0.11 | 0.04 | .04 | 2
5 | -0.56 | -0.88 | 0.08 | 0.20 | 0.31 | 0.09 | 0.24 | 0.11 | 0.04 | .02 | 2
6 | 0.44 | 0.12 | 0.08 | 0.20 | 0.31 | 0.09 | 0.24 | 0.11 | 0.04 | .07 | 2
7 | -0.56 | 0.12 | 0.08 | 0.20 | 0.31 | 0.09 | -0.76 | -0.89 | 0.04 | -.10 | 1
8 | -0.44 | 0.12 | 0.08 | 0.20 | -0.69 | 0.09 | 0.24 | 0.11 | 0.04 | .02 | 2
9 | -0.56 | -0.88 | 0.08 | -0.80 | 0.31 | 0.09 | 0.24 | 0.11 | -0.96 | .05 | 2
10 | -0.56 | 0.12 | 0.08 | 0.20 | 0.31 | 0.09 | 0.24 | 0.11 | 0.04 | .04 | 2
TABLE 7: AIB Fraud and Suspicion Score Data, Top 10 Fraud Indicators by Weight

PRIDIT | Adj. Reg. Score | Inv. Reg. Score
ACC3 | ACC1 | ACC11
ACC4 | ACC9 | CLT4
ACC15 | ACC10 | CLT7
CLT11 | ACC19 | CLT11
INJ1 | CLT11 | INJ1
INJ2 | INS6 | INJ3
INJ5 | INJ2 | INJ8
INJ6 | INJ9 | INJ11
INS8 | | TRT1
LW6 | | TRT9
EM Algorithm: Hidden Exposures (Overview)
- Model hidden risk exposures as additional dimension(s) of the loss severity distribution via the EM (Expectation-Maximization) algorithm.
- Treat losses affected by hidden exposures as mixtures of probability distributions, with some mixture parameters considered missing (i.e., unobservable in practice).
- The approach is feasible thanks to advances in computational methods for partially hidden or incomplete data models.
- More sophisticated data imputation and ever faster computing power make it increasingly practical to solve these problems via iterative algorithms.
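A minimal sketch of the idea: fit a finite mixture of normals to (log) bill amounts by EM, then read off each bill's posterior probability of belonging to a suspect component. Everything here is synthetic and illustrative; scikit-learn's GaussianMixture stands in for the paper's own EM implementation, and the three components only mirror the shape of Figure 2 below:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for 348 log bill amounts: a provider inflating charges
# would show up as an extra mixture component at higher severities.
rng = np.random.default_rng(2)
bills = np.concatenate([
    rng.normal(6.0, 0.5, 250),   # routine treatment
    rng.normal(7.0, 0.3, 80),    # heavier but legitimate claims
    rng.normal(7.8, 0.2, 18),    # possible hidden-exposure component
]).reshape(-1, 1)

# GaussianMixture fits the mixture by EM: the E-step assigns each bill a
# posterior probability of coming from each component, the M-step then
# re-estimates the component means, variances, and weights.
gm = GaussianMixture(n_components=3, random_state=0).fit(bills)
print(gm.means_.ravel().round(2), gm.weights_.round(3))

# Posterior "responsibilities": bills dominated by the highest-mean
# component are candidates for review.
resp = gm.predict_proba(bills)
suspect = np.argmax(gm.means_.ravel())
print((resp[:, suspect] > 0.5).sum(), "bills flagged")
```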
Figure 1: Overall distribution of the 348 BI medical bill amounts from Appendix B compared with that submitted by provider A. Left panel: frequency histograms (provider A's histogram in filled bars). Right panel: density estimators (provider A's density in dashed line).
Source: Modeling Hidden Exposures in Claim Severity via the EM Algorithm, Grzegorz A. Rempala and Richard A. Derrig, p. 9, 11/18/02.
Figure 2: EM fit. Left panel: mixture of normal distributions fitted via the EM algorithm to the BI data. Right panel: the three normal components of the mixture.
Source: Modeling Hidden Exposures in Claim Severity via the EM Algorithm, Grzegorz A. Rempala and Richard A. Derrig, p. 13, 11/18/02.
Decision Trees
In decision theory (for example, risk management), a decision tree is a graph of decisions and their possible consequences (including resource costs and risks) used to create a plan to reach a goal. Decision trees are constructed to help with making decisions. A decision tree is a special form of tree structure. (www.wikipedia.org)
Different Kinds of Decision Trees
- Single Trees (CART, CHAID)
- Ensemble Trees, a more recent development (TREENET, RANDOM FOREST):
  - A composite or weighted average of many trees (perhaps 100 or more)
  - Many methods exist to fit the trees and prevent overfitting
  - Boosting: Iminer Ensemble and Treenet
  - Bagging: Random Forest
A sketch of both ensemble ideas follows.
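Treenet and Iminer are commercial packages, so this sketch uses open-source stand-ins on synthetic data to show the two ensemble ideas named above; it is illustrative, not the study's actual experiment:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for claim features and a fraud/no-fraud label.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging: many deep trees on bootstrap samples, predictions averaged.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
# Boosting: many shallow trees, each fit to the ensemble's current errors.
gb = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

for name, model in [("bagging (random forest)", rf), ("boosting", gb)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auc:.3f}")
```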
The Methods and Software Evaluated
1) TREENET
2) Iminer Tree
3) S-PLUS Tree
4) CART
5) Iminer Ensemble
6) Random Forest
7) Naïve Bayes (Baseline)
8) Logistic (Baseline)
Ensemble Prediction of IME Requested
TREENET ROC Curve – IME AUROC = 0.701
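For readers unfamiliar with the metric: AUROC is the probability that a randomly chosen positive case (here, a claim where an independent medical examination was requested) receives a higher model score than a randomly chosen negative case, so the 0.701 above means TREENET ranks a random IME claim over a random non-IME claim about 70% of the time. A small self-contained check of that definition (our illustration, with made-up scores):

```python
import numpy as np

def auroc(scores, labels):
    """AUROC = probability that a randomly chosen positive outscores a
    randomly chosen negative (ties count half); 0.5 is chance level."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

print(auroc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```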
Ranking of Methods/Software – 1st Two Surrogates
Implementation Outline Included at End
NON-CRIMINAL FRAUD?
Non-Criminal Fraud
- General Deterrence: the insurance system, insurance department + DIA, medical, and other government oversight.
- Specific Deterrence: company SIU, auditors, data, predictive modeling for claims and underwriting.
FRAUD INDICATORS: VALIDATION PROCEDURES
Canadian Coalition Against Insurance Fraud (1997): 305 fraud indicators (45 for vehicle theft). "No one indicator by itself is necessarily suspicious."
Problem: How to validate the systematic use of fraud indicators?
- Solution 1: Kohonen Self-Organizing Feature Map (Brockett et al., 1998)
- Solution 2: Logistic Regression (Viaene et al., 2002)
- Solution 3: PRIDIT Method (Brockett et al., 2002)
- Solution 4: Regression Tree Methods (Derrig and Francis, 2007)
REFERENCES
Brockett, Patrick L., Derrig, Richard A., Golden, Linda L., Levine, Albert and Alpert, Mark, (2002), Fraud Classification Using Principal Component Analysis of RIDITs, Journal of Risk and Insurance, 69:3, 341-373.
Brockett, Patrick L., Xia, Xiaohua and Derrig, Richard A., (1998), Using Kohonen's Self-Organizing Feature Map to Uncover Automobile Bodily Injury Claims Fraud, Journal of Risk and Insurance, 65:245-274.
Derrig, Richard A., (2002), Insurance Fraud, Journal of Risk and Insurance, 69:3, 271-289.
Derrig, Richard A., and Ostaszewski, K., (1995), Fuzzy Techniques of Pattern Recognition in Risk and Claim Classification, Journal of Risk and Insurance, 62:3, 147-182.
Francis, Louise and Derrig, Richard A., (2007), Distinguishing the Forest from the TREES, Working Paper.
Rempala, G.A., and Derrig, Richard A., (2005), Modeling Hidden Exposures in Claim Severity via the EM Algorithm, North American Actuarial Journal, 9(2), 108-128.
Viaene, S., Derrig, Richard A., et al., (2002), A Comparison of State-of-the-Art Classification Techniques for Expert Automobile Insurance Fraud Detection, Journal of Risk and Insurance, 69:3, 373-423.
Weisberg, H.I. and Derrig, R.A., (1998), Quantitative Methods for Detecting Fraudulent Automobile Bodily Injury Claims, RISQUES, 35, 75-101, July-September (in French; English version available).
Claim Fraud Detection Plan
STEP 1: SAMPLE: Systematic benchmark of a random sample of claims.
STEP 2: FEATURES: Isolate red flags and other sorting characteristics.
STEP 3: FEATURE SELECTION: Separate features into objective and subjective, early, middle, and late arriving, acquisition cost levels, and other practical considerations.
STEP 4: CLUSTER: Apply unsupervised algorithms (Kohonen, PRIDIT, fuzzy) to cluster claims; examine for needed homogeneity.
Claim Fraud Detection Plan (continued)
STEP 5: ASSESSMENT: Externally classify claims according to objectives for sorting.
STEP 6: MODEL: Build supervised models relating selected features to objectives (logistic regression, Naïve Bayes, neural networks, CART, MARS).
STEP 7: STATIC TESTING: Compare model output with expert assessment and with cluster homogeneity (PRIDIT scores) on one or more samples.
STEP 8: DYNAMIC TESTING: Run the accepted model in real time, record outcomes, and repeat Steps 1-7 as needed to fine-tune the model and parameters. Use PRIDIT to show gain or loss of feature power and changing data patterns; tune investigative proportions to optimize detection and deterrence of fraud and abuse.