Disability Pricing and Dental Fraud Detection: Supervised and Unsupervised Learning October 11, 2007 Jonathan Polon FSA Claim Analytics Inc. www.claimanalytics.com.

Slides:

Advertisements

Similar presentations

Improving Disability Claims Management with Predictive Modeling May 15, 2008 Claim Analytics Inc. Barry Senensky FSA FCIA MAAA Jonathan Polon FSA

Advertisements

© Tan,Steinbach, Kumar Introduction to Data Mining 8/05/ Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan,

Predictive Data Modeling A CASE STUDY FOR DATA MODELING.

Statistics 100 Lecture Set 7. Chapters 13 and 14 in this lecture set Please read these, you are responsible for all material Will be doing chapters

Evaluating Inforce Blocks Of Disability Business With Predictive Modeling SOA Spring Health Meeting May 28, 2008 Jonathan Polon FSA

Dental Insurance Fraud Detection With Predictive Modeling SOA Spring Health Meeting May 29, 2008 Jonathan Polon FSA

Predictive Modeling for Disability Pricing May 13, 2009 Claim Analytics Inc. Barry Senensky FSA FCIA MAAA Jonathan Polon FSA

Data Mining Glen Shih CS157B Section 1 Dr. Sin-Min Lee April 4, 2006.

PwC An evidence-based overview of indicators for return-to-work John Walsh.

Chapter 17 Overview of Multivariate Analysis Methods

Chapter 14: Usability testing and field studies. 2 FJK User-Centered Design and Development Instructor: Franz J. Kurfess Computer Science Dept.

1 Multivariate Statistics ESM 206, 5/17/05. 2 WHAT IS MULTIVARIATE STATISTICS? A collection of techniques to help us understand patterns in and make predictions.

Continuous Audit at Insurance Companies

© Nancy E. Mayo 2004 Sample Size Estimations Demystifying Sample Size Calculations Graphics contributed by Dr. Gillian Bartlett.

BA 555 Practical Business Analysis

Supervised classification performance (prediction) assessment Dr. Huiru Zheng Dr. Franscisco Azuaje School of Computing and Mathematics Faculty of Engineering.

Manajemen Basis Data Pertemuan 8 Matakuliah: M0264/Manajemen Basis Data Tahun: 2008.

Principal Component Analysis Erek Dyskant Physiological Ecology Swarthmore College.

Data Mining: A Closer Look Chapter Data Mining Strategies (p35) Moh!

1 BA 555 Practical Business Analysis Review of Statistics Confidence Interval Estimation Hypothesis Testing Linear Regression Analysis Introduction Case.

Part I: Classification and Bayesian Learning

Predictive Modeling and Dental Fraud Detection November 22, 2007 Jonathan Polon FSA Barry Senensky FSA FCIA MAAA Claim Analytics Inc.

CORRELATIO NAL RESEARCH METHOD. The researcher wanted to determine if there is a significant relationship between the nursing personnel characteristics.

Data Mining: A Closer Look

Data Mining & Data Warehousing PresentedBy: Group 4 Kirk Bishop Joe Draskovich Amber Hottenroth Brandon Lee Stephen Pesavento.

TAYLOR HOWARD The Employment Interview: A Review of Current Studies and Directions for Future Research.

1 Statistical Tools for Multivariate Six Sigma Dr. Neil W. Polhemus CTO & Director of Development StatPoint, Inc.

Overview DM for Business Intelligence.

Risk Adjustment Data For Business Insight Health Care Service Corporation September 2012.

Agresti/Franklin Statistics, 1 of 63 Chapter 2 Exploring Data with Graphs and Numerical Summaries Learn …. The Different Types of Data The Use of Graphs.

CS490D: Introduction to Data Mining Prof. Chris Clifton April 14, 2004 Fraud and Misuse Detection.

1 Data Mining DT211 4 Refer to Connolly and Begg 4ed.

© 2010 IBM Corporation © 2011 IBM Corporation September 6, 2012 NCDHHS FAMS Overview for Behavioral Health Managed Care Organizations.

Data Mining and Application Part 1: Data Mining Fundamentals Part 2: Tools for Knowledge Discovery Part 3: Advanced Data Mining Techniques Part 4: Intelligent.

Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.

Chapter 6 : Software Metrics

Learning Objectives Copyright © 2002 South-Western/Thomson Learning Multivariate Data Analysis CHAPTER seventeen.

McGraw-Hill/Irwin Copyright © 2008 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 13 Duration and Reinvestment Reinvestment Concepts Concepts.

HOW TO WRITE RESEARCH PROPOSAL BY DR. NIK MAHERAN NIK MUHAMMAD.

Correlation Analysis. A measure of association between two or more numerical variables. For examples height & weight relationship price and demand relationship.

Data MINING Data mining is the process of extracting previously unknown, valid and actionable information from large data and then using the information.

1 The findings and conclusions in this presentation are those of the authors and do not necessarily represent the views of the National Institute for Occupational.

Learning Agenda Emotions & Sales Article Sutton & Rafaeli Understanding the phenomenon Conducting an observational study –qualitative & quantitative info.

CRM - Data mining Perspective. Predicting Who will Buy Here are five primary issues that organizations need to address to satisfy demanding consumers:

© 2012 Towers Watson. All rights reserved. GLM II Basic Modeling Strategy 2012 CAS Ratemaking and Product Management Seminar by Len Llaguno March 20, 2012.

September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Conducting and interpreting multivariate analyses.

Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.

Dimension Reduction in Workers Compensation CAS predictive Modeling Seminar Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc.

Predictive Modeling Spring 2005 CAMAR meeting Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc

Applied Quantitative Analysis and Practices LECTURE#31 By Dr. Osman Sadiq Paracha.

Review of Fraud Classification Using Principal Components Analysis of RIDITS By Louise A. Francis Francis Analytics and Actuarial Data Mining, Inc.

Multivariate Analysis and Data Reduction. Multivariate Analysis Multivariate analysis tries to find patterns and relationships among multiple dependent.

26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.

Principal Component Analysis

Basic Track II 2004 CLRS September 2004 Las Vegas, Nevada.

3/13/2016 Data Mining 1 Lecture 2-1 Data Exploration: Understanding Data Phayung Meesad, Ph.D. King Mongkut’s University of Technology North Bangkok (KMUTNB)

Chapter Seventeen Copyright © 2004 John Wiley & Sons, Inc. Multivariate Data Analysis.

Dr. Bea Bourne 1. 2 If you have any troubles in seminar, please do call Tech Support at: They can assist if you get “bumped” from the seminar.

Classification Tree Interaction Detection. Use of decision trees Segmentation Stratification Prediction Data reduction and variable screening Interaction.

Methods of multivariate analysis Ing. Jozef Palkovič, PhD.

26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.

JMP Discovery Summit 2016 Janet Alvarado

Data Mining CAS 2004 Ratemaking Seminar Philadelphia, Pa.

Validating the PRIDIT method for determining hospital quality with outcomes data Robert Lieberthal, PhD, Dominique Comer, PharmD, Katherine O’Connell,

Dimension Reduction in Workers Compensation

Understanding Research Results: Description and Correlation

Sergei Ananyan, Ph.D. Healthcare Fraud Detection through Data Mining Your Knowledge Partner TM (c) Megaputer Intelligence.

Lecture 6: Introduction to Machine Learning

Principal Component Analysis

Presentation transcript:

Disability Pricing and Dental Fraud Detection: Supervised and Unsupervised Learning October 11, 2007 Jonathan Polon FSA Claim Analytics Inc.

Definitions Dental Fraud Disability Pricing Agenda

Definitions

Supervised Learning Known outcome associated with each record in training dataset Eg, Sports betting: predict winners based on observations from past games Objective: build a model to accurately estimate outcomes from predictor variables for each record Commonly referred to as predictive modeling

Unsupervised Learning No known outcome associated with any record in the training dataset Eg, Grocery stores: define types of shoppers and their preferences Learning objective: self-organization or clustering; finding structure in the data

Dental Fraud Detection Using Unsupervised Learning

Dental Fraud Project Overview Explore use of pattern detection tools in detecting dental claim anomalies 2004 data 1,600 Ontario GPs (non-specialists) 200,000 claims Many examples of anomalous activity discovered

Traditional Tools Rule-based Strong at identifying claims that match known types of fraudulent activity Limited to identifying what is known Typically analyze at the level of a single claim, in isolation

Pattern Detection Tools Analyze millions of claims to reveal patterns –Technology learns what is normal and what is atypical – not limited by what we already know Categorize each claim. Reveal dentists with high percentage of atypical claims –Immediately highlight new questionable behaviors –Find new large-dollar schemes –Also identify frequent repetition of small-dollar abuses

Methodology – 3 Perspectives 1. By Claim e.g. Joe Green’s semi-annual check-up 2. By Tooth e.g. all work done in 2004 on Joe’s bicuspid by Dr. Brown 3. By General Work e.g. all general work (exams, fluoride, radiographs, scaling and polishing) done on Joe in 2004 by Dr. Brown

Methodology – 2 Techniques 1. Principal components analysis 2. Clustering

PCACluster Claim by claim Tooth by tooth General work Methodology – Summary

Principal Components Analysis Visualization technique Allows reduction of multi-dimensional data to lower dimensions while maximizing amount of information preserved Powerful approach for identifying outliers

PCA: Simple Example Lots of variance between points around both axes

PCA: Simple Example Create new axes, X’ and Y’: rotate original axes 45º

PCA: Simple Example In the new axes, very little variance around Y’ Y’ contains little “information”

PCA: Simple Example Can set all Y’ values to 0 – ie, ignore Y’ axis Result: reduce to 1 dimension with little loss of info ♦ Original ● Revised

PCA: Identifying Atypical Dentists

1.Begin at the individual transaction level 2.Determine the average transaction for each dentist 3.Graph quickly isolates dentists that are outliers

Dentist Average 90 th percentile PCA: Identifying Atypical Dentists

PCA: Summary 1.Visualization allows for easy and intuitive identification of dentists that are atypical 2.Manual investigation required to understand why a dentist or claim is atypical

Clustering Techniques Categorization tools Organize claims into several groups of similar claims Applicable for profiling dentists by looking at the percentage of claims in each cluster We apply two different clustering techniques –K-Means –Expectation-Maximization

Clustering Example

Example: Clusters found – by Claim 1.Major dental problems 2.Moderate dental problems 3.Age > 28, minor work performed 4.Work includes unbundled procedures 5.Minor work and expensive technologies 6.Age < 28, minor work performed

Clusters: Identifying Atypical Dentists 1. Begin at the individual claim level 2. Calculate the proportion of claims in each cluster – in total, and for each dentist 3. Isolate dentists with large deviations from average

Clusters: Identifying Atypical Dentists

Clustering Techniques: Summary 1.Clustering provides an easy to understand method to profile dentists 2.Unlike PCA, clustering tells why a given dentist is considered atypical 3.Effectively identifies atypical activity at the dentist level

Pilot Project Results PCA identified 68 dentists that are atypical Clusters identified 182 dentists that are atypical 36 dentists are identified as atypical by both PCA and Clusters In total, 214 of 1,644 dentists are identified as atypical (13%) Billings by atypical dentists were $2.5 MM out of $16.1 MM billed by all dentists (15%)

ExamplesOf Atypical Dentists

Analysis: Dentist A is far beyond norms in clusters 6 and 8; both indicate high charges in ‘general dentistry’ What we discovered: Each of Dentist A’s patients is being billed for at least 30 minutes of polishing and 45 minutes of scaling Dentist A

Dentist B Analysis : Dentist B is far beyond norms in cluster 7 What we discovered: Frequent extractions and anesthesia  Dentist B looks like an oral surgeon, yet is a general practitioner

Dentist C Analysis: Very high proportion in Cluster 1, suggesting many high- ticket visits What we discovered: First, Dentist C performs an inordinate amount of scaling. Second, Dentist C has emergency examinations with atypically high frequency

Dentist M Analysis: Dentist M has a disproportionate amount of work beyond the 90 th percentile of work by all dentists on all teeth What we discovered: Dentist M appears to utilize lab work very heavily

More Atypical Dentists Frequent use of panoramic x-rays Frequent use of nitrous oxide – including with procedures rarely associated with anesthesia Large number of extractions, often using multiple types of sedation Very high proportion of claims for crowns and endodontic work Individual instruction on oral hygiene provided with very atypical frequency

Dental Fraud Summary Highly effective in identifying dentists with claim portfolios significantly different from the norm Enables experts to quickly identify and focus on those dentists with atypical claims activity Pattern detection :

Disability Pricing With Predictive Modeling

Protects against loss of income due to illness or injury Annual incidence rates ≈ 3 per 1,000 Claim duration ranges from a few days to a few decades About Group LTD Insurance

Base rates: tabular, depend on age, gender, EP and max benefit duration Several loading factors Demographic info (eg, occupation, area, salary) Group characteristics (eg, group size, industry) Plan features (eg, benefit %, e’e contributions) Typical Rate Structure

Uncover and quantify complex relationships between pricing factors and claim experience Maintain wholeness of data Improved accuracy vs traditional methods Why use PM for pricing?

The Challenge Predictive models can be black boxes, difficult to interpret The Objective A process to make rate structure explainable to: Predictive Modeling + Pricing Management Sales Force Regulators Customers

The 5 Steps - A PM Pricing Project

The Five Steps 1.Set objective 2.Data 3.Select technique and predictors 4.Train the model 5.Validate

Develop a predictive model to: Predict claim incidence rates Predict claim severity, conditional upon claim being made for each member of census data 1. Set objective

Required: Accurate census data for many groups Plan features for these groups (EP, Ben%) Characteristics of these groups (region, industry) Claim data that often can be linked to census 2a. Pull Data

Split Data Into Three Subsets Training Testing Validation 2b. Sample Data

Begin exploring with simpler tools Learn key predictors, relationships Migrate to more complex tools Incorporate exploration learning into model design Use test sample to evaluate models 3. Select technique

4. Train the Model Pricing factors Challenge: Understanding the Model Need to extract pricing factors from “black box”

Approach: Build several models Each using same modeling technique Each using same data Each excluding different predictor Compare actual to predicted claim cost across excluded predictor Understanding the Model Training the Model

Looking into the Black Box Training the Model

Example: model includes all predictors, consider historic claims by industry Demonstrates accuracy of model, but provides no information about claim drivers IndustryActualPredictedAct/Pred Finance$1, % Health$1, % Training the Model

Example: model includes all predictors except industry, consider historic claims by industry Isolates impact of industry from all other factors IndustryActualPredictedAct/Pred Finance$1,600$1,80089% Health$1,200$1,000120% Training the Model

Example: exclude two predictors: region and occ class RegionWhite CollarBlue Collar North East115%95% Mid West75%105% South105%115% Isolates impact by region and occupation class from all other factors Training the Model

Benefits of Extracting Factors 1.Ability to test importance of predictors, singly or in combinations Test predictors that are typically not priced for Quantify impact of each predictor or pair of predictors 2.Facilitates use of the traditional Base Rate Approach Training the Model

Base Rate: Function of age, gender, EP, max duration Loadings: For other demographic info, group characteristics, plan features Determine by looking inside black box Company can adjust final factors Base Rate Approach Training the Model

Base Rate:  Base rate = 1.20 Loadings: Base Rate Example Ben%RegionIndustrySalaryContribFact X 60%NEHealth$60KY %-5%+15%-5%+10%-3% AgeGenderEPMax Dur 45Male180 daysAge 65 Training the Model

Base Rate = 1.20 Loadings = {+5%, -5%, +15%, -5%, +10%, - 3%} Rate = 1.20 *1.05 *0.95 *1.15 *0.95 *1.10 *0.97 Final Rate = Base Rate Example Training the Model

Use excluded data for validation of model Compare actual vs predicted claim costs Ideally, compare predictive modeling approach to traditional approach 5. Pricing Validation

Improved accuracy vs traditional methods Capture effects of interaction between several predictors Able to isolate impact of any predictor or pair of predictors Able to test new predictors for impact on claim cost Can maintain existing Base Rate structure Advantages of PM: Pricing

Questions?