Discussion/Presentation of Park and Basu: “Alternative Evaluation Metrics for Risk Adjustment Models” Stephen P. Ryan, Olin.

The Really Big Picture
Insurance: a third party can improve the utility of risk-averse agents by equating marginal utilities across probabilistic states of the world
  Not actually what health insurance looks like in the US
Adverse selection: agents know their health type better than the insurer does
Screening: set up a menu of options to induce agents to reveal their type
Competition: more firms -> lower prices, but less pooling
Risk adjustment: intervention to make everyone equally profitable
Requirement: gov't / firm must be able to compute E[Y|X], the conditional expectation of expenditures given menu / observables (or, more generally, their conditional distribution)
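A minimal sketch of the "equally profitable" idea, assuming a single common premium and a budget-neutral transfer equal to each enrollee's predicted cost minus the pool average. The function names, the premium, and the cost figures are hypothetical and are not taken from the paper or the slides.

```python
from statistics import mean

def risk_adjustment_transfers(predicted_costs):
    """Budget-neutral transfers: predicted cost minus the pool average."""
    pool_average = mean(predicted_costs)
    return [cost - pool_average for cost in predicted_costs]

def expected_profits(premium, predicted_costs):
    """Expected profit per enrollee after the transfer is paid to the plan."""
    transfers = risk_adjustment_transfers(predicted_costs)
    return [premium + t - c for t, c in zip(transfers, predicted_costs)]

# Hypothetical E[Y|X] values for three enrollees (illustrative numbers only).
predicted = [1000.0, 4000.0, 10000.0]
transfers = risk_adjustment_transfers(predicted)
profits = expected_profits(premium=5500.0, predicted_costs=predicted)
print(transfers)
print(profits)
```

Under these assumptions every enrollee yields the same expected profit (premium minus the pool-average cost), which is why the quality of the E[Y|X] estimate matters so much.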

This Paper
Using the 2013-2014 MarketScan database, predict risks using several techniques:
  Nine parametric regressions
  Seven machine learning algorithms
  Three distributional estimators
Consider several metrics:
  Group-level
  Individual-level
  Tail distributions (these are where your expenses come from)

Findings
No one method dominates
Parametric methods: better at tail- and individual-level prediction
Machine learning, distributional methods: better at group-level prediction
Assertion: tradeoff between modeling individual risks and group-level risks
Assertion: optimal method must account for insurer behavior

Metrics
Fraction of population outside l percentage of predicted:
  Y_i is the forecast (think of a draw from the distribution)
  This is a bit of a strange object, as it doesn't try to match the distribution of y_i
Related object on tails:
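The formulas from this slide did not survive in the transcript. As one possible reading of the first metric only, the sketch below counts the share of individuals whose realized expenditure y_i falls outside a band of plus or minus l (as a fraction) around the forecast Y_i. The exact definition in the paper may differ; the function name and the numbers are hypothetical.

```python
def fraction_outside_band(actual, predicted, l):
    """Share of individuals with |y_i - Y_i| > l * |Y_i| (assumed definition)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    outside = sum(
        1 for y, y_hat in zip(actual, predicted)
        if abs(y - y_hat) > l * abs(y_hat)
    )
    return outside / len(actual)

# Illustrative expenditures only: realized values and forecasts for four people.
actual = [0.0, 900.0, 1500.0, 20000.0]
predicted = [500.0, 1000.0, 1000.0, 8000.0]
share = fraction_outside_band(actual, predicted, l=0.2)
print(share)
```

Note that this object scores each forecast against its own realization and says nothing about whether the forecasts reproduce the overall distribution of y_i.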

Data
Truven MarketScan Commercial Claims and Encounters database, 2013-2014
Validates the risk adjustment model used in the ACA
Look at working-age adults (21-64) continuously enrolled in medium and large employer health plans
Outcome: total expenditures, including patient payments
Goal: using data on prior-year expenditures, plus 18 age-sex categories, 114 HCCs, 16 interactions with disease groups, and plan-type and state FE -> predict second-year expenditures
Authors split data into estimation / prediction samples
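The estimation / prediction split can be sketched as a seeded random partition of enrollee identifiers. The slides do not say how the authors split the data, so the 50/50 share, the seed, and the function name here are assumptions for illustration.

```python
import random

def split_sample(ids, estimation_share=0.5, seed=0):
    """Randomly partition ids into an estimation set and a prediction set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cutoff = round(len(shuffled) * estimation_share)
    return shuffled[:cutoff], shuffled[cutoff:]

# Ten hypothetical enrollee ids.
estimation_ids, prediction_ids = split_sample(range(10))
print(estimation_ids, prediction_ids)
```

Models are fit on the estimation set only, and every metric is then computed on the held-out prediction set.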

Summary Statistics

Predictive Models
Linear regression
Generalized beta of the second kind
  Five-parameter distribution to match the distribution of expenditures
  Account for zero expenditures using a second probability function (logit)
Finite mixture model
Machine learning methods
  Regularized linear regression: LASSO, ridge regression, elastic net, LARS
  Artificial neural network
  Decision tree
  Super learner
Distributional methods
  Ordered logit
  Logit ???
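The "second probability function" for zero expenditures is the usual two-part logic: E[Y|X] = P(Y>0|X) * E[Y|Y>0,X]. The sketch below illustrates only that decomposition, using simple cell shares and cell means in place of the logit and generalized beta pieces; it is not the authors' specification, and the cell labels and spending figures are made up.

```python
from collections import defaultdict

def fit_two_part(cells, spending):
    """Per cell: share with positive spending, and mean spending among positives."""
    counts = defaultdict(int)
    positives = defaultdict(list)
    for cell, y in zip(cells, spending):
        counts[cell] += 1
        if y > 0:
            positives[cell].append(y)
    model = {}
    for cell, n in counts.items():
        pos = positives[cell]
        p_positive = len(pos) / n
        mean_positive = sum(pos) / len(pos) if pos else 0.0
        model[cell] = (p_positive, mean_positive)
    return model

def predict_two_part(model, cell):
    """Expected spending = P(Y > 0 | cell) * E[Y | Y > 0, cell]."""
    p_positive, mean_positive = model[cell]
    return p_positive * mean_positive

# Hypothetical age-sex cells and annual spending.
cells = ["F21-24", "F21-24", "F21-24", "F21-24", "M60-64", "M60-64"]
spending = [0.0, 0.0, 400.0, 1200.0, 3000.0, 9000.0]
model = fit_two_part(cells, spending)
print(predict_two_part(model, "F21-24"), predict_two_part(model, "M60-64"))
```

With fully saturated cells the product collapses to the cell mean; the decomposition earns its keep once each part is modeled parametrically, as in the logit plus generalized beta setup on the slide.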

Results

Tail Performance

Individual-level Predictions

Conclusions
No one method dominates across several different metrics
But:

Comments
The metrics reported are non-intuitive to me
  We care about matching the distribution of expenditures
  One should use the integrated mean square error across the distribution
Bigger issue: distributions are filtered through plans
  Brings up the question of what exactly we are trying to predict
  Performance between aggregate and individual reflects this issue
No post-selection estimation -> Chernozhukov and co-authors have explored this issue extensively
The issue of errors in some models is not treated uniformly (e.g. normality in ML)
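One way to read the "integrated mean square error" suggestion is as the squared gap between the predicted and realized expenditure distributions, integrated over spending levels. The sketch below does this with empirical CDFs and a trapezoid rule on a fixed grid. The slides do not pin down the exact criterion, so the definition, the grid, and the sample values are assumptions.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return F(x) = share of the sample that is <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def integrated_squared_error(actual, predicted, grid):
    """Trapezoid-rule integral of (F_pred(x) - F_actual(x))^2 over the grid."""
    f_actual = empirical_cdf(actual)
    f_pred = empirical_cdf(predicted)
    gaps = [(f_pred(x) - f_actual(x)) ** 2 for x in grid]
    total = 0.0
    for i in range(1, len(grid)):
        total += 0.5 * (gaps[i] + gaps[i - 1]) * (grid[i] - grid[i - 1])
    return total

# Illustrative data: predictions that miss the right tail of spending.
actual = [0.0, 100.0, 200.0, 5000.0]
predicted = [50.0, 100.0, 150.0, 300.0]
grid = list(range(0, 6001, 50))
ise = integrated_squared_error(actual, predicted, grid)
print(ise)
```

A criterion of this kind rewards a model for reproducing the whole distribution, tails included, which is the property the per-individual band metric does not capture.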

Comment: Mixture Models
In Fox, Kim, Ryan, and Bajari (2011), we show how to estimate mixture models using linear regression
Instead of fixing the number (and position) of types and estimating weights, you can recover both using linear regression
Could expand the range of your FMM to nonparametric joint estimation of types
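A rough sketch of the idea, in notation that is mine rather than the slides': lay down a fine grid of R candidate types \beta_1, ..., \beta_R, compute the outcome g(x_i, \beta_r) that each type would imply for observation i, and regress the observed outcome on those R implied outcomes. The coefficients \theta_r are the type weights, constrained to behave like probabilities.

```latex
\begin{aligned}
y_i &= \sum_{r=1}^{R} \theta_r \, g(x_i, \beta_r) + e_i, \\
\hat{\theta} &= \arg\min_{\theta} \sum_{i=1}^{N}
  \Bigl( y_i - \sum_{r=1}^{R} \theta_r \, g(x_i, \beta_r) \Bigr)^{2}
  \quad \text{s.t. } \theta_r \ge 0, \;\; \sum_{r=1}^{R} \theta_r = 1 .
\end{aligned}
```

Grid points that receive zero weight drop out, so on this reading the number and location of the types come out of the constrained least-squares fit rather than being chosen in advance.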

Thank you!
This is an important topic and I enjoyed reading the paper
Also, learned about the SUPER method, which we inadvertently repeated in 2015 (oops!)