Risk Adjustment Network Meeting. The Hague. October 11-14, 2017

Presentation transcript:

Risk Adjustment, Big Data and Machine Learning: Challenge and Opportunity. Dov Chernichovsky, Ben Gurion University of the Negev, Israel; Alvaro Riascos, Los Andes University, Colombia; Ran Bergman, Deloitte, Israel

Goal of presentation: Prompt further discussion about the role of big data and machine learning in risk adjustment (RA).

Rationale: The technology is there and evolving fast. Insurers and plans, at least in Israel, are using it to assess risk and potentially for risk selection. This can induce governments (e.g., Israel, Colombia) to use more rigorous risk adjustment mechanisms than those used today.

Presentation: Introduction to Big Data and Machine Learning; the case of Israel; the case of Colombia; conclusion.

Big Data – Its 4 V’s

Major Types of Data
Traditional: electronic medical records (e.g., billing, physician visits, measurements, lab tests, prescriptions and purchases); claims data; patient and MDs' surveys; registry data.
“Omics” (data at the cellular level): genomic and genetic data.
Patient-generated data: social media; sensors.

Machine Learning and Conventional Methods
Machine learning methods: handle different types of data, including numbers, text, and/or visual images, of different dimensions and structures; detect and learn fast, through computer algorithms, highly complex and intricate relationships in high-dimensional data; do not pre-impose constraints on the relationship between inputs and outcomes, and so make fewer assumptions, as in a non-parametric statistical model.
Conventional methods: need structured data; are slow; are resource consuming; require pre-specified modeling and parameters.

Goals of Machine Learning: Predict an outcome. Measure success by "out-of-sample performance". Take advantage of rich and complex data in which outcomes are determined by many potential predictors with complex interrelationships.
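The idea of judging a model by out-of-sample performance can be sketched as follows. This is a minimal illustration on hypothetical toy numbers, not the presenters' data or method: a one-predictor line is fitted on a training sample and R² is then computed separately on the training rows and on held-out rows. The function names `fit_line` and `r_squared` are made up for this sketch.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """1 - SSE/SST, computed on whichever sample is passed in."""
    mean_a = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1 - sse / sst


# Hypothetical toy data: the model only ever sees the training rows.
train_x, train_y = [1, 2, 3, 4], [2, 4, 6, 8]
test_x, test_y = [5, 6], [10, 13]

intercept, slope = fit_line(train_x, train_y)


def predict(xs):
    return [intercept + slope * xi for xi in xs]


r2_in = r_squared(train_y, predict(train_x))   # fit on the data used for estimation
r2_out = r_squared(test_y, predict(test_x))    # fit on data the model never saw
print(f"in-sample R2: {r2_in:.3f}, out-of-sample R2: {r2_out:.3f}")
```

In this toy case the in-sample fit is perfect while the held-out fit is lower, which is the gap that out-of-sample evaluation is meant to expose.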

Issues: (Still) imprecision and ambiguity of medical data. Validation of the model for policy making: a model may predict an outcome well (e.g., 90 percent of the time) but still have a high false positive rate. The incentives that the calculations produce.
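A small worked example may help show how a model can be right 90 percent of the time and still flag many people wrongly. The counts below are hypothetical. The sketch also assumes that "high false positive rate" on the slide refers loosely to the share of flagged enrollees who are flagged wrongly; the formal false positive rate (false positives over all true negatives) is reported as well for comparison.

```python
# Hypothetical confusion-matrix counts for 1,000 enrollees, 50 of whom are truly high-cost.
tp = 35   # high-cost enrollees correctly flagged
fp = 85   # low-cost enrollees wrongly flagged
fn = 15   # high-cost enrollees missed
tn = 865  # low-cost enrollees correctly not flagged

total = tp + fp + fn + tn

accuracy = (tp + tn) / total              # share of all predictions that are correct
false_positive_rate = fp / (fp + tn)      # formal definition: wrong flags among true negatives
false_discovery_rate = fp / (tp + fp)     # share of flagged enrollees who are flagged wrongly

print(f"accuracy:             {accuracy:.3f}")
print(f"false positive rate:  {false_positive_rate:.3f}")
print(f"false discovery rate: {false_discovery_rate:.3f}")
```

With these counts the model is correct for 900 of 1,000 enrollees, yet most of the 120 enrollees it flags are not high-cost, because the outcome is rare.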

Some Evidence
Rose S. (2016) argues that a simplified risk adjustment formula selected via a nonparametric machine learning framework maintains much of the efficiency of a traditional, larger formula. The ensemble approach also outperformed classical regression and all other algorithms studied.
Buchner, F., Wasem, J. & Schillo, S. (2015) show that including interactions from a machine learning algorithm improves the adjusted R² from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R² improvement detected is only marginal.
Li et al. (2013) argue that the nonlinear relations among risk factors are usually very difficult to capture with linear models. The random forest model reaches an R² of 38% with a standard deviation of 0.008, while the linear regression model reaches an R² of 31% with a standard deviation of 0.01.
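Predictive ratios, mentioned above for subgroups, are commonly computed as total predicted cost divided by total actual cost within a group, so that a value below 1 indicates under-prediction for that group. The sketch below uses that common definition with hypothetical numbers and group labels; it is not taken from the cited studies.

```python
from collections import defaultdict

# Hypothetical records: (subgroup, actual annual cost, predicted annual cost).
records = [
    ("chronic", 1000.0, 800.0),
    ("chronic", 3000.0, 2400.0),
    ("healthy", 100.0, 250.0),
    ("healthy", 300.0, 350.0),
]


def predictive_ratios(rows):
    """Return {subgroup: sum(predicted) / sum(actual)}."""
    predicted_sum = defaultdict(float)
    actual_sum = defaultdict(float)
    for group, actual, predicted in rows:
        actual_sum[group] += actual
        predicted_sum[group] += predicted
    return {g: predicted_sum[g] / actual_sum[g] for g in actual_sum}


ratios = predictive_ratios(records)
for group, ratio in ratios.items():
    print(f"{group}: predictive ratio = {ratio:.2f}")
```

Here the hypothetical "chronic" group is under-predicted (ratio below 1) and the "healthy" group is over-predicted (ratio above 1), even though a single overall R² would not reveal which groups are affected.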

Israel: Based on confidential estimates of a sickness fund

Colombia (as in Israel): Adds variables to the current formula and explores potential interactions. Estimates with conventional and machine learning methods.

Colombia – Step I: New Specification
Add to the current state formula (UPC), which is based on a linear regression on gender, age groups, location, and their two-way interactions: enrollees' morbidity, characterized by 29 long-term disease groups (Dx); and the severity of the health condition, using indicators of hospitalization (H), consultation with specialists (E), and admission to an intensive care unit (U).
Specification 1: UPC + Dx + H + E + U
Specification 2: UPC * H * E * U + Dx
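The difference between the two specifications can be made concrete by listing their terms. The sketch below assumes the slide uses the R-style formula convention, in which `a * b` stands for the main effects plus all of their interactions; the slide does not state this, so it is an assumption. The helper `expand_star` is a hypothetical name introduced only for this illustration.

```python
from itertools import combinations


def expand_star(factors):
    """Expand 'a * b * c' (R-style convention) into main effects and all interactions."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))  # ':' marks an interaction term
    return terms


# UPC + Dx + H + E + U: main effects only.
additive_terms = ["UPC", "Dx", "H", "E", "U"]

# UPC * H * E * U + Dx: every interaction among UPC, H, E and U, plus Dx as a main effect.
interacted_terms = expand_star(["UPC", "H", "E", "U"]) + ["Dx"]

print(len(additive_terms), "terms:", additive_terms)
print(len(interacted_terms), "terms:", interacted_terms)
```

Under this reading, the second specification lets the effect of, say, a hospitalization differ by UPC cell and by whether the enrollee also saw a specialist or was admitted to intensive care, at the cost of many more terms to estimate.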

Data: Panel of claims, 2010 and 2011

Estimation Models
The linear model is estimated through weighted least squares.
Three machine learning models: artificial neural networks (ANN), random forests (RF), and boosted trees (GBM).
To control for selection bias, some specifications of the machine learning models include an additional regressor: the probability of claiming a service, since 20% of enrollees do not claim any service during the year.
Negative predictions of the machine learning models were truncated at zero.
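Two of the steps above, weighted least squares and truncation of negative predictions at zero, can be sketched in a few lines. The example uses a single predictor and hypothetical data and weights; the slides do not say what the weights are, and they describe truncation for the machine learning models, so applying it to linear predictions here is only for illustration.

```python
def fit_weighted_line(x, y, w):
    """Weighted least squares with one predictor; returns (intercept, slope).

    Minimizes sum(w_i * (y_i - intercept - slope * x_i) ** 2).
    """
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def truncate_at_zero(predictions):
    """Replace negative predicted costs with zero."""
    return [max(0.0, p) for p in predictions]


# Hypothetical data: a risk score, annual cost, and observation weights.
risk_score = [1, 2, 3]
cost = [0, 100, 300]
weights = [1, 2, 1]

intercept, slope = fit_weighted_line(risk_score, cost, weights)

new_scores = [0, 1, 4]
raw_predictions = [intercept + slope * s for s in new_scores]
final_predictions = truncate_at_zero(raw_predictions)

print("raw:      ", raw_predictions)
print("truncated:", final_predictions)
```

Truncation matters because a predicted cost below zero has no meaningful interpretation as a payment, and low-risk enrollees are the ones most likely to receive such predictions.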

Criteria for Evaluating Models

Data

Entire Distribution Estimate

Lowest Quintile Estimate

Upper Quintile Estimate

Conclusions (for Colombia and Israel)
Risk adjustment policy can redistribute resources more efficiently by adjusting for enrollees' health conditions and by using non-parametric specifications, which capture the nonlinear relations between risk factors better than linear models do.
The non-parametric machine learning approach appears (in Colombia) to perform well for the entire cost distribution, poorly at its lower end, and well at its higher end.
The Israeli political economy associated with Big Data and Machine Learning suggests that sickness funds "are tempted" to use the new technological options for risk selection.

References
Buchner, F., Wasem, J., & Schillo, S. 2015. "Regression Trees Identify Relevant Interactions: Can This Improve the Predictive Performance of Risk Adjustment?" Health Economics 26(1): 74-85.
Li, L., Bagheri, S., Goote, H., Hasan, A., & Hazard, G. 2013. "Risk Adjustment of Patient Expenditures: A Big Data Analytics Approach." IEEE International Conference on Big Data, 2013 (pp. 12-14).
Rose, S. 2016. "A Machine Learning Framework for Plan Payment Risk Adjustment." Health Services Research 51:6, Part I.

Some more food for thought... Thanks!