Evaluating Risk Adjustment Models Andy Bindman MD Department of Medicine, Epidemiology and Biostatistics.


Evaluating Model’s Predictive Power
• Linear regression (continuous outcomes)
• Logistic regression (dichotomous outcomes)

Evaluating Linear Regression Models
• R² is the percentage of variation in outcomes explained by the model; best for continuous dependent variables
  – Length of stay
  – Health care costs
• Ranges from 0-100%
• Generally more is better
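
As a minimal sketch of the definition above, R² can be computed as one minus the ratio of residual variation to total variation. The function name and the length-of-stay numbers below are hypothetical and only illustrate the arithmetic.

```python
def r_squared(observed, predicted):
    """Share of outcome variation explained: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


# Hypothetical lengths of stay (days) and the model's predictions for them
length_of_stay = [2.0, 4.0, 6.0, 8.0]
predicted_stay = [3.0, 4.0, 6.0, 7.0]
print(f"R^2 = {r_squared(length_of_stay, predicted_stay):.0%}")
```

A model that predicted every stay exactly would score 100%; one that only ever predicted the mean would score 0%.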

Risk Adjustment Models
• Typically explain only 20-25% of variation in health care utilization
• Explaining this amount of variation can be important if remaining variation is extremely random
• Example: supports equitable allocation of capitation payments from health plans to providers

More to Modeling than Numbers
• R² is biased upward by more predictors
• Approach to categorizing outliers can affect R², as predicting less skewed data gives higher R²
• Model is subject to random tendencies of the particular dataset
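
One common way to offset the upward bias from extra predictors is the adjusted R², which is not mentioned in the slides and is shown here only as an illustrative assumption. The sketch below uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1); the function name and sample numbers are hypothetical.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors (standard adjusted R^2)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical: the same raw R^2 of 25% on 101 patients, with 5 vs 20 predictors
for p in (5, 20):
    print(p, "predictors:", round(adjusted_r_squared(0.25, 101, p), 3))
```

With the same raw R², the model using more predictors receives the lower adjusted value.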

Evaluating Logistic Models
• Discrimination - accuracy of predicting outcomes among all individuals depending on their characteristics
• Calibration - how well prediction works across the range of risk

Discrimination
• C index - compares all pairs of individuals drawn one from each outcome group (alive vs dead) to see if the risk adjustment model predicts a higher likelihood of death for those who died (concordant)
• Ranges from 0-1 based on proportion of concordant pairs and half of ties
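
The pairwise definition above can be sketched directly: pair every person who died with every person who survived, count a pair as concordant when the model gave the person who died the higher risk, and give ties half credit. The function name and the predicted risks are hypothetical.

```python
def c_index(risk_died, risk_survived):
    """Proportion of (died, survived) pairs that are concordant, ties count half."""
    concordant = 0
    ties = 0
    for d in risk_died:
        for s in risk_survived:
            if d > s:
                concordant += 1
            elif d == s:
                ties += 1
    n_pairs = len(risk_died) * len(risk_survived)
    return (concordant + 0.5 * ties) / n_pairs


# Hypothetical predicted risks of death
died = [0.30, 0.10]
survived = [0.05, 0.10, 0.20]
print(c_index(died, survived))  # 4 concordant pairs + 1 tie out of 6 pairs
```

This brute-force loop is quadratic in the number of patients, which is fine for illustration but slow for large datasets.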

Adequacy of Risk Adjustment Models
• C index of 0.5 no better than random
• C index of 1.0 indicates perfect prediction
• Typical risk adjustment models

C statistic
• Area under the ROC curve for a predictive model no better than chance at predicting death is 0.5
• Models with improved prediction of death by
  – 0.5 SDs better than chance result in c statistic = 0.64
  – 1.0 SDs better than chance result in c statistic = 0.76
  – 1.5 SDs better than chance result in c statistic = 0.86
  – 2.0 SDs better than chance result in c statistic = 0.92
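
The slide does not say how these figures were derived. One assumption that reproduces them is that predicted risk scores are normally distributed with equal variance among those who died and those who survived, with means d standard deviations apart; the c statistic is then Φ(d/√2), where Φ is the standard normal CDF. A sketch under that assumption (function name hypothetical):

```python
import math


def c_statistic_from_separation(sd_units):
    """c = Phi(d / sqrt(2)) under an equal-variance normal assumption.

    Since Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), this simplifies to
    0.5 * (1 + erf(d / 2)).
    """
    return 0.5 * (1.0 + math.erf(sd_units / 2.0))


for d in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"{d:.1f} SDs -> c = {c_statistic_from_separation(d):.2f}")
```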

Best Model Doesn’t Always Have Biggest C statistic
• Adding health conditions that result from complications will raise the c statistic of the model but not make the model better for predicting quality.

Spurious Assessment of Model Performance
• Missing values can lead to some patients being dropped from models
• Be certain when comparing models that the same group of patients is being used for all models; otherwise comparisons may reflect more than model performance

Calibration - Hosmer-Lemeshow
• Size of C index does not indicate how well the model performs across the range of risk
• Stratify individuals into groups (e.g. 10 groups) of equal size according to predicted likelihood of adverse outcome (e.g. death)
• Compare actual vs expected outcomes for each stratum
• Want a non significant p value for each stratum and across strata (Hosmer-Lemeshow statistic)

Hosmer-Lemeshow
• For k strata the chi squared has k-2 degrees of freedom
• Can obtain false negative (non significant p value) by having too few cases in a stratum
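
The stratify-and-compare procedure can be sketched as follows. This is an illustrative implementation of the usual Hosmer-Lemeshow chi-squared sum, not code from the presentation; the function name and data are hypothetical, and it assumes at least as many patients as strata and predicted risks strictly between 0 and 1.

```python
def hosmer_lemeshow(predicted, died, n_groups=10):
    """Return (chi-squared statistic, degrees of freedom) for k equal-size strata."""
    pairs = sorted(zip(predicted, died))  # order patients by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(n_groups):
        stratum = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(stratum)
        expected = sum(p for p, _ in stratum)  # expected deaths in stratum
        observed = sum(d for _, d in stratum)  # actual deaths in stratum
        # (O - E)^2 / (n * pbar * (1 - pbar)), where pbar = E / n
        statistic += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return statistic, n_groups - 2


# Hypothetical data: 4 strata of 5 patients each
predicted = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
died = [1, 1, 0, 0, 0] + [1, 1, 0, 0, 0] + [1, 1, 1, 0, 0] + [1, 1, 1, 0, 0]
print(hosmer_lemeshow(predicted, died, n_groups=4))
```

The p value is then read from a chi-squared distribution with k-2 degrees of freedom, for example with `scipy.stats.chi2.sf(statistic, df)` if SciPy is available.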

Calculating Expected Outcomes
• Solve the multivariate model incorporating an individual’s specific characteristics
• For continuous outcomes the predicted values are the expected values
• For dichotomous outcomes the sum of the derived predictor variables produces a “logit” which can be algebraically converted to a probability
• $p = e^{\text{log odds}} / (1 + e^{\text{log odds}})$, where log odds is the natural log of the odds

Individual’s CABG Mortality Risk
• 65 y.o. obese non white woman with diabetes and serum creatinine of 1 mg/dl presents with an urgent need for CABG surgery. What is her risk of death?

Individual’s Predicted CABG Mortality Risk
• 65 y.o. obese non white woman with diabetes presents with an urgent need for CABG surgery. What is her risk of death?
• Log odds = (0.06) (1.15) +.09 = -3.39
• Probability of death = $e^{-3.39} / (1 + e^{-3.39})$ = 0.034/1.034 = 3.3%
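
The conversion from log odds to probability can be checked with a short sketch. It takes the log odds as -3.39, the value consistent with the 0.034/1.034 shown on the slide; the function name is hypothetical.

```python
import math


def logit_to_probability(log_odds):
    """Convert natural log odds to a probability: e^x / (1 + e^x)."""
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


risk = logit_to_probability(-3.39)
print(f"Predicted risk of death: {risk:.1%}")
```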

Observed CABG Mortality Risk
• Actual outcome of whether individual lived or died
• Observed rate for a group is number of deaths per the number of people in that group

[Figure: Actual and Expected CABG Surgery Mortality Rates by Patient Severity of Illness in New York. Chi squared p = .16]

Goodness-of-fit tests for AMI mortality models

Stratifying by Risk
• Hosmer-Lemeshow provides a summary statistic of how well the model is calibrated
• Also useful to look at how well the model performs at extremes (high risk and low risk)

Validating Model – Eye Ball Test
• Face validity/Content validity
• Does the empirically derived model correspond to a predetermined conceptual model?
• If not, is that because of highly correlated predictors? A dataset limitation? A modeling error?

Validating Model in Other Datasets: Predicting Mortality following CABG
[Table: C statistic by dataset (STS, NY, VA, Duke, MN); values not preserved]
Jones et al, JACC, 1996

Recalibrating Risk Adjustment Models
• Necessary when the observed outcome rate differs from the expected rate derived from a different population
• This could reflect quality of care or differences in coding practices
• Assumption is that the relative weights of predictors to one another are correct
• Recalibration is an adjustment to all predictor coefficients to force the average expected outcome rate to equal the observed outcome rate

Recalibrating Risk Adjustment Models
• New York AMI mortality rate is 15%
• California AMI mortality rate is 13%
• Is care or coding different?
• To use a New York derived risk adjustment model to predict expected deaths in California, need to adjust predictors (e.g. multiply by 13/15)
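
One simple reading of "multiply by 13/15" is to scale every patient's predicted risk by the ratio of the observed rate to the average expected rate, so that the average expected rate matches the observed one while relative risks are preserved. The slide does not specify the exact procedure, so the sketch below is an assumption; the function name and risks are hypothetical.

```python
def recalibrate(expected_risks, observed_rate):
    """Scale predicted risks so their average equals the observed rate."""
    mean_expected = sum(expected_risks) / len(expected_risks)
    factor = observed_rate / mean_expected  # e.g. 0.13 / 0.15
    # Cap at 1.0 so a scaled risk never exceeds certainty
    return [min(1.0, risk * factor) for risk in expected_risks]


# Hypothetical risks from a New York derived model (average 15%)
new_york_risks = [0.05, 0.10, 0.30]
california_risks = recalibrate(new_york_risks, observed_rate=0.13)
print([round(r, 3) for r in california_risks])
print(round(sum(california_risks) / len(california_risks), 3))
```

Other recalibration approaches exist, such as shifting only the model intercept; this sketch shows just the proportional scaling.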

Summary
• Summary statistics provide a means for evaluating the predictive power of multivariate models
• Care should be taken to look beyond summary statistics to ensure that the model is not overspecified and that it conforms to a conceptual model
• Models should be validated with internal and ideally external data
• Next time we will review how a risk-adjustment model can be used to identify providers who perform better and worse than expected given their patient mix