Evaluating Risk Adjustment Models
Andy Bindman, MD
Department of Medicine, Epidemiology and Biostatistics
Goals of Risk Adjustment
- Account for pertinent patient characteristics before making inferences about effectiveness, efficiency, or quality of care
- Minimize confounding bias due to nonrandom assignment of patients to different providers or systems of care
- Confirm the importance of specific predictors
Why Risk Adjustment?
- Monitoring and comparing outcomes of care (death, readmission, adverse events, functional status, quality of life)
- Monitoring and comparing utilization of services and resources (length of stay, cost)
- Monitoring and comparing patient satisfaction
- Monitoring and comparing processes of care
How Is Risk Adjustment Done?
- On large datasets
- Using measured differences between the compared groups
- By modeling the impact of measured differences between groups on variables shown, known, or thought to predict the outcome, so as to isolate the effect of the predictor variable of interest (see the sketch below)
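A minimal sketch of this idea in Python with statsmodels, on simulated stand-in data (all names here, such as `died`, `chf`, and `hospital_b`, are invented for illustration): the measured patient characteristics enter as covariates so that the coefficient on the predictor of interest, the hospital, is adjusted for them.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated patient-level data: outcome (died), measured patient
# characteristics, and the predictor of interest (hospital B vs. A).
rng = np.random.default_rng(0)
n = 2000
patients = pd.DataFrame({
    "age":        rng.normal(65, 10, n),
    "chf":        rng.binomial(1, 0.2, n),
    "hospital_b": rng.binomial(1, 0.5, n),
})
logit = -6 + 0.06 * patients["age"] + 0.8 * patients["chf"]
patients["died"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Patient characteristics enter as covariates, so the hospital_b
# coefficient estimates the hospital effect on mortality adjusted
# for those measured differences between the compared groups.
model = smf.logit("died ~ age + chf + hospital_b", data=patients).fit(disp=0)
print(model.summary())
print("adjusted odds ratio, hospital B vs. A:",
      float(np.exp(model.params["hospital_b"])))
```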
When Risk Adjustment May Be Inappropriate
- Processes of care that virtually every patient should receive (e.g., immunizations, discharge instructions)
- Adverse outcomes that virtually no patient should experience (e.g., incorrect amputation)
- Nearly certain outcomes (e.g., death in a patient with prolonged CPR in the field)
- Too few adverse outcomes per provider
When Risk Adjustment May Be Unnecessary
- If inclusion and exclusion criteria can adequately adjust for differences
- If assignment of patients is random or quasi-random
When Risk Adjustment May Be Impossible
- If selection bias is an overwhelming problem
- If outcomes are missing or unknown for a large proportion of the sample
- If risk factor data (predictors) are extremely unreliable, invalid, or incomplete
Data Sources for Risk Adjustment
- Administrative data are collected primarily for other purposes but are commonly used for risk adjustment
- Medical records data are more difficult to use but contain far more information
- Patient surveys may complement either or both of the other sources
Advantages of Administrative Data
- Universally inclusive and population-based
- Computerized; inexpensive to obtain and use
- Uniform definitions
- Ongoing data monitoring and evaluation
- Diagnostic coding (ICD-9-CM) guidelines
- Opportunities for linkage (vital statistics, cancer registries)
Disadvantages of Administrative Data
- Missing key information about physiologic and functional status
- No control over the data collection process
- Quality of diagnostic coding varies across hospitals
- Incentives to upcode (DRG creep) and possibly to avoid coding complications
- Inherent limitations of ICD-9-CM
Doing Your Own Risk Adjustment vs. Using an Existing Product
- Is an existing product available or affordable?
- Would an existing product meet my needs?
  - Developed on a similar patient population
  - Applied previously to the same condition or procedure
  - Data requirements match availability
  - Conceptual framework is plausible and appropriate
  - Known validity
Conditions Favoring Use of an Existing Product
- Need to study multiple diverse conditions or procedures
- Limited analytic resources
- Need to benchmark performance against an external norm
- Need to compare performance with other providers using the same product
- Focus on resource utilization, possibly mortality
A Quick Survey of Existing Products: Hospital/General Inpatient
- APR-DRGs (3M)
- Disease Staging (SysteMetrics/MEDSTAT)
- Patient Management Categories (PRI)
- RAMI/RACI/RARI (HCIA)
- Atlas/MedisGroups (MediQual)
- Cleveland Health Quality Choice
- Public domain (MMPS, CHOP, CSRS, etc.)
A Quick Survey of Existing Products: Intensive Care
- APACHE
- MPM
- SAPS
- PRISM
A Quick Survey of Existing Products: Outpatient Care
- Resource-Based Relative Value Scale (RBRVS)
- Ambulatory Patient Groups (APGs)
- Physician Care Groups (PCGs)
- Ambulatory Care Groups (ACGs)
How Do Commercial Risk-Adjustment Tools Perform?
- Better predictors of utilization and death than age and sex alone
- Better retrospectively (~30-50% of variation explained) than prospectively (~10-20%)
- Lack of agreement among measures: more than 20% of inpatients were assigned very different severity scores depending on which tool was used (Iezzoni, Ann Intern Med, 1995)
Building Your Own Risk-Adjustment Model
- Previous literature
- Expert opinion
  - Generate specific hypotheses and plausible mechanisms
  - Translate clinically important concepts into measurable variables (e.g., cardiogenic shock)
  - Separate factors that are risks for the disease from those that are complications of treatment
- Data dredging (retrospective)
Empirical Testing of Risk Factors
- Univariate/bivariate analyses to eliminate low-frequency, insignificant, or counterintuitive factors
- Test variables for linear, exponential, or threshold effects
- Test for interactions (see the sketch below)
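A minimal sketch of such testing in Python with statsmodels, on simulated data: likelihood-ratio tests (one reasonable choice; the slide does not name a specific test) compare a base logistic model against one adding a threshold term, with a hypothetical cutoff at age 75, and against one adding an age-by-CHF interaction.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Simulated data with an age effect and a CHF effect on mortality.
rng = np.random.default_rng(0)
n = 2000
patients = pd.DataFrame({"age": rng.normal(65, 10, n),
                         "chf": rng.binomial(1, 0.2, n)})
logit = -6 + 0.06 * patients["age"] + 0.8 * patients["chf"]
patients["died"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

base = smf.logit("died ~ age + chf", data=patients).fit(disp=0)

# Threshold effect: does adding a cutoff at age 75 improve the fit?
thresh = smf.logit("died ~ age + I(age >= 75) + chf",
                   data=patients).fit(disp=0)
lr = 2 * (thresh.llf - base.llf)          # 1 extra parameter
print("threshold effect p =", stats.chi2.sf(lr, df=1))

# Interaction: does the effect of CHF differ by age?
inter = smf.logit("died ~ age * chf", data=patients).fit(disp=0)
lr = 2 * (inter.llf - base.llf)           # 1 extra parameter
print("interaction p =", stats.chi2.sf(lr, df=1))
```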
Potential Risk Factors for CABG Outcomes
- Age, gender, race, height, weight, BMI
- Ejection fraction, New York Heart Association class, number of vessels
- Comorbidity: hypertension, CHF, COPD, diabetes, hepatic failure, renal failure, calcified aorta
- Acute treatment/complications: IABP, thrombolysis, PTCA, PTCA complication, hemodynamic instability
- Past history: previous surgery, PTCA, MI, stroke, fem-pop bypass
- Behaviors: smoking
Significant Risk Factors for Hospital Mortality for Coronary Artery Bypass Graft Surgery in New York State, 1989-1992
[Table not reproduced in this transcript.]
Risk Factors in Large Data Sets: Can You Have Too Much Power?
- Clinical vs. statistical importance
- The risk of overfitting, and the need for a comprehensible model, mandate data reduction
- Consider forcing in clinically important predictors
Evaluating Model Quality
- Linear regression (continuous outcomes)
- Logistic regression (dichotomous outcomes)
Evaluating Linear Regression Models
- R² is the percentage of variation in the outcome explained by the model; best for continuous dependent variables
- Ranges from 0-100%
- Generally more is better, but R² is biased upward by adding more predictors
- Sometimes explaining a small amount of variation is still important (see the sketch below)
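A minimal sketch in Python with statsmodels on simulated data, with length of stay as the continuous outcome: `rsquared` gives R², and `rsquared_adj` applies the standard penalty for the number of predictors.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data: length of stay (continuous outcome) and predictors.
rng = np.random.default_rng(0)
n = 500
patients = pd.DataFrame({"age": rng.normal(65, 10, n),
                         "chf": rng.binomial(1, 0.2, n)})
patients["los"] = (2 + 0.05 * patients["age"] + 1.5 * patients["chf"]
                   + rng.normal(0, 2, n))

ols = smf.ols("los ~ age + chf", data=patients).fit()

# rsquared: share of outcome variation explained (0-100% as a
# percentage); rsquared_adj penalizes the predictor count, countering
# the upward bias from simply adding more variables.
print(f"R^2 = {ols.rsquared:.1%}, adjusted R^2 = {ols.rsquared_adj:.1%}")
```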
Evaluating Logistic Models
- The c statistic considers all pairs of individuals with different outcomes (alive vs. dead) and measures how often the risk-adjustment model predicts a higher likelihood of death for the patient who died (see the sketch below)
- Ranges from 0 to 1
- A c value of 0.5 means the model is no better than chance
- A c value of 1.0 indicates perfect discrimination
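A minimal sketch of the c statistic in Python: the pairwise definition is computed directly on simulated outcomes and predicted risks, then checked against the equivalent area under the ROC curve from scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Simulated predicted risks and observed deaths.
rng = np.random.default_rng(0)
p = rng.uniform(size=200)
y = rng.binomial(1, p)

# Pairwise definition: among all (died, survived) pairs, the share in
# which the patient who died was assigned the higher predicted risk
# (ties count half).
died, lived = p[y == 1], p[y == 0]
wins = (died[:, None] > lived[None, :]).sum()
ties = (died[:, None] == lived[None, :]).sum()
c = (wins + 0.5 * ties) / (len(died) * len(lived))

print("c statistic:", c)
print("ROC AUC    :", roc_auc_score(y, p))  # identical by construction
```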
How Well the Model Predicts Outcomes Across the Range of Risk: the Hosmer-Lemeshow Test
- Stratify individuals into groups of equal size (e.g., 10 groups) according to the predicted likelihood of the adverse outcome (e.g., death)
- Compare actual vs. predicted deaths within each stratum
- Hosmer-Lemeshow chi-square statistic (8 degrees of freedom for 10 deciles)
- Here the goal is a nonsignificant p value, indicating no detectable lack of fit (see the sketch below)
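A minimal sketch of the Hosmer-Lemeshow computation in Python on simulated data; no packaged implementation is assumed, so the statistic is assembled by hand from the definition above.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated predicted risks and observed outcomes.
rng = np.random.default_rng(0)
p = rng.uniform(size=200)
y = rng.binomial(1, p)

def hosmer_lemeshow(y, p, groups=10):
    """Chi-square comparing observed vs. expected events by risk stratum."""
    df = pd.DataFrame({"y": y, "p": p})
    df["stratum"] = pd.qcut(df["p"], groups, labels=False, duplicates="drop")
    g = df.groupby("stratum")
    obs = g["y"].sum()             # actual deaths per stratum
    exp = g["p"].sum()             # predicted deaths per stratum
    n = g.size()
    # Sum over strata of (O - E)^2 / (E * (1 - E/n)).
    chi2 = (((obs - exp) ** 2) / (exp * (1 - exp / n))).sum()
    dof = g.ngroups - 2            # 8 degrees of freedom for 10 deciles
    return chi2, stats.chi2.sf(chi2, dof)

chi2, pval = hosmer_lemeshow(y, p)
print(f"HL chi2 = {chi2:.2f}, p = {pval:.3f}")  # nonsignificant p is the goal
```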
[Figure: Actual and expected mortality rates for different levels of patient severity of illness; chi-square p = 0.16.]
[Table: Goodness-of-fit tests for AMI mortality models. OSHPD AMI Outcomes Project, 1996.]
Aggregating to the Group Level
- Sum observed and predicted events within each group
- Statistical problems arise when the total number of predicted events is small
- As a rule of thumb, chi-square comparisons of groups require a minimum of five expected events per group
Comparing Observed and Expected Outcomes
- Observed events or rates of events
- Expected events or rates of events
- Risk-adjusted events or rates = (site-specific observed / site-specific expected) × average observed rate across all sites (see the sketch below)
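A minimal sketch in Python on invented patient-level data, covering both this slide and the previous one: observed and model-expected deaths are summed per site, the five-expected-events rule of thumb is checked, and the O/E ratio is rescaled by the overall average observed rate.

```python
import pandas as pd

# Invented patient-level results: site, observed outcome, and the
# model-predicted probability of death for each patient.
df = pd.DataFrame({
    "site":    ["A"] * 4 + ["B"] * 4,
    "died":    [0, 1, 0, 1, 0, 0, 1, 0],
    "p_death": [0.2, 0.6, 0.1, 0.5, 0.3, 0.2, 0.4, 0.1],
})

by_site = df.groupby("site").agg(
    n=("died", "size"),
    observed=("died", "sum"),     # summed observed events
    expected=("p_death", "sum"),  # summed model-predicted events
)

# Rule of thumb: chi-square comparisons want >= 5 expected events per group.
by_site["enough_events"] = by_site["expected"] >= 5

# Risk-adjusted rate = (observed / expected) x overall average observed rate.
overall_rate = df["died"].mean()
by_site["adjusted_rate"] = (by_site["observed"] / by_site["expected"]) * overall_rate
print(by_site)
```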
Validating the Model
- Face validity/content validity
- Gold standard: external validation with new data
- Separate development and validation data sets (see the sketch below)
  - Randomly split samples
  - Samples from different time periods or areas
- Re-estimate the model using all available data
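A minimal sketch of the randomly-split development/validation check in Python with scikit-learn, again on simulated data: the model is fit on the development half and its c statistic is recomputed on the held-out validation half.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Simulated data, as in the earlier sketches.
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({"age": rng.normal(65, 10, n),
                  "chf": rng.binomial(1, 0.2, n)})
logit = -6 + 0.06 * X["age"] + 0.8 * X["chf"]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Randomly split into development and validation samples.
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.5,
                                              random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)

# Discrimination typically drops on held-out data; a large drop
# suggests the model was overfit to the development sample.
print("development c:", roc_auc_score(y_dev, model.predict_proba(X_dev)[:, 1]))
print("validation  c:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
```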
Bootstrap Procedure: "If Things Had Been a Little Different"
- Draw multiple (e.g., 1,000) random samples from the original sample, with replacement
- Estimate the model's performance in each new random sample
- Confidence intervals for the model coefficients can be derived from the empirical results across the "new samples" (see the sketch below)
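A minimal sketch of the bootstrap in Python on simulated data: the model is refit in 1,000 resamples drawn with replacement, and an empirical 95% confidence interval for one coefficient is read off the resulting distribution.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data, as in the earlier sketches.
rng = np.random.default_rng(0)
n = 1000
patients = pd.DataFrame({"age": rng.normal(65, 10, n),
                         "chf": rng.binomial(1, 0.2, n)})
logit = -6 + 0.06 * patients["age"] + 0.8 * patients["chf"]
patients["died"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Refit the same model in many resamples drawn with replacement
# ("if things had been a little different").
coefs = []
for _ in range(1000):
    boot = patients.sample(n=len(patients), replace=True,
                           random_state=int(rng.integers(2**32 - 1)))
    fit = smf.logit("died ~ age + chf", data=boot).fit(disp=0)
    coefs.append(fit.params["chf"])

# Empirical 95% CI for the CHF coefficient from the bootstrap draws.
lo, hi = np.percentile(coefs, [2.5, 97.5])
print(f"CHF coefficient 95% CI: ({lo:.2f}, {hi:.2f})")
```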
Consistency in the Evidence
- Similar findings over time help to rule out random effects
- Differences between observed and expected outcomes may be due to things other than quality
- Confirmation through very different types of evidence is a major goal
- View risk-adjusted estimates as yellow flags, not smoking guns