Modeling Diabetic Hospitalizations for the TennCare Population
Application of Predictive Modeling for Care Management Panel
AcademyHealth Annual Research Meeting
June 28, 2005, Boston
Avery Ashby, MS
Soyal Momin, MS, MBA
Raymond Phillippi, PhD
Allen Naidoo, PhD
Judy Slagle, RN, MPA
Background
Management Programs
BlueCross BlueShield of Tennessee provides care management programs for members with certain chronic illnesses or conditions.
Care managers are licensed nurses.
Diabetes is a prevalent chronic illness affecting our managed TennCare population.
Modeling diabetic inpatient hospitalizations can help identify members at higher risk and direct them to care management.
Methodology
Study Design
Diabetic members were identified using member-level claims data.
Data were collected for continuously enrolled diabetic members for the period of July 1, 2001 through June 30, 2003.
Year 1 member-specific data were used to model whether a diabetic hospitalization occurred in Year 2.
Logistic regression was employed to model the probability of a diabetic hospitalization in Year 2.
Time Period
Year 1 (July 1, 2001 through June 30, 2002): Member Specific Data
Year 2 (July 1, 2002 through June 30, 2003): Diabetic Hospitalization?
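The two-year design can be sketched in a few lines of Python. This is only an illustration of the windowing logic: the function names `study_year` and `year2_outcome` and the sample admission dates are hypothetical and are not taken from the study's actual data pipeline.

```python
from datetime import date

# Study windows as stated in the study design.
YEAR1 = (date(2001, 7, 1), date(2002, 6, 30))
YEAR2 = (date(2002, 7, 1), date(2003, 6, 30))

def study_year(service_date):
    """Return 1 or 2 for a date inside a study window, else None."""
    if YEAR1[0] <= service_date <= YEAR1[1]:
        return 1
    if YEAR2[0] <= service_date <= YEAR2[1]:
        return 2
    return None

def year2_outcome(diabetic_admit_dates):
    """Return 1 if any diabetic admission falls in Year 2, else 0 (hypothetical helper)."""
    return int(any(study_year(d) == 2 for d in diabetic_admit_dates))

# HYPOTHETICAL member with one diabetic admission in each study year.
admits = [date(2002, 3, 14), date(2003, 1, 9)]
print(study_year(admits[0]), year2_outcome(admits))
```

Year 1 claims would feed the covariates, and the Year 2 flag would serve as the dependent variable of the logistic regression.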
Data Elements
Demographics: Gender; Age; Zip Code (Metropolitan & Rural); Region (Multiple Regions); Eligibility (Medicaid subcategories, not including dual-eligible members)
Utilization: Diabetic Hospitalizations; Emergency Room Encounters; Ophthalmologist Encounters; Primary Care Physician (PCP) Encounters; Endocrinologist Encounters; Total Specialist Encounters
Pharmacy: Insulin Prescriptions (Prescribed or Not); Misc. Anti-diabetic Prescriptions (Prescribed or Not); Sulfonylurea Prescriptions (Prescribed or Not); Caloric Agents (Prescribed or Not); Total Prescriptions (Any variety)
Evidence Based Guidelines: Cholesterol Screening (Received or Not); Eye Examination (Received or Not); Microalbuminuria Screening (Received or Not); HbA1c Screening (Received or Not)
Diagnosis and Risk Score: Insulin Dependency (Dependent or Not); Total Co-morbidities; Diagnostic Cost Grouper (DCG) Risk Score
General Data Characteristics
Members: 11,002 (313 Year 2 Hospitalizations)
Gender: Female 64.7%
Age: Mean 47, Median 50
Predictive Model
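Model Specifics

$$P(\text{hospitalization}) = \frac{1}{1 + e^{-z}}$$

where

$$
\begin{aligned}
z ={}& \beta_1\,\text{(Diabetic Hospitalizations)} - \beta_2\,\text{(No Insulin Prescribed)} - \beta_3\,\text{(Age)} \\
&+ \beta_4\,\text{(Diagnostic Cost Grouper Risk Score)} + \beta_5\,\text{(No Misc. Anti-diabetic Prescribed)} \\
&+ \beta_6\,\text{(Ophthalmologist Encounters)} - \beta_7\,\text{(Primary Care Physician Encounters)} \\
&- \beta_8\,\text{(Non-Insulin Dependent)} + \beta_9\,\text{(Emergency Room Encounters)} \\
&- \beta_{10}\,\text{(Total Specialist Encounters)}
\end{aligned}
$$

Each $\beta_k$ stands for the magnitude of the fitted coefficient on that covariate, so the sign in front of a term gives the direction of its effect.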
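As a rough sketch of how the model turns a member's Year 1 data into a probability, the snippet below evaluates the logistic function for one member. The coefficient magnitudes, the reduced covariate set, the dictionary keys, and the member record are all hypothetical; only the signs of the terms follow the model above.

```python
import math

def hospitalization_probability(z):
    """Logistic link: p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_predictor(coefficients, member):
    """z = sum of coefficient * covariate value over the listed covariates."""
    return sum(coefficients[name] * member[name] for name in coefficients)

# HYPOTHETICAL coefficients: magnitudes are invented for illustration;
# only the signs follow the model on the slide.
coefficients = {
    "diabetic_hospitalizations": +0.80,
    "no_insulin_prescribed": -0.50,
    "age": -0.02,
    "dcg_risk_score": +0.10,
    "er_encounters": +0.15,
}

# HYPOTHETICAL member record built from Year 1 data.
member = {
    "diabetic_hospitalizations": 1,
    "no_insulin_prescribed": 0,
    "age": 47,
    "dcg_risk_score": 3.0,
    "er_encounters": 2,
}

z = linear_predictor(coefficients, member)
p = hospitalization_probability(z)
print(f"z = {z:.2f}, probability = {p:.3f}")
```

A positive term raises z and therefore the predicted probability; a negative term lowers it.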
Sensitivity vs. Specificity
Model Specifics: Odds Ratio Estimates
Columns: Covariate, Odds Ratio, Lower Limit, Upper Limit
Diabetic Hospitalizations
Insulin Prescribed (No vs. Yes)
Age
Diagnostic Cost Grouper Risk Score
Misc. Anti-diabetic Prescribed (No vs. Yes)
Ophthalmologist Encounters
Primary Care Physician Encounters
Insulin Dependency (No vs. Yes)
Emergency Room Encounters
Total Specialist Encounters
Diagnostics
Covariate                              Tolerance*
Diabetic Hospitalizations              0.92
Insulin Prescriptions                  0.19
Age                                    0.91
Diagnostic Cost Grouper Risk Score     0.66
Anti-diabetic Prescriptions            0.94
Ophthalmologist Encounters             0.94
Primary Care Physician Encounters      0.71
Insulin Dependency                     0.19
Emergency Room Encounters              0.54
Total Specialist Encounters            0.40
*Tolerance is $1 - R_x^2$, where $R_x^2$ is the proportion of variance in each covariate, X, explained by all of the other covariates.
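The tolerance definition can be illustrated with a short NumPy sketch that regresses each covariate on the others by ordinary least squares. The function name `tolerances` and the synthetic data are assumptions for illustration; this is not the software the authors used.

```python
import numpy as np

def tolerances(X):
    """Tolerance (1 - R^2) of each column of X regressed on the other columns."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    result = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # intercept + other covariates
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ beta
        ss_res = residual @ residual
        ss_tot = ((y - y.mean()) ** 2).sum()
        result[j] = ss_res / ss_tot  # equals 1 - R^2
    return result

# SYNTHETIC data: columns 0 and 1 are strongly related, column 2 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.5 * rng.normal(size=500)
c = rng.normal(size=500)
tol = tolerances(np.column_stack([a, b, c]))
print(np.round(tol, 2))
```

Two covariates that largely explain each other both receive low tolerances, which may be what the paired 0.19 values for Insulin Prescriptions and Insulin Dependency reflect.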
Goodness of Fit
Model Performance
Sensitivity: 12.1%
Specificity: 99.7%
Positive Predictive Value (PPV): 52.8%
Negative Predictive Value (NPV): 97.5%
Pseudo-R²: 0.223
Correct Prediction Rate: %
Prediction vs. Actual (members):
Actual No Stay: 10,689 total
Prediction totals: 72 Stay, 10,930 No Stay, 11,002 total
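The reported margins and rates are enough to work back to a full two-by-two table. The sketch below does that arithmetic, assuming the percentages on the slide are rounded to one decimal place. The cell counts it derives are a reconstruction, not figures quoted from the slide.

```python
# Margins as reported on the slide.
total = 11_002
predicted_stay = 72
predicted_no_stay = 10_930
actual_no_stay = 10_689
actual_stay = total - actual_no_stay  # members with a Year 2 hospitalization

# ASSUMPTION: true positives recovered by rounding PPV * predicted stays.
true_pos = round(0.528 * predicted_stay)
false_pos = predicted_stay - true_pos
false_neg = actual_stay - true_pos
true_neg = actual_no_stay - false_pos

sensitivity = true_pos / actual_stay
specificity = true_neg / actual_no_stay
ppv = true_pos / predicted_stay
npv = true_neg / predicted_no_stay
correct_rate = (true_pos + true_neg) / total

print(true_pos, false_pos, false_neg, true_neg)
print(f"sens={sensitivity:.1%} spec={specificity:.1%} "
      f"ppv={ppv:.1%} npv={npv:.1%} correct={correct_rate:.1%}")
```

Under this reconstruction the derived sensitivity, specificity, PPV, and NPV all round to the values shown on the slide, which suggests the recovered cells are consistent with it.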
Rational Artificial Intelligence
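Initial RAI Results
An artificial neural network (ANN) was trained and validated on the entire data set.
This was problematic because the ANN tried to maximize the overall correct prediction rate; with only 313 of 11,002 members hospitalized in Year 2, predicting "No Stay" for nearly everyone already scores well on that measure.
Results were similar to those of the logistic regression model.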
RAI Model Performance
Sensitivity: 10.9%
Specificity: 99.9%
Positive Predictive Value (PPV): 82.9%
Negative Predictive Value (NPV): 97.5%
Pseudo-R²: N/A
Correct Prediction Rate: %
Prediction vs. Actual (members):
Actual No Stay: 7 predicted Stay, 10,682 predicted No Stay, 10,689 total
Prediction totals: 41 Stay, 10,961 No Stay, 11,002 total
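A quick calculation shows why the overall correct prediction rate is a weak target for this population. The sketch below derives the missing Actual Stay row from the reported totals, then compares the ANN with a trivial rule that predicts "No Stay" for every member. The baseline rule is an illustration added here, not a model from the study.

```python
# Counts from the RAI slide; the Actual Stay row is derived from the totals.
tn, fp = 10_682, 7
tp = 41 - fp        # predicted Stay total minus false positives
fn = 10_961 - tn    # predicted No Stay total minus true negatives
total = tn + fp + tp + fn

ann_accuracy = (tp + tn) / total
baseline_accuracy = (tn + fp) / total  # always predict "No Stay"
ann_sensitivity = tp / (tp + fn)

print(f"ANN correct rate:      {ann_accuracy:.2%}")
print(f"Baseline correct rate: {baseline_accuracy:.2%}")
print(f"ANN sensitivity:       {ann_sensitivity:.1%}")
```

The two correct prediction rates differ by well under one percentage point, so a network rewarded on that rate has little incentive to find the hospitalized members.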
Forced Learning Solution
Collect equal samples from hospitalized and non-hospitalized members.
Build the ANN on this 1:1 (150:150) training data set.
Validate the ANN on the remaining out-of-sample members.
Repeat the process to ensure that the overall pattern is accounted for.
Develop credibility intervals for sensitivity, specificity, PPV, and NPV based on this repeated process.
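The sampling steps above can be sketched with the standard library alone. The function name `forced_learning_split`, the seeds, and the number of repeats are assumptions for illustration, and fitting the ANN itself is left out.

```python
import random

def forced_learning_split(labels, n_per_class=150, seed=None):
    """Return (train_idx, out_of_sample_idx) with n_per_class members of each class in training."""
    rng = random.Random(seed)
    stays = [i for i, y in enumerate(labels) if y == 1]
    no_stays = [i for i, y in enumerate(labels) if y == 0]
    train = rng.sample(stays, n_per_class) + rng.sample(no_stays, n_per_class)
    train_set = set(train)
    out_of_sample = [i for i in range(len(labels)) if i not in train_set]
    return train, out_of_sample

# Label vector matching the study's counts: 313 hospitalized, 10,689 not.
labels = [1] * 313 + [0] * 10_689

for repeat in range(3):  # the number of repeats here is arbitrary
    train, oos = forced_learning_split(labels, seed=repeat)
    # An ANN would be fit on `train` and scored on `oos` here (omitted).
    print(len(train), len(oos), sum(labels[i] for i in oos))
```

Each repeat trains on 300 members and leaves the rest, including the hospitalized members not drawn for training, for out-of-sample validation.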
Forced Learning Model Performance
Results of the repeated forced learning method were collected.
95% credibility intervals were derived from MCMC simulation using WinBUGS 1.4.
Sensitivity: [66.00%, 70.80%]
Specificity: [76.06%, 78.13%]
Negative Predictive Value (NPV): [98.36%, 98.73%]
Positive Predictive Value (PPV): [4.11%, 4.49%]
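The low PPV follows from Bayes' rule once prevalence is taken into account. The sketch below plugs in interval midpoints, reading the [66.00%, 70.80%] interval as sensitivity and the [76.06%, 78.13%] interval as specificity, and assumes each validation set holds the 163 hospitalized members left after 150 are drawn for training. This is a back-of-the-envelope check, not the WinBUGS analysis.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# ASSUMPTION: 150 of 313 hospitalized and 150 of 10,689 other members are
# used for training, so validation covers 163 of 10,702 members.
prevalence = (313 - 150) / (11_002 - 300)

# ASSUMPTION: midpoints of the intervals read as sensitivity and specificity.
sens_mid = (0.6600 + 0.7080) / 2
spec_mid = (0.7606 + 0.7813) / 2

estimate = ppv(sens_mid, spec_mid, prevalence)
print(f"prevalence = {prevalence:.2%}, PPV = {estimate:.2%}")
```

With hospitalized members making up under 2% of the validation set, even a specificity in the high 70s produces far more false positives than true positives.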
Research Implications
Finding a Balance
The choice of model begins with the question of allocated resources.
The logistic regression model and the initial ANN each identified a small percentage of members with an actual Year 2 hospitalization (sensitivity of 12.1% and 10.9%) with a "reasonable" PPV (52.8% and 82.9%).
The ANN using the Forced Learning Method identified a much larger percentage of members with an actual Year 2 hospitalization, but with a low PPV (4.11% to 4.49%).
Coverage
[Figure: predicted hospitalization among members with hospitalization and with no hospitalization, Logistic Regression Model compared with Forced Learning ANN]
Future Considerations
Add other covariates such as lab values, Health Risk Assessments (HRAs), and psychological indicators.
Use a meta-model in which clusters of homogeneous sub-groups are modeled separately, possibly with differing methods.
Model the probability of hospitalizations related to co-morbid conditions instead of diabetic hospitalizations.
Contact Information
Avery Ashby, MS
Senior Research Analyst
Health Intelligence Group
801 Pine Street – 3E
Chattanooga, TN