Phenotype generation from EMR by tensor factorization SEDI Durham Cohort James Lu M.D. Ph.D. Department of Electrical and Computer Engineering Department.


1 Phenotype generation from EMR by tensor factorization: SEDI Durham Cohort. James Lu, M.D., Ph.D., Department of Electrical and Computer Engineering, Department of Medicine

2 Health System Under Pressure: 3.2 trillion / yr (~21% of GDP)

3 Novel technology: small molecules, medical devices, biologics, diagnostics, genomics, transcriptomics, …
Operations: aligning incentives, risk sharing, quality metrics, reducing readmissions, Six Sigma/lean, …
● Where do I achieve cost arbitrage?
● How do we identify which patients to study?
● Where is my patient going to go next?
● Can we reorganize patient flow?

4 Computable phenotypes are a top-down process (PheKB, Northwestern)

5 Many variations of computable phenotypes require adjudication by physicians (Richesson et al., 2013), which is expensive and time-consuming.

6 EMR data is large and complicated (Durham County, 2007-2011)
Patient level
● >240,000 patients
● Birthday
● Death (where available)
● Gender
● Race
● Ethnicity
Visit level
● 4.4 million patient visits
● Average of 18 measurements recorded per visit
● Indicator of presence/absence of particular diseases (computed)
● Encounter date (start, end)
● Location (DHRH, DUH, DRH)
● Path (for example, ED -> inpatient)
● Inpatient / outpatient
Intervention level
● >60,000 types of observations: CPT, ICD9 diagnoses, ICD9 procedures, lab values, medications, vitals
Caveats
● Temporal gaps: people are only patients when they are sick
● We want to incorporate all of this information
● We don't want to be fooled by mistakes and bias

7 Decompose each touch with the health care system into its parts
● Each visit is a 5-D tensor (~1 billion elements) with modes:
  ● Patient
  ● Diagnosis / billing codes
  ● Labs
  ● Medications
  ● Time
● Model the entries as counts
● Decompose into a set of K rank-1 components
With Piyush Rai and Changwei Hui
[Figure: tensor diagram with axes labeled Codes, Labs, Medications, Time]
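The slide does not say which algorithm performs the decomposition, so the following is only a minimal sketch of one standard option: a non-negative CP (rank-1 sum) factorization fitted to counts under a Poisson likelihood with multiplicative updates. It uses three modes instead of five to stay short, and the names `poisson_cp`, `reconstruct`, `poisson_loss` and the synthetic data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def reconstruct(A, B, C):
    """Sum of K rank-1 terms: M[i, j, l] = sum_k A[i, k] * B[j, k] * C[l, k]."""
    return np.einsum("ik,jk,lk->ijl", A, B, C)

def poisson_loss(X, A, B, C, eps=1e-10):
    """Negative Poisson log-likelihood, up to a constant that depends only on X."""
    M = reconstruct(A, B, C)
    return float(np.sum(M - X * np.log(M + eps)))

def poisson_cp(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Non-negative CP factorization of a 3-way count tensor.

    Assumption: multiplicative (Lee-Seung style) updates for the KL/Poisson
    objective; the method used in the talk may differ.
    """
    rng = np.random.default_rng(seed)
    # One non-negative factor matrix per mode, each with `rank` columns.
    A, B, C = (rng.random((n, rank)) + 0.1 for n in X.shape)
    for _ in range(n_iter):
        ratio = X / (reconstruct(A, B, C) + eps)
        A *= np.einsum("ijl,jk,lk->ik", ratio, B, C) / (B.sum(0) * C.sum(0) + eps)
        ratio = X / (reconstruct(A, B, C) + eps)
        B *= np.einsum("ijl,ik,lk->jk", ratio, A, C) / (A.sum(0) * C.sum(0) + eps)
        ratio = X / (reconstruct(A, B, C) + eps)
        C *= np.einsum("ijl,ik,jk->lk", ratio, A, B) / (A.sum(0) * B.sum(0) + eps)
    return A, B, C

# Synthetic stand-in for a (patient x code x medication) count tensor.
rng = np.random.default_rng(42)
true_factors = [rng.gamma(1.0, 1.0, size=(n, 2)) for n in (30, 12, 8)]
counts = rng.poisson(reconstruct(*true_factors))

patients, codes, meds = poisson_cp(counts, rank=2)
print(patients.shape)  # (30, 2): one score per patient per latent factor
```

In this sketch each column of `codes` and `meds` describes one latent factor, and each row of `patients` gives that patient's loading on every factor, which is the kind of patient-by-factor matrix a downstream classifier could consume.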

8 Computational phenotypes are a bottom-up process. Factors represent latent phenotypes. Evaluated on 11,242 patients (~23 million data points) with morbidity outcomes in diabetes.
[Figure: top-loading items for Factor 2 and Factor 10. Labels shown: Alprazolam, Urate, Malignant Neoplasm Prostate, Clinical Trial Participation, Secondary Malignant Neoplasms of Bone, External Catheter Set, CEA, AG 15-3, Allopurinol, Evening Primrose Oil, Systemic Lupus Erythematosus, Side Effects from Statins, Shoulder Pain, Calcidiol, Jo-1]

9 Patients are composites of common and rare latent phenotypes.
[Figure: patient-by-factor score matrix, 40 most common phenotypes. Labeled factors: ER/EKG, standard labs (e.g., CBC/BMP), kidney disease, hypertension, surgical patient]

10 Compare outcome prediction to a known algorithm (UKPDS)
● UKPDS: UK Prospective Diabetes Study outcomes model, used to predict MI, death, and stroke
● 7 demographic + lab variables:
  ● Age, ethnicity, smoking status
  ● A1c, HDL, total cholesterol, and systolic BP
● Datasets compared:
  ● Original 7-variable model
  ● All data
  ● Non-negative matrix factorization
  ● Tensor factorization
● Can we predict outcomes in the next year?
  ● Death
  ● AMI
  ● Stroke
● Classification model:
  ● Fit data with random forests
  ● 10-fold cross-validation
With Joseph Lucas
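The evaluation protocol on this slide (the same random forest classifier scored by 10-fold cross-validation on each feature set) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions: the slide does not name a metric, so ROC AUC is assumed here; `cv_auc`, the feature-set names, and all data are hypothetical and synthetic, built so that the outcome depends on one factor. The printed numbers say nothing about the study's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def cv_auc(features, outcome, seed=0):
    """Mean ROC AUC of a random forest over 10 stratified folds (metric is an assumption)."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    scores = cross_val_score(model, features, outcome, cv=folds, scoring="roc_auc")
    return float(scores.mean())

# Synthetic stand-ins for two of the feature sets on the slide.
rng = np.random.default_rng(0)
n_patients = 400
factor_scores = rng.gamma(1.0, 1.0, size=(n_patients, 8))  # hypothetical tensor-factor scores
baseline_vars = rng.normal(size=(n_patients, 7))           # hypothetical 7-variable set (pure noise here)

# By construction, the synthetic one-year outcome depends only on the first factor.
risk = 1.0 / (1.0 + np.exp(-3.0 * (factor_scores[:, 0] - 1.0)))
outcome = (rng.random(n_patients) < risk).astype(int)

feature_sets = {"7-variable": baseline_vars, "tensor factors": factor_scores}
results = {name: cv_auc(X, outcome) for name, X in feature_sets.items()}
for name, auc in results.items():
    print(f"{name}: mean AUC = {auc:.2f}")
```

Running the same function once per outcome (death, AMI, stroke) and once per feature set would yield the kind of comparison grid the slide describes.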

11 Tensor-derived factors perform better than the original UKPDS model on all outcomes and provide comparable performance to the “all-data” model. Stroke is similar to Dat

