
Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping Jyotishman Pathak, PhD.


1 Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics June 11, 2012

2 SHARPn High-Throughput Phenotyping Project 3: Collaborators & Acknowledgments
- CDISC (Clinical Data Interchange Standards Consortium): Rebecca Kush, Landen Bain
- Centerphase Solutions: Gary Lubin, Jeff Tarlowe
- Group Health Seattle: David Carrell
- Harvard University/MIT: Guergana Savova, Peter Szolovits
- Intermountain Healthcare/University of Utah: Susan Welch, Herman Post, Darin Wilcox, Peter Haug
- Mayo Clinic: Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor
©2012 MFMER | slide-2

3 SHARPn High-Throughput Phenotyping

4 Phenotyping is still a bottleneck… [Image from Wikipedia]

5 SHARPn High-Throughput Phenotyping EHR systems: United States 2002–2011 [Millwood et al. 2012]

6 SHARPn High-Throughput Phenotyping Electronic health record (EHR)-driven phenotyping
- EHRs are becoming more and more prevalent within the U.S. healthcare system
- Meaningful Use is one of the major drivers
- Overarching goal: to develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

7 SHARPn High-Throughput Phenotyping http://gwas.org

8 SHARPn High-Throughput Phenotyping EHR-driven Phenotyping Algorithms - I
Typical components:
- Billing and diagnosis codes
- Procedure codes
- Labs
- Medications
- Phenotype-specific covariates (e.g., demographics, vitals, smoking status, CASI scores)
- Pathology
- Imaging?
Organized into inclusion and exclusion criteria
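The idea of organizing components into inclusion and exclusion criteria can be sketched in a few lines of Python. This is a minimal illustration only: the `Patient` record, the `select_cohort` helper, and every code and threshold below are hypothetical placeholders, not taken from any eMERGE or SHARPn algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical, heavily simplified patient record (assumption, not a CEM).
    patient_id: str
    diagnosis_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)
    labs: dict = field(default_factory=dict)  # lab name -> latest value


def select_cohort(patients, inclusion, exclusion):
    """Keep patients who meet every inclusion criterion and no exclusion criterion."""
    return [
        p for p in patients
        if all(criterion(p) for criterion in inclusion)
        and not any(criterion(p) for criterion in exclusion)
    ]


# Illustrative criteria; the codes and the threshold are placeholders.
def has_diagnosis(p):
    return "DX-EXAMPLE" in p.diagnosis_codes


def has_abnormal_lab(p):
    return p.labs.get("LAB-EXAMPLE", 0.0) > 10.0


def on_confounding_medication(p):
    return "MED-EXAMPLE" in p.medications


patients = [
    Patient("001", {"DX-EXAMPLE"}, set(), {"LAB-EXAMPLE": 12.5}),
    Patient("002", {"DX-EXAMPLE"}, {"MED-EXAMPLE"}, {"LAB-EXAMPLE": 15.0}),
    Patient("003", set(), set(), {"LAB-EXAMPLE": 11.0}),
]

cohort = select_cohort(
    patients,
    inclusion=[has_diagnosis, has_abnormal_lab],
    exclusion=[on_confounding_medication],
)
print([p.patient_id for p in cohort])  # ['001']
```

Keeping each criterion as a small named function makes the inclusion and exclusion lists easy to review against a written phenotype definition.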

9 SHARPn High-Throughput Phenotyping EHR-driven Phenotyping Algorithms - II [Diagram labels: Data; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings] [eMERGE Network]

10 SHARPn High-Throughput Phenotyping Example: Hypothyroidism Algorithm [Denny et al., 2012]
Flowchart criteria:
- No secondary causes (e.g., pregnancy, ablation)
- No ICD-9s for hypothyroidism
- No abnormal TSH/FT4
- No antibodies for TTG/TPO
- ICD-9s for hypothyroidism
- Antibodies for TTG or TPO (anti-thyroglobulin, anti-thyroperoxidase)
- Abnormal TSH/FT4
- No thyroid-altering medications (e.g., phenytoin, lithium)
- Thyroid replacement meds
- No thyroid replacement meds
- 2+ non-acute visits in 3 yrs
- No hx of myasthenia gravis
Outcomes: Case 1, Case 2, Control

11 SHARPn High-Throughput Phenotyping Hypothyroidism Algorithm: Validation [Denny et al., 2012]
Positive Predictive Values (PPV) Based on Chart Review – All Sites

Site         | EHR-based Cases/Controls | Sampled for Chart Review Cases/Controls | Old Case PPV (%) | New Case PPV (%)
Group Health | 430/1,188                | 50/50                                   | 92               | 98
Marshfield   | 509/1,193                | 50/50                                   | 88               | 91
Mayo Clinic  | 250/2,145                | 100/100                                 | 76               | 97
Northwestern | 103/516                  | 50/50                                   | 88               | 98
Vanderbilt   | 184/1,344                | 50/50                                   | 90               | 98
All sites    | 1,421/6,362              | —                                       | 87               | 96
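For readers unfamiliar with the metric, PPV from a chart review is the fraction of algorithm-identified cases that reviewers confirm: PPV = TP / (TP + FP). The sketch below shows the arithmetic; the counts are hypothetical and chosen only to illustrate a 50-chart sample, since the slide reports percentages rather than raw confirmation counts.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP), where both counts come from manual chart review."""
    reviewed = true_positives + false_positives
    if reviewed == 0:
        raise ValueError("no algorithm-identified cases were reviewed")
    return true_positives / reviewed


# Hypothetical example: 50 algorithm-identified cases reviewed, 49 confirmed.
ppv = positive_predictive_value(true_positives=49, false_positives=1)
print(f"{ppv:.0%}")  # 98%
```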

12 Data Categories used to define the EHR-driven Phenotyping Algorithms [eMERGE Network]

Phenotype | Clinical gold standard | EHR-derived phenotype | Phenotype definitions | Validation (PPV/NPV)
Alzheimer’s Dementia | Demographics, clinical examination of mental status, histopathologic examination | Diagnoses, medications | Demographics, laboratory tests, radiology reports | 73%
Cataracts | Clinical exam finding (ophthalmologic examination) | Diagnoses, procedure codes | Demographics, medications | 98%/98%
Peripheral Arterial Disease | Radiology test results (ankle-brachial index or arteriography) | Diagnoses, procedure codes, medications, radiology test results | Demographics | 94%/99%
Type 2 Diabetes | Laboratory tests | Diagnoses, laboratory tests, medications | Demographics, height, weight, family history | 98%/100%
Cardiac Conduction | ECG measurements | ECG report results | Demographics, diagnoses, procedure codes, medications, laboratory tests | 97%

13 SHARPn High-Throughput Phenotyping Genotype-Phenotype Association Results [Ritchie et al. 2010]
[Forest plot: observed vs. published odds ratios (axis 0.5 to 5.0), listed by disease, gene/region, and marker]
Diseases: Atrial fibrillation, Crohn's disease, Multiple sclerosis, Rheumatoid arthritis, Type 2 diabetes
Markers (gene/region): rs2200733 (Chr. 4q25), rs10033464 (Chr. 4q25), rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22), rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA), rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22), rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2)

14 SHARPn High-Throughput Phenotyping Key lessons learned from eMERGE
- Algorithm design and transportability
  - Non-trivial; requires significant expert involvement
  - Highly iterative process
  - Time-consuming manual chart reviews
  - Representation of “phenotype logic” for transportability is critical
- Standardized data access and representation
  - Importance of unified vocabularies, data elements, and value sets
  - Questionable reliability of ICD & CPT codes (e.g., billing the wrong code since it is easier to find)
  - Natural Language Processing (NLP) is critical

15 SHARPn High-Throughput Phenotyping Algorithm Development Process - Modified [Diagram labels: Data; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution] [eMERGE Network]

16 SHARPn High-Throughput Phenotyping Algorithm Development Process - Modified [Diagram labels: Data; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]
- Standardized representation of clinical data: create new and re-use existing clinical element models (CEMs)
- Standardized and structured representation of phenotype definition criteria: use the NQF Quality Data Model (QDM)
- Conversion of structured phenotype criteria into executable queries: use JBoss® Drools (DRLs)
[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

17 SHARPn High-Throughput Phenotyping The SHARPn “phenotyping funnel” [Diagram labels: Intermountain EHR, Mayo Clinic EHR; CEMs; QDMs; DRLs; phenotype-specific patient cohorts] [Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

18 SHARPn High-Throughput Phenotyping Clinical Element Models: Higher-Order Structured Representations [Stan Huff, IHC]

19 SHARPn High-Throughput Phenotyping Pre- and Post-Coordination [Stan Huff, IHC]

20 SHARPn High-Throughput Phenotyping [Stan Huff, IHC] CEMs available for patient demographics, medications, lab measurements, procedures etc.

21 SHARPn data normalization flow - I: CEM MySQL database with normalized patient information [Welch et al. 2012]

22 SHARPn High-Throughput Phenotyping SHARPn data normalization flow - II: CEM MySQL database with normalized patient information

23 SHARPn High-Throughput Phenotyping Algorithm Development Process - Modified [Diagram labels: Data; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]
- Standardized representation of clinical data: create new and re-use existing clinical element models (CEMs)
- Standardized and structured representation of phenotype definition criteria: use the NQF Quality Data Model (QDM)
[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

24 SHARPn High-Throughput Phenotyping Our task: human readable → machine computable [Thompson et al., submitted 2012]

25 SHARPn High-Throughput Phenotyping NQF Quality Data Model (QDM)
- Standard of the National Quality Forum (NQF)
- A structure and grammar to represent quality measures in a standardized format
- Groups of codes in a code set (ICD-9, etc.)
  - "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
- Supports temporality & sequences
  - AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
- Implemented as a set of XML schemas
- Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm, etc.)
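To make the two QDM ideas above concrete (value-set membership and a temporal relationship to the measurement period), here is a small Python sketch. It is a simplified reading for illustration, not the QDM specification: the `ClinicalEvent` type, the helper names, and the placeholder codes are all assumptions; only the value-set OID is copied from the slide, and "starts before or during" is approximated as "starts on or before the period end".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClinicalEvent:
    # Hypothetical event record (assumption): QDM data type, code, start date.
    data_type: str
    code: str
    start: date


# The OID comes from the slide; the member codes are placeholders,
# not the real contents of that value set.
VALUE_SETS = {
    "2.16.840.1.113883.3.464.0001.113": {"PLACEHOLDER-1", "PLACEHOLDER-2"},
}


def in_value_set(event: ClinicalEvent, oid: str) -> bool:
    """True if the event's code belongs to the value set identified by oid."""
    return event.code in VALUE_SETS.get(oid, set())


def starts_before_or_during(event: ClinicalEvent, period_end: date) -> bool:
    """Simplified temporal check: the event starts on or before the period end."""
    return event.start <= period_end


def matches_criterion(event, data_type, oid, period_end) -> bool:
    """Combine data type, value-set membership, and the temporal check."""
    return (
        event.data_type == data_type
        and in_value_set(event, oid)
        and starts_before_or_during(event, period_end)
    )


measurement_end = date(2012, 12, 31)
event = ClinicalEvent("Diagnosis, Active", "PLACEHOLDER-1", date(2012, 3, 1))
print(matches_criterion(
    event, "Diagnosis, Active", "2.16.840.1.113883.3.464.0001.113", measurement_end
))  # True
```

Separating the value-set lookup from the temporal operator mirrors how a criterion pairs a coded concept with a timing relationship.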

26 SHARPn High-Throughput Phenotyping 116 Meaningful Use Phase I Quality Measures

27 SHARPn High-Throughput Phenotyping Example: Diabetes & Lipid Mgmt. - I: Human-readable HTML

28 SHARPn High-Throughput Phenotyping Example: Diabetes & Lipid Mgmt. - II: Computable XML

29 SHARPn High-Throughput Phenotyping NQF Measure Authoring Tool (MAT)

30 SHARPn High-Throughput Phenotyping Algorithm Development Process - Modified [Diagram labels: Data; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]
- Standardized representation of clinical data: create new and re-use existing clinical element models (CEMs)
- Standardized and structured representation of phenotype definition criteria: use the NQF Quality Data Model (QDM)
- Conversion of structured phenotype criteria into executable queries: use JBoss® Drools (DRLs)
[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

31 SHARPn High-Throughput Phenotyping JBoss® open-source Drools rule-based management system (RBMS)
- Represents knowledge with declarative production rules
- Origins in artificial intelligence expert systems
- Simple when/then rules specified in text files
- Separation of data and logic into separate components
- Forward-chaining inference model (Rete algorithm)
- Domain-specific languages (DSL)

32 SHARPn High-Throughput Phenotyping Example Drools rule

    rule "Glucose <= 40, Insulin On"
    when
        $msg : GlucoseMsg(glucoseFinding … 0)
    then
        glucoseProtocolResult.setInstruction(GlucoseInstructions.GLUCOSE_LESS_THAN_40_INSULIN_ON_MSG);
    end

Annotations: {Rule Name}, {binding}, {Java Class}, {Class Getter Method}, Parameter, {Java Class}, {Class Setter Method}
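The when/then structure and forward chaining can be illustrated without Drools itself. The Python sketch below is a deliberately naive loop that re-evaluates every rule until no new facts appear; it is not the Rete algorithm and not the Drools API. The `Rule` type, the `forward_chain` function, and the fact names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Rule:
    # Hypothetical production rule: a condition over known facts ("when")
    # and the facts asserted when the condition holds ("then").
    name: str
    when: Callable[[Set[str]], bool]
    then: FrozenSet[str]


def forward_chain(initial_facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Fire rules repeatedly until a full pass adds no new facts (a fixed point)."""
    facts = set(initial_facts)
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.when(facts):
                new_facts = rule.then - facts
                if new_facts:
                    facts |= new_facts
                    changed = True
    return facts


# Illustrative rules; fact names are placeholders, not a validated algorithm.
rules = [
    Rule(
        "candidate",
        lambda f: "has_diagnosis_code" in f and "abnormal_lab" in f,
        frozenset({"candidate_case"}),
    ),
    Rule(
        "confirmed",
        lambda f: "candidate_case" in f and "on_replacement_med" in f,
        frozenset({"case"}),
    ),
]

result = forward_chain({"has_diagnosis_code", "abnormal_lab", "on_replacement_med"}, rules)
print("case" in result)  # True
```

The second rule fires only because the first one asserted `candidate_case`, which is the chaining behaviour the slide refers to; a production engine avoids re-testing every rule on every pass.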

33 SHARPn High-Throughput Phenotyping Automatic translation from NQF QDM criteria to Drools: from non-executable to executable [Li et al., submitted 2012]
Diagram components:
- Measure Authoring Toolkit
- Data types: XML-based structured representation
- Value sets: saved in XLS files
- Measures: XML-based structured representation
- Mapping data types and value sets
- Fact models
- Converting measures to Drools scripts
- Drools scripts
- Drools Engine

34 SHARPn High-Throughput Phenotyping Automatic translation from NQF QDM criteria to Drools [Li et al., submitted 2012]

35 The “executable” Drools flow

36 Phenotype library and workbench - I http://phenotypeportal.org
1. Converts QDM to Drools
2. Executes rules by querying the CEM database
3. Generates summary reports

37 Phenotype library and workbench - II http://phenotypeportal.org

38 Phenotype library and workbench - III http://phenotypeportal.org

39 SHARPn High-Throughput Phenotyping Phenotype library and workbench - IV

40 SHARPn High-Throughput Phenotyping

41 Additional on-going research efforts - I [Carroll et al. 2011]
Machine learning and association rule mining
- Manual creation of algorithms takes time
- Let computers do the “hard work”
- Validate against expert-developed ones

42 SHARPn High-Throughput Phenotyping Additional on-going research efforts - I [Simon et al. 2012]
- Origins in sales data
- Items (columns): co-morbid conditions
- Transactions (rows): patients
- Itemsets: sets of co-morbid conditions
- Goal: find all itemsets (sets of conditions) that frequently co-occur in patients; one of those conditions should be DM
- Support: number of transactions in which the itemset I appears, e.g., Support({TB, DLM, ND}) = 3
- Frequent: an itemset I is frequent if support(I) > minsup
[Table: patients 001–005 (rows) by conditions TB, DLM, ND, …, IEC (columns), with Y marking a condition present; Y marks per patient: 001: 4, 002: 4, 003: 2, 004: 1, 005: 3; X: infrequent]
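The support and frequent-itemset definitions above can be sketched in Python as follows. The patient-to-condition assignment is hypothetical: the column alignment of the slide's table did not survive extraction, so the data below is merely one assignment consistent with Support({TB, DLM, ND}) = 3. The brute-force enumeration is for illustration only and is not the method used by the authors.

```python
from itertools import combinations

# Hypothetical transactions (assumption): patient id -> co-morbid conditions.
transactions = {
    "001": {"TB", "DLM", "ND", "IEC"},
    "002": {"TB", "DLM", "ND", "IEC"},
    "003": {"TB", "DLM"},
    "004": {"TB"},
    "005": {"TB", "DLM", "ND"},
}


def support(itemset, transactions):
    """Number of transactions (patients) that contain every item in the itemset."""
    return sum(1 for items in transactions.values() if set(itemset) <= items)


def frequent_itemsets(transactions, minsup):
    """Brute-force search for itemsets with support strictly greater than minsup."""
    all_items = sorted(set().union(*transactions.values()))
    frequent = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            count = support(combo, transactions)
            if count > minsup:
                frequent[frozenset(combo)] = count
    return frequent


print(support({"TB", "DLM", "ND"}, transactions))  # 3
for itemset, count in frequent_itemsets(transactions, minsup=2).items():
    print(sorted(itemset), count)
```

Enumerating every subset grows exponentially with the number of conditions; practical miners such as Apriori prune candidates using the fact that every subset of a frequent itemset is itself frequent.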

43 SHARPn High-Throughput Phenotyping Additional on-going research efforts - II

44 SHARPn High-Throughput Phenotyping Additional on-going research efforts - II: TRALI/TACO sniffer

45 SHARPn High-Throughput Phenotyping

46 SHARPn High-Throughput Phenotyping Active Surveillance for TRALI and TACO
- Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) were reported to the blood bank by the clinical service.
- Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.

47 Additional on-going research efforts - III [Pathak et al. submitted 2012]
Phenome-wide association scan (PheWAS)
- Do a “reverse GWAS” using EHR data
- Facilitate hypothesis generation

48 SHARPn High-Throughput Phenotyping Publications to date (conservative)

49 SHARPn High-Throughput Phenotyping Mayo projects and collaborations
Ongoing:
- Transfusion-related acute lung injury (Kor)
- Drug-induced liver injury (Talwalkar)
- Drug-induced thrombocytopenia and neutropenia (Al-Kali)
- Active surveillance for celiac disease (Murray)
- Warfarin dose response & heart valve replacements (Pereira)
- Phenotype definition standardization (HCPR/Quality)
Getting started/planning:
- Pharmacogenomics of systolic heart failure (Bielinski/Pereira)
- Pharmacogenomics of SSRI (Mrazek/Weinshilboum)
- Lumbar image reporting with epidemiology (Kallmes)
- Active clinical trial alerting (CTMS/Cancer Center)

50 SHARPn High-Throughput Phenotyping HTP-related presentations
June 11th, 2012:
- Using EHRs for clinical research (Vitaly Herasevich)
- Association rule mining and T2D risk prediction (Gyorgy Simon)
- Scenario-based requirements engineering for developing EHR add-ons to support CER in patient care settings (Junfeng Gao)
June 12th, 2012:
- Exploring patient data in context clinical research studies: Research Data Explorer (Adam Wilcox et al.)
- Utilizing previous result sets as criteria for new queries with FURTHeR (Dustin Schultz et al.)
- Semantic search engine for clinical trials (Yugyung Lee)
- Knowledge-driven workbench for predictive modeling (Peter Haug et al.)
- Clinical analytics driven care coordination for 30-day readmission – Demonstration from 360 Fresh.com (Ramesh Sairamesh)

51 SHARPn High-Throughput Phenotyping Thank You! Pathak.Jyotishman@mayo.edu

