Presentation is loading. Please wait.

Presentation is loading. Please wait.

Observational Health Data Sciences and Informatics (OHDSI): An International Network for Open Science and Data Analytics in Healthcare Patrick Ryan, PhD.

Similar presentations


Presentation on theme: "Observational Health Data Sciences and Informatics (OHDSI): An International Network for Open Science and Data Analytics in Healthcare Patrick Ryan, PhD."— Presentation transcript:

1 Observational Health Data Sciences and Informatics (OHDSI): An International Network for Open Science and Data Analytics in Healthcare Patrick Ryan, PhD Janssen Research and Development Columbia University Medical Center 30 May 2016

2 Odyssey (noun): \oh-d-si\
A long journey full of adventures A series of experiences that give knowledge or understanding to someone

3 A journey to OHDSI

4 What is the quality of the current evidence from observational analyses?
August2010: “Among patients in the UK General Practice Research Database, the use of oral bisphosphonates was not significantly associated with incident esophageal or gastric cancer” Sept2010: “In this large nested case-control study within a UK cohort [General Practice Research Database], we found a significantly increased risk of oesophageal cancer in people with previous prescriptions for oral bisphosphonates”

5 What is the quality of the current evidence from observational analyses?
April2012: “Patients taking oral fluoroquinolones were at a higher risk of developing a retinal detachment” Dec2013: “Oral fluoroquinolone use was not associated with increased risk of retinal detachment”

6 What is the quality of the current evidence from observational analyses?
BJCP May 2012: “In this study population, pioglitazone does not appear to be significantly associated with an increased risk of bladder cancer in patients with type 2 diabetes.” BMJ May 2012: “The use of pioglitazone is associated with an increased risk of incident bladder cancer among people with type 2 diabetes.”

7 What is the quality of the current evidence from observational analyses?
Nov2012: FDA released risk communication about the bleeding risk of dabigatran, based on unadjusted cohort analysis performed within Mini-Sentinel Dec2013: “This analysis shows that the RCTs and Mini-Sentinel Program show completely opposite results” Aug2013: “However, the absence of any adjustment for possible confounding and the paucity of actual data made the analysis unsuitable for informing the care of patients”

8 Lessons from the OMOP experiments
Database heterogeneity: Holding analysis constant, different data may yield different estimates Parameter sensitivity: Holding data constant, different analytic design choices may yield different estimates Empirical performance: Most observational methods do not have nominal statistical operating characteristics Empirical calibration can help restore interpretation of study findings Madigan D, Ryan PB, Schuemie MJ et al, American Journal of Epidemiology, 2013 “Evaluating the Impact of Database Heterogeneity on Observational Study Results” Madigan D, Ryan PB, Scheumie MJ, Therapeutic Advances in Drug Safety, 2013: “Does design matter? Systematic evaluation of the impact of analytical choices on effect estimates in observational studies” Ryan PB, Stang PE, Overhage JM et al, Drug Safety, 2013: “A Comparison of the Empirical Performance of Methods for a Risk Identification System” Schuemie MJ, Ryan PB, DuMouchel W, et al, Statistics in Medicine, 2013: “Interpreting observational studies: why empirical calibration is needed to correct p-values”

9 Lesson 5: Reliable evidence generation isn’t (just) a data/analysis/technology problem
Understanding the problems requires input and perspective from multiple stakeholders: government, industry, academia, health systems Research and development of novel solutions require multi-disciplinary approach: informatics, epidemiology, statistics, clinical sciences Adoption and application requires active participation and buy-in from all interested parties (both evidence producers and evidence consumers) Major outstanding need: to establish a community of individuals based on shared attitudes, interests and goals where everyone has equal opportunity to participate and contribute

10 What’s the core problem?
We have lots of DATA we’d like to learn from… ….and very little EVIDENCE we can actually trust

11 Why large-scale analysis is needed in healthcare
All health outcomes of interest All drugs

12 Introducing OHDSI The Observational Health Data Sciences and Informatics (OHDSI) program is a multi-stakeholder, interdisciplinary collaborative to create open-source solutions that bring out the value of observational health data through large-scale analytics OHDSI has established an international network of researchers and observational health databases with a central coordinating center housed at Columbia University

13 OHDSI’s mission To improve health, by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care.

14 What evidence does OHDSI seek to generate from observational data?
Clinical characterization Natural history: Who are the patients who have diabetes? Among those patients, who takes metformin? Quality improvement: what proportion of patients with diabetes experience disease-related complications? Population-level estimation Safety surveillance: Does metformin cause lactic acidosis? Comparative effectiveness: Does metformin cause lactic acidosis more than glyburide? Patient-level prediction Precision medicine: Given everything you know about me and my medical history, if I start taking metformin, what is the chance that I am going to have lactic acidosis in the next year? Disease interception: Given everything you know about me, what is the chance I will develop diabetes?

15 What is OHDSI’s strategy to deliver reliable evidence?
Methodological research Develop new approaches to observational data analysis Evaluate the performance of new and existing methods Establish empirically-based scientific best practices Open-source analytics development Design tools for data transformation and standardization Implement statistical methods for large-scale analytics Build interactive visualization for evidence exploration Clinical evidence generation Identify clinically-relevant questions that require real-world evidence Execute research studies by applying scientific best practices through open-source tools across the OHDSI international data network Promote open-science strategies for transparent study design and evidence dissemination

16 OHDSI ongoing collaborative activities
Methodological research Open-source analytics development Clinical applications Observational data management Data quality assessment Common Data Model evaluation ATHENA for standardized vocabularies WhiteRabbit for CDM ETL Usagi for vocabulary mapping HERMES for vocabulary exploration ACHILLES for database profiling Clinical characterization Phenotype evaluation CIRCE for cohort definition CALYPSO for feasibility assessment HERACLES for cohort characterization Chronic disease therapy pathways Population-level estimation Empirical calibration LAERTES for evidence synthesis CohortMethod SelfControlledCaseSeries SelfControlledCohort TemporalPatternDiscovery HOMER for causality assessment Patient-level prediction Evaluation framework and benchmarking PatientLevelPrediction APHRODITE for predictive phenotyping PENELOPE for patient-centered product labeling

17 OHDSI in action #1: Establishing open community standards for observational data management

18 The journey of the OMOP Common data model
OMOP CDMv2 OMOP CDM now Version 5, following multiple iterations of implementation, testing, modifications, and expansion based on the experiences of the community who bring on a growing landscape of research use cases. OMOP CDMv4 OMOP CDMv5 Page 18

19 One model, multiple use cases
Drug safety surveillance Comparative effectiveness Device safety surveillance Health economics Vaccine safety surveillance Quality of care Clinical research Standardized clinical data Person Standardized health system data Standardized meta-data Observation_period Location Care_site CDM_source Specimen Provider Death Concept Payer_plan_period Standardized health economics Standardized vocabularies Vocabulary Visit_occurrence Domain Visit_cost Procedure_occurrence Concept_class Procedure_cost Concept_relationship Drug_exposure Drug_cost Relationship Device_exposure Concept_synonym Device_cost Concept_ancestor Condition_occurrence Standardized derived elements Cohort Source_to_concept_map Measurement Cohort_attribute Drug_strength Note Condition_era Cohort_definition Observation Drug_era Dose_era Attribute_definition Fact_relationship

20 OHDSI community in action
OHDSI Collaborators: >140 researchers in academia, industry, government, health systems >20 countries Multi-disciplinary expertise: epidemiology, statistics, medical informatics, computer science, machine learning, clinical sciences Databases converted to OMOP CDM within OHDSI Community: >50 databases >660 million patients

21 OHDSI in action #2: Clinical characterization of treatment pathways

22 ADA T2DM Guidelines, 2015

23 What treatments do patients in Canada actually use for their diabetes, hypertension, or depression?
OHDSI asked this question around the rest of the world…

24 Network process Join the collaborative
Propose a study to the open collaborative Write protocol Code it, run it locally, debug it (minimize others’ work) Publish it: Each node voluntarily executes on their CDM Centrally share results Collaboratively explore results and jointly publish findings

25 OHDSI in action: Chronic disease treatment pathways
Conceived at AMIA Protocol written, code written and tested at 2 sites Analysis submitted to OHDSI network Results submitted for 7 databases 15Nov Nov2014 2Dec2014 5Dec2014

26 OHDSI participating data partners
Code Name Description Size (M) AUSOM Ajou University School of Medicine South Korea; inpatient hospital EHR 2 CCAE MarketScan Commercial Claims and Encounters US private-payer claims 119 CPRD UK Clinical Practice Research Datalink UK; EHR from general practice 11 CUMC Columbia University Medical Center US; inpatient EHR 4 GE GE Centricity US; outpatient EHR 33 INPC Regenstrief Institute, Indiana Network for Patient Care US; integrated health exchange 15 JMDC Japan Medical Data Center Japan; private-payer claims 3 MDCD MarketScan Medicaid Multi-State US; public-payer claims 17 MDCR MarketScan Medicare Supplemental and Coordination of Benefits US; private and public-payer claims 9 OPTUM Optum ClinFormatics US; private-payer claims 40 STRIDE Stanford Translational Research Integrated Database Environment US; inpatient EHR HKU Hong Kong University Hong Kong; EHR 1 Hripcsak et al, PNAS, in press

27 Treatment pathways for diabetes
T2DM : All databases Only drug First drug Second drug Hripcsak et al, PNAS, in press

28 Population-level heterogeneity
Type 2 Diabetes Mellitus CCAE Hypertension CUMC Depression MDCD CPRD INPC GE JMDC MDCR OPTUM Hripcsak et al, PNAS, in press

29 OHDSI in action #3: Population-level estimation through open-source analytics

30 OHDSI’s approach to open science
Generate evidence Data + Analytics + Domain expertise Open source software Enable users to do something Open science is about sharing the journey to evidence generation Open-source software can be part of the journey, but it’s not a final destination Open processes can enhance the journey through improved reproducibility of research and expanded adoption of scientific best practices

31 Standardizing workflows to enable reproducible research
Open science Population-level estimation for comparative effectiveness research: Is <intervention X> better than <intervention Y> in reducing the risk of <condition Z>? Generate evidence Database summary Cohort definition Cohort summary Compare cohorts Exposure-outcome summary Effect estimation & calibration Compare databases Defined inputs: Target exposure Comparator group Outcome Time-at-risk Model specification Consistent outputs: analysis specifications for transparency and reproducibility (protocol + source code) only aggregate summary statistics (no patient-level data) model diagnostics to evaluate accuracy results as evidence to be disseminated static for reporting (e.g. via publication) interactive for exploration (e.g. via app)

32

33 Feasibility enabled in near-real-time through open-source applications

34 Community invitation to participate… that means you!

35 Protocol and source code made freely available to public BEFORE the analysis is completed

36 OHDSI in action #4: Patient-centered evidence dissemination

37 ALT TEXT: Baby in neonatal intensive care unit.

38 ALT TEXT: Product brochure for Synagis® (palivizumab)
ALT TEXT: Product brochure for Synagis® (palivizumab). The brochure lists the indication “used to help prevent serious lung disease caused by respiratory syncytial virus (RSV) in children at high risk for severe lung disease from RSV”. It also lists ‘select safety information’: “common side effects of Synagis include fever and rash. Other possible side effects include skin reactions around the area where the shot was given (like redness, swelling, warmth, or discomfort)”. Doctor X: “This paper says there’s side effects, but I’ve never seen them happen”

39 ALT TEXT: Screenshot of the first page of the FDA-approved product labeling for Synagis® - palivizumab injection. The label lists under the ‘Warnings and Precautions’ section that “Analyphylaxis and anaphylactic shock (including fatal cases), and other severe acute hypersensitivity reactions have been reported.”

40 Pediatric patients across US observational databases
source age group (at database entry) persons avg years of observation CCAE 1. newborn 0 to 27d 3,360,896 2.23 MDCD 1,862,651 1.74 Optum 1,473,940 1.82 2. infant and toddler 28d to 23mo 3,275,604 2.34 1,379,760 1.70 963,770 1.97 3. children 2 to 11 yo 14,904,293 2.46 4,037,836 1.77 4,951,888 2.18 4. adolescants 12 to 18 yo 12,218,224 2.41 2,565,515 1.57 3,805,609

41 Exploring palivizumab exposure and hypersensitivity in observational data
data source age group (at time of exposure) persons exposed follow-up time (yrs) persons with outcome risk (events / 1000 persons); 95%CI CCAE 1. newborn 0 to 27d 1839 4829 0 ( ) MDCD 381 760 0 ( ) OPTUM 2610 5916 1 0.38 ( ) 2. infant and toddler 28d to 23mo 42843 106320 27 0.63 ( ) 19910 41196 17 0.85 ( ) 22365 48632 11 0.49 ( ) 3. children 2 to 11 yo 544 1525 1.84 ( ) 265 706 0 ( ) 204 574 0 ( ) 4. adolescants 12 to 18 yo 38 93 0 ( ) 33 28 0 ( ) 44 0 ( ) Back of the envelope: Assuming CCAE+MDCD+OPTUM represents 10% of US and exposures are evenly distributed across ~1000 NICUs, doctor would have seen ~50 newborns with exposure... even if the true event rate was 1%, there’s >60% chance they’d never see one case

42 Enter PENELOPE Personalized Exploratory Navigation & Evaluation Of
Labels for Product Effects

43 Let’s Take a Look!

44 OHDSI: what does it mean to me?
Methodological research Open-source analytics development Clinical applications Observational data management Where is there reliable data about the health of children? Clinical characterization Who are the children who are exposed to palivizumab? Population-level estimation Does palivizumab cause anaphylaxis in newborns? ALT TEXT: OHDSI focused on four analytical use cases: observational data management (to answer questions like: ‘where is there reliable data about the health of children?’), clinical characterization (descriptive analyses to answer questions like: ‘who are the children who are exposed to furosemide?’ and ‘how often to children exposed to furosemide experience acute renal disorder?’), population-level estimation (analytical studies for drug safety surveillance and comparative effectiveness research that address causal inference questions estimating average treatment effects for questions like: ‘does furosemide cause nephrocalcinosis in newborns?’), and patient-level prediction (modeling individualized risk for questions like ‘will my daughter by the one to develop nephrocalcinosis?’). To address these analytical use cases, OHDSI is collaborating on three fronts: methodological research (to learn and evaluate scientific best practices), open-source analytics development (to embed those best practices learned into standardized, reproducible statistical routines), and clinical applications (to apply those best practices through standardized tools to important clinical questions that patients and providers want the answers to). Patient-level prediction Will my daughter be the one to develop anaphylaxis?

45 Concluding thoughts Observational databases can be a useful tool for generating evidence to important clinical questions in… Clinical characterization Population-level estimation Patient-level prediction …but ensuring that evidence is reliable requires developing scientific best practices, and transparent and reproducible processes to conduct analyses across the research enterprise An open science community allows all stakeholders to contribute to and benefit from a shared solution…anyone can get involved…that mean’s YOU! Every patient, caregiver, parent and child deserves to know what is known (and what remains uncertain) from the real-world experience of others in order to inform their medical decision-making

46 Interested in OHDSI? Questions or comments? ryan@ohdsi.org
Join the journey Interested in OHDSI? Questions or comments?


Download ppt "Observational Health Data Sciences and Informatics (OHDSI): An International Network for Open Science and Data Analytics in Healthcare Patrick Ryan, PhD."

Similar presentations


Ads by Google