Creating a Computable Phenotype for Pregnancy for Clinical Research
Methods for Developing Clinical Phenotypes, S114
Bret J. Gardner, University of Nebraska Medical Center
Disclosure
My wife and I have no relevant relationships with commercial interests to disclose.
AMIA 2017 | amia.org
Learning Objectives
After participating in this session the learner should be better able to:
Apply a model to local clinical research data warehouses to predict the pregnancy status of patients being screened for clinical research.
Recognize the need for a prediction model for pregnancy status for clinical research.
Presentation Overview
Define “computable phenotype”
Importance of developing a computable phenotype for pregnancy status
Methods:
  Variable identification and transformation
  Univariate analysis
Planned Analysis:
  Temporal sampling
  Logistic regression
  Model validation
Conclusions and Future Directions
Computable Phenotype – Definition
Phenotype – “measurable biological […], behavioral […], or cognitive markers that are found more often in individuals with a […] condition than in the general population.”
Computable Phenotype – “a clinical condition or characteristic that can be ascertained via a computerized query to an EHR system or clinical data repository using a defined set of data elements and logical expressions.”
Richesson R, Smerek M. Electronic health records-based phenotyping. Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials 2015.
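The quoted definition (a defined set of data elements combined by logical expressions) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the data-element labels, the `Phenotype` class, and the toy rule are hypothetical and are not the variables or logic used in this study. A real phenotype would reference coded concepts chosen and validated by its authors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# Hypothetical data-element labels (assumptions, not the study's variables).
POSITIVE_PREGNANCY_TEST = "positive_pregnancy_test"
OBSTETRIC_ULTRASOUND = "obstetric_ultrasound"
DELIVERY_PROCEDURE = "delivery_procedure"


@dataclass(frozen=True)
class Phenotype:
    """A named rule: data elements combined by a logical expression."""
    name: str
    rule: Callable[[FrozenSet[str]], bool]

    def matches(self, patient_facts: FrozenSet[str]) -> bool:
        return self.rule(patient_facts)


# Illustrative logic only: some evidence of pregnancy and no recorded delivery.
toy_pregnancy = Phenotype(
    name="toy_current_pregnancy",
    rule=lambda facts: (
        (POSITIVE_PREGNANCY_TEST in facts or OBSTETRIC_ULTRASOUND in facts)
        and DELIVERY_PROCEDURE not in facts
    ),
)


def select_population(patients: Dict[str, FrozenSet[str]],
                      phenotype: Phenotype) -> List[str]:
    """Return the sorted IDs of patients whose recorded facts satisfy the rule."""
    return sorted(pid for pid, facts in patients.items()
                  if phenotype.matches(facts))
```

Because the rule is plain data plus logic, the same definition could in principle be shared and re-run against another site's repository, which is the portability argument made on the next slide.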
Computable Phenotype – Benefits
Identify populations of interest for observational and interventional research.
Increasing number of distributed research networks using federated queries for population identification
Standardized computable phenotypes may transcend institutional boundaries, ensuring reliability and reproducibility for intra- or inter-network pragmatic trials.
No consistent approach to identifying pregnancy status exists
Computable Phenotype for Pregnancy
Critical to know a patient’s pregnancy status in the clinic and for clinical research
Historically, accurately assessing pregnancy status from the electronic health record (EHR) has been challenging [1]
In an analysis of computerized physician order entry (CPOE) and decision support systems, Metzger et al. noted that 85% of adverse drug events related to pregnancy were not detected [2]
1. Kuperman GJ, Bobb A, Payne TH, Avery AJ, Gandhi TK, Burns G, et al. Medication-related clinical decision support in computerized provider order entry systems: a review. J Am Med Inform Assoc 2007 Jan-Feb;14(1):29-40.
2. Metzger J, Welebob E, Bates DW, Lipsitz S, Classen DC. Mixed results in the safety performance of computerized physician order entry. Health Aff (Millwood) 2010 Apr;29(4):655-663.
Computable Phenotype for Pregnancy
Many laboratory, ultrasonographic, and physical methods exist to assess pregnancy
Varying levels of sensitivity, specificity, and reliability
Some of these data may be queried in the EHR
Are you pregnant now?
Challenges
Pregnancy is a temporally bounded condition of unpredictable duration
Pregnancy may occur zero, one, or many times for a given patient
No standard data recording practice or standard data elements being employed
Hypothesis
Data sufficient to reliably and reproducibly identify a patient's current pregnancy status exist and are accessible in the EHR.
A valid logistic discriminant model for pregnancy status can be created by identifying a validated population who have experienced pregnancy and an age-matched control group, then assessing the frequency and predictive power of available facts from the EHR.
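As a rough illustration of what a logistic model does with EHR facts, the sketch below turns binary indicators into a probability using the standard logistic function. The indicator names, coefficients, and intercept are hypothetical placeholders chosen for the example; no fitted values are reported in these slides.

```python
import math
from typing import Dict


def pregnancy_probability(indicators: Dict[str, int],
                          coefficients: Dict[str, float],
                          intercept: float) -> float:
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef_i * x_i)))).

    Indicators missing from the input are treated as 0 (fact not recorded).
    """
    linear = intercept + sum(
        beta * float(indicators.get(name, 0))
        for name, beta in coefficients.items()
    )
    # Numerically stable evaluation of the logistic function.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)


# Hypothetical coefficients for illustration only (not fitted to any data).
example_coefficients = {
    "positive_pregnancy_test": 2.0,
    "obstetric_ultrasound": 1.5,
}
example_intercept = -2.0

p = pregnancy_probability(
    {"positive_pregnancy_test": 1}, example_coefficients, example_intercept
)
```

A fitted model would replace the placeholder coefficients with values estimated from the pregnant cohort and the control group.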
Variable Identification
Pregnant cohort and non-pregnant control group identified
Clinical data queried using i2b2*
Frequency analysis identified candidate variables for consideration
Clinician review pared list to those with specificity and clinical relevance
*Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc 2010 Mar-Apr;17(2):124-130.
Cohort Identification and Characterization
Cohort Characterization
               Pregnant (N = 3,422)    Control (N = 55,361)
Min. Age       14                      15
Max. Age       47                      50
Median ± SD    29 ± 5.7                33 ± 9.6
Univariate Analysis Results
Significant difference observed for all variables between the pregnant cohort and the control population
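The slides do not name the statistical test used for the univariate comparison. As one possible approach, and purely as an assumption, a binary variable's frequency in the cohort and control groups could be compared with a Pearson chi-square test on a 2x2 table. The function below is a minimal sketch using only the standard library; the counts in the usage line are invented for illustration and are not study results.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table

        [[a, b],    a = cohort with variable,  b = cohort without
         [c, d]]    c = control with variable, d = control without

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented counts, for illustration only.
statistic, p_value = chi_square_2x2(30, 70, 10, 90)
```

With many candidate variables tested this way, some correction for multiple comparisons would normally be considered as well.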
Univariate Analysis – Continued
Pregnancy Tests
Conclusions and Future Directions
We have identified a set of discriminant variables, readily available within an EHR, which may be employed to compute probability of concurrent pregnancy status
Planned Analysis:
  Temporal sampling of data from cohort and control populations
  Use multiple logistic regression techniques to identify the most parsimonious and valid model for predicting pregnancy status
  Test and deploy predictive model across the Greater Plains Collaborative (GPC) research network
Planned Analysis and Future Directions
Temporal Sampling
  Monte Carlo methods for sampling throughout the calendar year
  Create and compare multiple logistic regression models from these samples
Variable Definitions
  Temporally define variables (e.g., Did this patient have a positive pregnancy test recorded in the last 8 months? Did this patient have an obstetric ultrasound within the past 7 months?)
  Interested in current pregnancy state, not historic episodes.
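The two ideas on this slide, random index dates across the calendar year and variables defined by a lookback window, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and event labels are hypothetical, a month is approximated as 30 days, and uniform random sampling stands in for whatever Monte Carlo scheme the planned analysis adopts.

```python
import random
from datetime import date, timedelta
from typing import Dict, Iterable, List, Optional


def had_event_within(event_dates: Iterable[date], index_date: date,
                     lookback_days: int) -> bool:
    """True if any event falls in (index_date - lookback_days, index_date]."""
    window_start = index_date - timedelta(days=lookback_days)
    return any(window_start < d <= index_date for d in event_dates)


def sample_index_dates(year: int, n_samples: int,
                       seed: Optional[int] = None) -> List[date]:
    """Draw index dates uniformly at random from one calendar year."""
    rng = random.Random(seed)
    first = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - first).days
    return [first + timedelta(days=rng.randrange(days_in_year))
            for _ in range(n_samples)]


def features_at(patient_events: Dict[str, List[date]],
                index_date: date) -> Dict[str, int]:
    """Temporally defined indicators as of index_date (30-day months assumed)."""
    return {
        "positive_pregnancy_test_8mo": int(had_event_within(
            patient_events.get("positive_pregnancy_test", []),
            index_date, 8 * 30)),
        "obstetric_ultrasound_7mo": int(had_event_within(
            patient_events.get("obstetric_ultrasound", []),
            index_date, 7 * 30)),
    }
```

Evaluating `features_at` at each sampled index date gives one row of predictors per patient per sample, and events recorded after the index date are ignored, so the indicators reflect current rather than historic pregnancy state.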
Future Directions
Survey GPC sites for presence of validated variables in their data warehouse
Clinical workflow and recording practices likely vary
Test logistic regression model across this distributed research network to validate extensibility.
Acknowledgements
James R. Campbell, MD
Teresa G. Berg, MD
Jay G. Pedersen, MA
James C. McClay, MD, MS
Ashok Mudgapallia and the UNMC Research IT Office (RITO)
Questions
When considering developing a shareable computable phenotype for pregnancy status, which of the following best summarizes some of the major challenges?
There is a dearth of recorded data elements relative to a pregnancy episode in the EHR.
Pregnancy episodes are transient, a large variety of clinical findings may indicate a pregnancy, and these data elements are not consistently recorded across EHRs.
There are no major challenges and a variety of computable phenotypes have been effectively employed in the past to identify a pregnant population.
Standardized coding systems, such as ICD, SNOMED, LOINC, and CPT, are ill-suited for clearly defining pregnancy status and must be revised prior to an effective computable phenotype being constructed and distributed.
Answer
There is a dearth of recorded data elements relative to a pregnancy episode in the EHR.
Pregnancy episodes are transient, a large variety of clinical findings may indicate a pregnancy, and these data elements are not consistently recorded across EHRs. (correct answer)
There are no major challenges and a variety of computable phenotypes have been effectively employed in the past to identify a pregnant population.
Standardized coding systems, such as ICD, SNOMED, LOINC, and CPT, are ill-suited for clearly defining pregnancy status and must be revised prior to an effective computable phenotype being constructed and distributed.
Explanation: There is an abundance of coded material recorded from a variety of clinical encounters regarding pregnancy episodes and pregnancy status. ICD, CPT, LOINC, and SNOMED data elements abound with details on supervision of pregnancy, pregnancy tests, and procedures related to delivery. In the past, ad hoc approaches have been taken to identifying a pregnant population with very limited success. Answering the question of current rather than only historic pregnancy status is a major obstacle, especially when coupled with the variety of information that can be recorded in a variety of locations within EHRs.
Reference: Kuperman GJ, Bobb A, Payne TH, Avery AJ, Gandhi TK, Burns G, et al. Medication-related clinical decision support in computerized provider order entry systems: a review. J Am Med Inform Assoc 2007 Jan-Feb;14(1):29-40.
Questions
Which of the following best describes a computable phenotype for clinical research?
A set of clinical findings or characteristics attainable through the EHR that may be processed by a computer to define a population of interest.
The set of observable characteristics defined by genotype and influenced by environment for an organism or individual.
The diagnosis reached by an inference engine based on clinician input and logic.
Clinical decision support based on clinical findings from the EHR to increase patient safety.
Answer
A set of clinical findings or characteristics attainable through the EHR that may be processed by a computer to define a population of interest. (correct answer)
The set of observable characteristics defined by genotype and influenced by environment for an organism or individual.
The diagnosis reached by an inference engine based on clinician input and logic.
Clinical decision support based on clinical findings from the EHR to increase patient safety.
Explanation: A computable phenotype is a computer-readable definition for a set of characteristics or conditions that define a population of interest. This may be a specific disease, pregnancy status, or other set of criteria. It should be able to be processed by a computer and shared with other researchers to define the same population. Clinical decision support is a logical application of a well-made computable phenotype. The set of observable characteristics defined by genotype and influenced by environment is the definition of phenotypes in general. These are not necessarily able to be processed by a computer or relevant to clinical research. Diagnostic engines are another application of technology for patient safety; however, a computable phenotype defines a population rather than diagnosing an individual.
Reference: Richesson R, Smerek M. Electronic health records-based phenotyping. Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials 2015.
AMIA is the professional home for more than 5,400 informatics professionals, representing frontline clinicians, researchers, public health experts and educators who bring meaning to data, manage information and generate new knowledge across the research and healthcare enterprise.
Thank you!
Email me at: bret.gardner@unmc.edu