Case Study for Clinical Relevancy: Asthma Scott T. Weiss, M.D., M.S. BRIGHAM AND WOMEN’S HOSPITAL HARVARD MEDICAL SCHOOL Professor of Medicine Harvard.

Case Study for Clinical Relevancy: Asthma Scott T. Weiss, M.D., M.S. BRIGHAM AND WOMEN’S HOSPITAL HARVARD MEDICAL SCHOOL Professor of Medicine Harvard Medical School Director, Center for Genomic Medicine Director, Program in Bioinformatics Associate Director, Channing Laboratory Brigham and Women’s Hospital Boston, MA

Outline Context: focus on process and data Overview of Asthma DBP Smoking as an example of the data issues Predicting COPD in those with asthma Predicting asthma exacerbations Genetic prediction of asthma exacerbations current status DNA collection Lessons Learned Conclusions

Context Channing Lab - extensive genetics & pharmacogenetics resources focused on airways diseases Faculty with clinical, epidemiology, genetic, and bioinformatics training and experience multidisciplinary research collaborative track record Good i2b2 driver: from bench to clinic Strong focus and direction for Cores

Broad Goals of Channing Program in Predictive Medicine Genetic variation → clinical practice: Disease risk (asthma diagnosis); Natural history (exacerbations); Individual response to medication (pharmacogenetics) Develop predictive tests (genetic and nongenetic) in Channing populations Validate these tests in Partners asthma cohort (PAC) at least as proof of concept

I2B2 Airways DBP: Overview RPDR Partners Clinical Services Extract data from Airways Disease patients Extract relevant quantitative and coded phenotypes Extract important phenotypes from text: NLP Predict clinical outcomes after adjustment for covariates RPDR: Recruit, validate, genotype Develop statistical models

Before we start Numerous important covariates, e.g. age, tobacco, comorbidities, medications Adjust outcomes for covariates Some (e.g. age, gender, Dx, encounter) readily available, obtained through Core 4 Others require substantial effort, e.g. medications, tobacco use, comorbid conditions Collaboration - NLP experts in Core 1

Phenotypes from text Extract specific data items –Medication –Smoking status –Diagnoses (Co-morbidity) Extract findings to assist with case selection Extract findings to assist with clinical predictions

Smoking Status - Examples
–Non-Smoker: HOSPITAL COURSE: ... It was recommended that she receive … We also added Lactinax, oral form of Lactobacillus acidophilus to attempt a repopulation of her gut. SH: widow, lives alone, 2 children, no tob/alcohol.
–Non-Smoker: SOCIAL HISTORY: Negative for tobacco, alcohol, and IV drug abuse.
–Non-Smoker: SOCIAL HISTORY: The patient is a nonsmoker. No alcohol.
–Past Smoker: BRIEF RESUME OF HOSPITAL COURSE: 63 yo woman with COPD, 50 pack-yr tobacco (quit 3 wks ago), spinal stenosis,...
–Smoker: SOCIAL HISTORY: The patient is married with four grown daughters, uses tobacco, has wine with dinner.
–??? Hard to pick: SOCIAL HISTORY: The patient lives in rehab, married. Unclear smoking history from the admission note…

Smoking - Text Processing (manually classified) No. Cases: 2796; No. Classes: 5; No. Attributes: 50 Cases per class: Current smoker 1010; Past smoker 952; Never smoked 427; Control cases 261; Denies smoking 146

Smoking Status Raw sample ~ 20,000 reports Feature extraction >3000 Feature selection “Gold standard” sample cases ~ 2,800 Correct classification rate % (compared to Gold Standard) Preliminary results

Smoking Status (preliminary results) Table columns: Data Set, Classification Method, Test Cases, No. Features, % Correctly Classified Configurations: one-gram, bi-gram, tri-gram, and stemmed one-gram data sets; Naïve Bayes and SVM classifiers; test cases by 2/3 split or 10x CV Values shown: 79.70 (one-gram), 65.05 (tri-gram), 44.63 (tri-gram), More … Baseline performance Increase, combine features should improve performance
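The classifier family named on this slide can be illustrated with a minimal sketch of a multinomial Naïve Bayes model over word n-grams. This is not the Core 1 implementation: the tokenizer, the add-one smoothing, the class labels, and the six toy training sentences are all assumptions made for illustration, and stemming is left out.

```python
import math
import re
from collections import Counter, defaultdict


def ngrams(text, n=1):
    """Lowercase the text, keep alphabetic tokens, and return word n-grams."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, n=1):
        self.n = n  # 1 = one-gram, 2 = bi-gram, 3 = tri-gram

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text, self.n)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feat in ngrams(text, self.n):
                if feat in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set; labels are illustrative, not the study's classes.
TRAIN = [
    ("The patient is a nonsmoker. No alcohol.", "never"),
    ("Negative for tobacco, alcohol, and IV drug abuse.", "never"),
    ("uses tobacco, has wine with dinner", "current"),
    ("smokes one pack per day", "current"),
    ("50 pack-yr tobacco, quit 3 wks ago", "past"),
    ("former smoker, quit in 1990", "past"),
]
model = NaiveBayes(n=1).fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("quit 3 wks ago"))
```

Passing `n=2` or `n=3` switches the same model to bi-gram or tri-gram features, which is the comparison the slide tabulates.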

Data Mining Pipeline “Raw” Patient Data → Text Processing (word/pattern filters, stemming, lexicon matching, parsing, …) → Data Extraction → “Smart Data” (medications, smoking status, co-morbidity) → Feature Analysis (classification, clustering, statistical analysis, …)

Asthma Preceding COPD Significant overlap of asthma and COPD DX Common denominator = smoking Asthma is known to precede and predict the development of COPD independent of smoking Could we develop a multivariate clinical predictor that would predict which asthmatics would get COPD?

Study Design Source: Partners Healthcare Research Patient Data Repository (RPDR). RPDR: MGH, BWH, etc clinical repository for researchers. Training: 9349 asthmatics (843 COPD, 8506 controls) first encounter Test: A future set of 992 asthmatics (46 COPD, 946 controls) first encounter from

Data Collection Criteria: Patients observed for at least 5 years, at least 18 years old at the first encounter, and race, sex, height, weight, and smoking available. Comorbidities: International Classification of Diseases, 9th Revision (ICD-9) codes as admission diagnosis or ER primary diagnosis (104). COPD: ICD-9 code for “Chronic Bronchitis”, “Emphysema”, or “Chronic Airways Obstruction, not otherwise specified.”
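The inclusion criteria can be written as a simple record filter. The sketch below is illustrative only: the `Patient` record and its field names are hypothetical and do not reflect the RPDR schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical patient record; field names are assumptions."""
    age_at_first_encounter: int
    years_observed: float
    race: Optional[str] = None
    sex: Optional[str] = None
    height_cm: Optional[float] = None
    weight_kg: Optional[float] = None
    smoking: Optional[str] = None


def eligible(p: Patient) -> bool:
    """Observed >= 5 years, adult at first encounter, all covariates present."""
    required = (p.race, p.sex, p.height_cm, p.weight_kg, p.smoking)
    return (
        p.years_observed >= 5
        and p.age_at_first_encounter >= 18
        and all(value is not None for value in required)
    )


complete = Patient(42, 7.5, "white", "F", 165.0, 70.0, "never")
print(eligible(complete))
```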

Analysis Model: A Bayesian network was generated from the training set of 9349 asthmatics (843 COPD, 8506 controls) encountered between 1988 and 1998, from 104 comorbidities and race, gender, age, smoking. Results: The risk of COPD is modulated by gender, race, and smoking history, and 14 comorbidities: Viral and chlamydial infections, diabetes mellitus, volume depletion, acute myocardial infarction, intermediate coronary syndrome, cardiac dysrhythmias, heart failure, acute upper respiratory infections, acute bronchitis and bronchiolitis, pneumonia, early or threatened labor, normal delivery, shortness of breath, respiratory distress.

Network Model

Validation Propagation: a Bayesian network can compute the probability distribution of any variable given an instance of some or all the other variables. Test data: a future set of 992 asthmatics (46 COPD, 946 controls) first encounter from Prediction: for each patient, predict the probability of COPD given the other elements in the network (co-morbidities and demographics). Validation: compare the predicted with the observed COPD status.
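Propagation can be illustrated on a deliberately tiny network. The sketch below assumes a three-node chain (smoker → COPD → bronchitis) with made-up conditional probabilities; it is not the network learned from the RPDR data, and it uses brute-force enumeration rather than an efficient propagation algorithm.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative values only).
P_SMOKER = 0.3
P_COPD_GIVEN_SMOKER = {True: 0.20, False: 0.05}
P_BRONCHITIS_GIVEN_COPD = {True: 0.60, False: 0.10}


def joint(smoker, copd, bronchitis):
    """Joint probability of one full assignment, using the chain factorization."""
    p = P_SMOKER if smoker else 1 - P_SMOKER
    p_copd = P_COPD_GIVEN_SMOKER[smoker]
    p *= p_copd if copd else 1 - p_copd
    p_bron = P_BRONCHITIS_GIVEN_COPD[copd]
    p *= p_bron if bronchitis else 1 - p_bron
    return p


def prob_copd(evidence):
    """P(COPD = True | evidence); evidence maps 'smoker'/'bronchitis' to bool."""
    numerator = denominator = 0.0
    for smoker, copd, bronchitis in product([True, False], repeat=3):
        world = {"smoker": smoker, "copd": copd, "bronchitis": bronchitis}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the observed evidence
        p = joint(smoker, copd, bronchitis)
        denominator += p
        if copd:
            numerator += p
    return numerator / denominator


print(prob_copd({}))
print(prob_copd({"smoker": True, "bronchitis": True}))
```

Evidence may cover some or all of the other variables; unobserved variables are summed out, which is what allows a prediction for patients with incomplete records.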

Predictive Validation

One variable at a time

Asthma Exacerbations Asthma attacks involve worsening of asthma symptoms including bronchoconstriction and inflammatory response Major cause of morbidity and mortality in asthma 11.7 million Americans have an exacerbation every year (3.9 million children) In US children, exacerbations are the third leading cause of hospitalizations (198,000 occurrences per year) Cost of asthma exacerbations US=4 billion dollars, Partners=20 million dollars

RPDR Exacerbation Prediction

Genetic Prediction of Asthma Exacerbation Objective Predict asthma exacerbation from genetic data Subjects 290 CAMP participants Not on steroids Followed for 10+ years Have genetic data available Phenotype Case: Reported overnight hospitalization(s) (n=83) Control: No overnight hospitalizations or ER visits (n=207) Genotype 2443 SNPs from 349 candidate genes In Hardy-Weinberg equilibrium among controls Minor allele frequency > 0.05
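The two genotype quality filters can be sketched from genotype counts. The slide does not say which Hardy-Weinberg test or significance level was used, nor in which group the minor allele frequency was computed; the chi-square goodness-of-fit test, the 0.05 level (critical value 3.841 at 1 degree of freedom), and the function names below are assumptions.

```python
CHI2_CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def allele_freqs(n_aa, n_ab, n_bb):
    """Allele frequencies (p for A, q for B) from the three genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return p, 1 - p


def minor_allele_frequency(n_aa, n_ab, n_bb):
    return min(allele_freqs(n_aa, n_ab, n_bb))


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic against Hardy-Weinberg expected genotype counts."""
    n = n_aa + n_ab + n_bb
    p, q = allele_freqs(n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_filters(control_counts, all_counts, maf_min=0.05):
    """Keep a SNP if controls look consistent with HWE and MAF exceeds maf_min."""
    in_hwe = hwe_chi_square(*control_counts) < CHI2_CRITICAL_1DF_05
    return in_hwe and minor_allele_frequency(*all_counts) > maf_min


print(passes_filters((81, 18, 1), (81, 18, 1)))
```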

Exacerbation Model 132 of 2443 SNPs in 55 of 349 genes predict exacerbation

Validation Method: Prediction on fitted values Result: Area under the ROC curve (AUROC) is 0.97 AUROC measures accuracy as a trade-off between sensitivity and specificity AUROC rating scale: Fail, Poor, Fair, Good, Excellent
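AUROC can be computed directly from predicted scores and observed case/control labels: it equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. A minimal sketch with made-up scores (not the study's predictions):

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Hypothetical predicted probabilities for two cases (1) and two controls (0).
print(auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.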

Cross-Validation Method: 20-fold cross-validation to test robustness 1. Data is split into 20 groups 2. One group is held out as the independent test set and the remaining 19 are used to quantify the model 3. Step 2 is repeated until each group has served as the independent set Result: AUROC is 0.84 (good)
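The three steps can be sketched as a generic k-fold loop. The random shuffling, the seed, and the `fit`/`predict` callables are assumptions made for illustration; the slide does not say how the 20 groups were formed.

```python
import random


def k_fold_indices(n_samples, k=20, seed=0):
    """Shuffle sample indices and deal them into k nearly equal groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit, predict, k=20, seed=0):
    """Return one out-of-fold prediction per sample.

    fit(train_indices) builds a model from the other k-1 groups;
    predict(model, index) scores a single held-out sample.
    """
    predictions = [None] * n_samples
    for held_out in k_fold_indices(n_samples, k, seed):
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        model = fit(train)
        for i in held_out:
            predictions[i] = predict(model, i)
    return predictions


# With 290 subjects and 20 folds, each fold holds 14 or 15 subjects.
folds = k_fold_indices(290, k=20)
print(sorted({len(fold) for fold in folds}))
```

Because every prediction comes from a model that never saw that subject, an AUROC computed on these out-of-fold predictions is expected to be lower than one computed on fitted values.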

Partners Asthma DNA collection #1 Recruit Partners asthma patients Partners Asthma Center, NWH, MGH High quality spirometric phenotyping Blood for DNA extraction and storage Children and adults High cost (>$1000/subject) Low intensity 6 months only 100 subjects recruited Doctors and patients need education

Partners Asthma DNA collection #2 Recruit Partners asthma cohort patients Leverage CRIMSON blood samples Leverage data mart for phenotype data Blood for DNA extraction and storage Children and adults cases and controls low cost (<$30/subject) High intensity 9 months >3000 subjects recruited

Figure 1: Data Flow for Asthma DBP. Channing sends ADMPN# to RPDR → RPDR converts ADMPN# to MRN and sends to pathology → Pathology (Crimson) links MRN to Crimson ID# ≡ ADMPN and sends back to Channing with sample for DNA extraction. Figure 1 Legend: Deidentified data file analyzed by Channing; subjects for DNA collection selected. File sent to RPDR, converted back to MR#, and sent to Crimson. Samples identified and given Crimson ID# ≡ ADMPN, and sample sent back to Channing.

Recruitment for DBP from Crimson at BWH: Asthma Cases by Utilization and Race

Recruitment for DBP from Crimson at BWH: Asthma Cases and Controls by Race

Summary of Samples to 04/07/08 High Caucasian: 59; High African American: 111; Low Caucasian: 454; Low African American: 222; Controls Caucasian: 1,341; Controls African American: 880; Running total:

Lessons learned 1 Get what you ask for Regular meetings, regular meetings Negotiate your demands Tools are not enough Leverage your peers Recruiting patients is hard work IRB is hard work

Lessons learned 2 You can never have enough statistics or bioinformatics Genotyping and its technologies are secondary The RPDR data are dirty! Listen to Shawn Be flexible

Summary: Airways disease as a driver for i2b2 “Typical” complex disease challenge Big impact on health care system Potential for large clinical impact Core 1: Extracting phenotypes from free text; statistical models Core 2: Viewer for CRC Core 4: Data provisioning

Conclusions The stronger the existing program, the more successful the I2B2 collaboration Communication is key Fit the question to the data not the other way around Data access will be an issue for the future

Collaborators (and what they did) Scott, Zak, John, and Susanne: money, project management, IRB, and big picture Ross: Channing bioinformatics, file structures, geek to geek translation with the cores, beta testing, 850 collection, IRB, links to other genetic bioinformatics tools and projects Shawn and Vivian: asthma and control data mart Anne, LJ, James: nongenetic predictors in CAMP Marco and Blanca: nongenetic predictors in PAC Marco and Blanca: genetic predictors in CAMP Marco and Blanca: genetic predictors in PAC Lynn: Crimson

Acknowledgments: Ross Lazarus, Susanne Churchill, Blanca E. Himes, Anne Fuhlbrigge, Marco F. Ramoni, LJ Wei, Isaac Kohane, James Sigornivitch, Shawn Murphy, Lynn Bry