Published by Laureen Green. Modified over 10 years ago.
1
AMIA CRI Summit 2011 CRI-09: Cross-Institutional Systems to Support Phenotyping in Biomedical Research: Experiences from the eMERGE Network Luke Rasmussen Marshfield Clinic David Carrell, PhD Group Health Research Institute William Thompson, PhD Northwestern University Hua Xu, PhD Vanderbilt University Jyoti Pathak, PhD Mayo Clinic
2
eMERGE Consortium Principal sponsor: NHGRI with additional funding from NIGMS NIH-funded consortium (CTSA awardee institutions) DNA Biobanks linked to EHR data Consortium members –Group Health of Puget Sound –Marshfield Clinic –Mayo Clinic –Northwestern University –Vanderbilt University
3
QRS duration Dementia Peripheral vascular disease Cataracts Type II diabetes Coordinating center
4
Marshfield Clinic Biobank Population Geographically defined cohort Stable population Minimal selection bias Over 95% of medical events captured in EMR Data All levels of inpatient and outpatient care 5 decades of retrospective clinical data Prospective & continuous data collection via EHR Event, testing, treatment and outcomes represented High utilization of primary care to classify controls Clinical, financial and environment data Health Events
5
eMERGE Contributors NHGRI –Rongling Li –Heather Junkins –Teri Manolio –Jim Ostell Group Health –Eric Larson –Gail Jarvik –Chris Carlson –Wylie Burke –Gene Jart –David Carrell –Malia Fullerton –Walter Kukull –Paul Crane –Noah Weston Northwestern –Rex Chisholm –Bill Lowe –Phil Greenland –Wendy Wolf –Maureen Smith –Geoff Hayes –Pedro Avila –Joel Humowiecki –Jen Allen-Pacheco –Amy Lemke –Will Thompson Marshfield –Cathy McCarty –Peggy Peissig –Luke Rasmussen –Marilyn Ritchie –Justin Starren –Russ Wilke –Dick Berg –Jim Linneman Mayo Clinic –Christopher G. Chute –Iftikhar J. Kullo –Barbara Koenig –Suzette Bielinski –Mariza de Andrade Vanderbilt –Dan Roden –Dan Masys –Josh Denny –Brad Malin –Ellen Wright Clayton –Dana Crawford –Jonathan Haines –Jonathan Schildcrout –Jill Pulley –Melissa Basford –Marilyn Ritchie
6
RFA HG-07-005: Genome-Wide Studies in Biorepositories with Electronic Medical Record Data 2007 NIH Request for Applications from the National Human Genome Research Institute “The purpose of this funding opportunity is to provide support for investigative groups affiliated with existing biorepositories to develop necessary methods and procedures for, and then to perform, if feasible, genome-wide studies in participants with phenotypes and environmental exposures derived from electronic medical records, with the aim of widespread sharing of the resulting individual genotype-phenotype data to accelerate the discovery of genes related to complex diseases.” (Emphasis added)
7
Development and Growth Idea Develop Disseminate More Ideas Issues Pre-existing and new systems/methods Applied to common (yet different) tasks Different locations/environments
8
Tools and Methods
Presenter | Topic
Luke Rasmussen, Marshfield Clinic | Reusable phenotype algorithms: Techniques to facilitate future reuse of phenotype algorithms.
David Carrell, Group Health | Clinical Text Explorer Search Interface: Facilitates exploration of EHR for rapid phenotyping and algorithm refinement.
William Thompson, Northwestern University | clinical Text Analysis and Knowledge Extraction System (cTAKES): Natural language processing (NLP) system utilized for multiple phenotypes, including PAD.
Hua Xu, Vanderbilt University | MedEx: NLP system utilized within eMERGE with additional applications to pharmacogenomic research.
Jyoti Pathak, Mayo Clinic | eleMAP: Facilitates harmonization and standardization of phenotype variables across sites.
9
AMIA CRI Summit 2011 Reusable Phenotype Algorithms Luke Rasmussen Senior Programmer/Analyst Marshfield Clinic Research Foundation Biomedical Informatics Research Center
10
Phenotype Development Multi-disciplinary teams Multiple sites Iterative Intangible → Tangible
11
EMR-based Phenotype Algorithms Typical components –Billing and diagnosis codes –Procedure codes –Labs –Medications –Phenotype-specific co-variates (e.g., Demographics, Vitals, Smoking Status, CASI scores) –Pathology –Imaging? Organized into inclusion and exclusion criteria
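A minimal sketch of how components like those listed above might be organized into inclusion and exclusion criteria. It is illustrative only and is not the eMERGE implementation: the record layout (`dx_codes`, `medications`), the placeholder codes (`DX_A`, `DX_B`), the drug name `drug_x`, and every class and function name are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical patient record: a dict of string lists (diagnosis codes, drug names).
Patient = Dict[str, List[str]]
Criterion = Callable[[Patient], bool]


@dataclass
class PhenotypeAlgorithm:
    """A phenotype definition expressed as inclusion and exclusion criteria."""
    name: str
    inclusion: List[Criterion] = field(default_factory=list)
    exclusion: List[Criterion] = field(default_factory=list)

    def is_case(self, patient: Patient) -> bool:
        # A case meets every inclusion criterion and no exclusion criterion.
        meets_inclusion = all(rule(patient) for rule in self.inclusion)
        hits_exclusion = any(rule(patient) for rule in self.exclusion)
        return meets_inclusion and not hits_exclusion


def has_code(prefix: str) -> Criterion:
    """Criterion: the patient has a diagnosis code starting with `prefix`."""
    def rule(patient: Patient) -> bool:
        return any(code.startswith(prefix) for code in patient.get("dx_codes", []))
    return rule


def on_medication(name: str) -> Criterion:
    """Criterion: the patient's medication list contains `name`."""
    def rule(patient: Patient) -> bool:
        return name in patient.get("medications", [])
    return rule


# Placeholder codes and drug names, not taken from any real algorithm.
toy_algorithm = PhenotypeAlgorithm(
    name="toy phenotype",
    inclusion=[has_code("DX_A"), on_medication("drug_x")],
    exclusion=[has_code("DX_B")],
)

example_patient = {"dx_codes": ["DX_A1"], "medications": ["drug_x"]}
print(toy_algorithm.is_case(example_patient))
```

Keeping each criterion as a small named function is one way to let the same building block appear in more than one phenotype definition.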
12
EMR-based Phenotype Algorithms Iteratively refine case definitions through partial manual review to achieve ~PPV ≥ 95% For controls, exclude all potentially overlapping syndromes and possible matches; iteratively refine such that ~NPV ≥ 98%
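The PPV and NPV targets above can be checked against a manually reviewed sample. The sketch below uses the standard definitions, PPV = TP / (TP + FP) and NPV = TN / (TN + FN). The function names are hypothetical and the review counts are invented for illustration; they are not eMERGE results.

```python
from typing import Iterable, Tuple


def predictive_values(reviews: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (PPV, NPV) from (algorithm_says_case, reviewer_says_case) pairs."""
    tp = fp = tn = fn = 0
    for predicted, actual in reviews:
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one reviewed case and one reviewed control")
    return tp / (tp + fp), tn / (tn + fn)


def meets_targets(ppv: float, npv: float,
                  ppv_target: float = 0.95, npv_target: float = 0.98) -> bool:
    """Compare against the approximate targets named on the slide."""
    return ppv >= ppv_target and npv >= npv_target


# Invented chart-review sample: 50 algorithm-flagged cases, 100 algorithm-flagged controls.
sample = (
    [(True, True)] * 48 + [(True, False)] * 2
    + [(False, False)] * 99 + [(False, True)] * 1
)
ppv, npv = predictive_values(sample)
print(f"PPV={ppv:.2f} NPV={npv:.2f} meets targets: {meets_targets(ppv, npv)}")
```

In an iterative workflow, a check like this would be rerun after each revision of the case or control definition until both targets are met.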
13
Primary Phenotypes
Site | Phenotype | Validation (PPV / NPV)
Group Health | Dementia | 73% / 92%
Marshfield Clinic | Cataracts | 98% / 98%
Marshfield Clinic | Low HDL | 82% / 96%
Mayo Clinic | PAD | 94% / 99%
Northwestern University | Type 2 DM | 98% / 100%
Vanderbilt University | QRS Duration | 97% / 100%
14
Supplemental Phenotypes
Site | Phenotype | Validation (PPV / NPV)
Group Health | WBC | *
Marshfield Clinic | Diabetic Retinopathy | 80% / 98%
Mayo Clinic | RBC | 98% / 94%
Northwestern University | Lipids | 92% / 100%
Northwestern University | Height | 95% / 100%
Vanderbilt University | PheWAS | *
* Not available at this time
15
Phenotype Reuse T2DM Diabetic Retinopathy –Identification of DM –T2DM included T1DM for exclusion Low HDL Lipids
16
Phenotype Reuse T2DM Diabetic Retinopathy
17
Iterative Refinement for Reuse [Diagram labels: Condition - Subtype A; Condition - Subtype B; Condition; Subtype A; Subtype B]
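One possible reading of this slide is that an algorithm written for "Condition - Subtype A" can be split into a shared "Condition" step and separate subtype steps, so that the shared step is reusable elsewhere. The sketch below illustrates that reading only; the record fields (`condition_evidence`, `marker_a`, `marker_b`) and all function names are hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical patient record: boolean flags derived upstream from EMR data.
Patient = Dict[str, bool]
Rule = Callable[[Patient], bool]


def has_condition(patient: Patient) -> bool:
    """Shared step: does the record show evidence of the general condition?"""
    return patient.get("condition_evidence", False)


def is_subtype_a(patient: Patient) -> bool:
    return patient.get("marker_a", False)


def is_subtype_b(patient: Patient) -> bool:
    return patient.get("marker_b", False)


def make_classifier(condition: Rule,
                    subtypes: Dict[str, Rule]) -> Callable[[Patient], Optional[str]]:
    """Compose a shared condition step with interchangeable subtype steps."""
    def classify(patient: Patient) -> Optional[str]:
        if not condition(patient):
            return None  # no evidence of the general condition
        matches = [name for name, rule in subtypes.items() if rule(patient)]
        # Records matching no subtype, or more than one, stay at the general level.
        return matches[0] if len(matches) == 1 else "condition (subtype unresolved)"
    return classify


classify = make_classifier(
    has_condition,
    {"subtype A": is_subtype_a, "subtype B": is_subtype_b},
)

print(classify({"condition_evidence": True, "marker_a": True}))
```

A later phenotype that only needs the general condition could call `has_condition` directly, without rewriting the subtype logic.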
18
Formalizing Reuse Identified potential for reuse Leverage significant work Phenotypes available: www.gwas.org Limitations –Site-specific implementations
19
Impressions Easy to do Fits with eMERGE goals Can fit retrospectively Prospective mindset
20
AMIA CRI Summit 2011 Thank You Luke Rasmussen rasmussen.luke@mcrf.mfldclin.edu