Research Data Analytics at Thomas Jefferson University Jack London, PhD Thomas Jefferson University Sidney Kimmel Cancer Center Philadelphia PA USA 2015.

Presentation transcript:

Research Data Analytics at Thomas Jefferson University Jack London, PhD Thomas Jefferson University Sidney Kimmel Cancer Center Philadelphia PA USA 2015 i2b2 European Academic User Group meeting October 6, 2015

2 Disclaimer In addition to my faculty position at Thomas Jefferson University in Philadelphia, I am a consultant for TriNetX Corporation.

3 Thomas Jefferson University and the Sidney Kimmel Cancer Center (SKCC), Philadelphia. Located between New York City and Washington DC. Jefferson Medical College (JMC) was founded in [year missing]. JMC is the second largest private medical school in the U.S. The NCI-designated SKCC has ~400 physicians and scientists dedicated to discovery and development of novel approaches for cancer treatment.

4 SKCC’s IT infrastructure: GE Centricity inpatient EMR; Allscripts outpatient (ambulatory care) EHR; EPIC inpatient and outpatient; Cerner A/P lab system; EPIC Beaker; OpenSpecimen research biobank management; TIES clinical text extraction; i2b2 research data mart; TriNetX data analytics network.

5 Current Jefferson Data Resource Landscape. TJUH CLINICAL DATA WAREHOUSE: DEMOGRAPHICS (gender, race, age, vital status, ethnicity); DIAGNOSES (ICD9); PROCEDURES (ICD9); CLINICAL LABS (LOINC); MEDICATIONS. i2b2 RESEARCH DATA MART. IMPAC METRIQ cancer registry: site, stage, histology, treatment, survival (ICD-O-3). CERNER A/P: “omic” data. FORTE ONCORE: clinical trial data. OPEN SPECIMEN: biospecimen annotation (SNOMED).

6 Jefferson’s i2b2 Research Data Mart. Built on “informatics for integrating biology and the bedside” (i2b2) version [number missing]. RDM data are de-identified. Re-identification is possible via an honest broker, who has access to a re-identification application. Currently > 45 million observations on > 450,000 patients. Data refreshed weekly.
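The honest-broker arrangement described on this slide can be illustrated with a small sketch. This is not Jefferson's re-identification application, which the slides do not describe. The class name `HonestBroker`, the `authorized` flag, and the use of random tokens as study identifiers are all assumptions made for illustration.

```python
import secrets


class HonestBroker:
    """Toy honest broker: holds the only link between study IDs and MRNs.

    Hypothetical design for illustration; not the application used at Jefferson.
    """

    def __init__(self) -> None:
        self._mrn_to_study_id: dict[str, str] = {}
        self._study_id_to_mrn: dict[str, str] = {}

    def study_id_for(self, mrn: str) -> str:
        """Return a stable random study ID for a medical record number."""
        if mrn not in self._mrn_to_study_id:
            study_id = secrets.token_hex(8)  # carries no information about the MRN
            self._mrn_to_study_id[mrn] = study_id
            self._study_id_to_mrn[study_id] = mrn
        return self._mrn_to_study_id[mrn]

    def reidentify(self, study_id: str, authorized: bool) -> str:
        """Map a study ID back to the MRN, but only for an authorized request."""
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return self._study_id_to_mrn[study_id]
```

In such a design the research data mart would store only the study ID, so an investigator querying it never sees the medical record number; the mapping stays with the broker.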

7 Patient data obtained from TJUH EMR. DEMOGRAPHICS: Age, Ethnicity, Gender, Race, Vital Status (alive/dead). DIAGNOSES: Disease systems --> diseases (organized by ICD9 coding). CLINICAL LAB RESULTS: Chemistry, Coagulation, Hematology. MEDICATIONS: Anti-neoplastic. INPATIENT PROCEDURES: Diagnostic and Treatment procedures (organized by ICD9 coding).

8 Patient mutation data obtained from Pathology Molecular Diagnostic Testing (both outsourced and in-house)
ALK rearrangement
BRAF c.1782T>G p.D594E
BRAF c.1801A>G p.K601E
BRAF c.1799T>A p.V600E
EGFR Deletion in exon 19
EGFR Insertion in exon 20
EGFR c.2236G>A p.E746K
EGFR c.2236_2250del15 p.E746_A750delELREA
EGFR c.2156G>C p.G719A
EGFR c.2155G>T p.G719C
EGFR c.2155G>A p.G719S
EGFR c.2573T>G p.L858R
EGFR c.2582T>A p.L861Q
EGFR c.2303G>T p.S768I
JAK2 c.1849G>T p.V617F
JAK3 c.2164G>A p.V722I
KRAS c.35G>C p.G12A
KRAS c.34G>T p.G12C
KRAS c.35G>A p.G12D
KRAS c.34G>C p.G12R
KRAS c.34G>A p.G12S
KRAS c.35G>T p.G12V
KRAS c.38G>A p.G13D
NRAS c.183A>T p.Q61H
NRAS c.181C>A p.Q61K
NRAS c.182A>T p.Q61L
NRAS c.182A>G p.Q61R
PIK3CA c.1633G>A p.E545K
PIK3CA c.3140A>T p.H1047L
PIK3CA c.3140A>G p.H1047R
PTEN c.754G>T p.D252Y
PTEN c.59G>A p.G20E
RET rearrangement
ROS1 rearrangement
SMAD4 c.1157G>A p.G386D
TP53 c.843C>A p.D281E
TP53 c.811G>T p.E271*
TP53 c.857A>C p.E286A
TP53 c.400T>C p.F134L
TP53 c.734G>A p.G245D
TP53 c.388C>G p.L130V
TP53 c.524G>A p.R175H
TP53 c.817C>T p.R273C
TP53 c.818G>A p.R273H
TP53 c.318C>G p.S106R
TP53 c.659A>G p.Y220C
TP53 c.707A>G p.Y236C
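Most entries on this slide follow one pattern: gene symbol, cDNA change (`c.`), protein change (`p.`). As a rough illustration, such entries could be split into fields before loading them into an ontology. The `Variant` type and `parse_variant` function are hypothetical names, and the pattern only covers the point-mutation style entries; free-text entries such as "ALK rearrangement" are deliberately left unparsed.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    gene: str
    cdna: str
    protein: str


# Gene symbol, then "c." change, then "p." change; the spaces between are optional
# so that run-together text such as "BRAFc.1799T>Ap.V600E" is also accepted.
_VARIANT = re.compile(
    r"^(?P<gene>[A-Z][A-Z0-9]*?)\s*(?P<cdna>c\.\S+?)\s*(?P<protein>p\.\S+)$"
)


def parse_variant(text: str) -> Optional[Variant]:
    """Return the parsed fields, or None when the entry is not in c./p. form."""
    match = _VARIANT.match(text.strip())
    if match is None:
        return None
    return Variant(match["gene"], match["cdna"], match["protein"])
```

For example, `parse_variant("BRAF c.1799T>A p.V600E")` yields the gene `BRAF`, the cDNA change `c.1799T>A` and the protein change `p.V600E`, while `parse_variant("RET rearrangement")` returns `None`.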

9 Molecular Diagnostics ontology

10 Specimen annotation from campus biobanks: anatomic origin (SNOMED); class (tissue, fluid); type (frozen, FFPE); pathology (normal, malignant, diseased); slide images. Eight biobanks, including the TJUH paraffin block archive of ~400,000 cases since 1990.

11 Specimen annotation management. Biobanks: TJUH clinical paraffin block archive; Pathology Department research tissue bank; Brain tumor bank (J. Evans, PI); Pancreatic tumor bank (C. Yeo, PI); Breast tumor bank (J. Palazzo, PI); Thyroid tumor bank (E. Pribitkin, PI); Brain tumor bank (D. Andrews, PI); Liver tumor bank (V. Navarro, PI). Jefferson integrated Research Specimen management (OpenSpecimen): > 230,000 patients, > 650,000 specimens. > 100,000 patients via i2b2 RDM: cancer patients having comprehensive annotation from the Tumor Registry and banked specimens.

12 Biospecimen ontology

13 Pathology images are available via i2b2 query tool

14 Patient data from Jefferson Tumor Registry: primary cancer diagnosis; age at diagnosis/date of diagnosis; survival (months) from diagnosis; tumor histology and behavior; stage (AJCC/TNM, clinical and pathological); grade; recurrence (local, distant); treatment (chemotherapy, radiation, surgery, transplant, palliative); disease-specific factors (e.g. prostate --> Gleason score). Over 100,000 cases since 1990.

15 Tumor Registry ontology

16 Typical SKCC Investigator Queries. Example #1: Form a cohort of “triple negative” (estrogen receptor, progesterone receptor, and HER2 negative) African American patients having matched normal and malignant frozen tissue specimens. Example #2: Form a cohort of patients with a primary diagnosis of papillary thyroid cancer who carry a BRAF V600E mutation.
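A cohort query like Example #2 amounts to finding patients who have a fact for every required concept. The sketch below shows that idea on an in-memory SQLite table that mimics only two columns (`patient_num`, `concept_cd`) of i2b2's `observation_fact` table. The concept codes, the demo rows, and the helper names are invented for illustration; a real i2b2 installation resolves concepts through its ontology and query tool rather than through hand-written SQL.

```python
import sqlite3

# Hypothetical concept codes; a real i2b2 ontology defines its own.
PAPILLARY_THYROID = "DEMO:PAPILLARY_THYROID_CA"
BRAF_V600E = "DEMO:BRAF_V600E"


def build_demo_mart() -> sqlite3.Connection:
    """Create a toy stand-in for the fact table with a few made-up patients."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)"
    )
    conn.executemany(
        "INSERT INTO observation_fact VALUES (?, ?)",
        [
            (1, PAPILLARY_THYROID),
            (2, PAPILLARY_THYROID),
            (2, BRAF_V600E),
            (3, BRAF_V600E),
        ],
    )
    return conn


def cohort(conn: sqlite3.Connection, concept_codes: list[str]) -> list[int]:
    """Return patients having at least one fact for every listed concept."""
    codes = sorted(set(concept_codes))
    if not codes:
        return []
    placeholders = ",".join("?" for _ in codes)
    sql = (
        "SELECT patient_num FROM observation_fact "
        f"WHERE concept_cd IN ({placeholders}) "
        "GROUP BY patient_num "
        "HAVING COUNT(DISTINCT concept_cd) = ? "
        "ORDER BY patient_num"
    )
    return [row[0] for row in conn.execute(sql, [*codes, len(codes)])]
```

With the demo rows, `cohort(build_demo_mart(), [PAPILLARY_THYROID, BRAF_V600E])` selects only the one made-up patient who has both facts.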

17 Additional data on selected cohort can be retrieved

18 Example data summaries from the i2b2 RDM: CLINICAL DIAGNOSES OF TJUH PATIENTS WITH THYROID SPECIMENS

19 Jefferson – TriNetX project. In the fall of 2014, the SKCC informatics group entered into a collaboration with a Cambridge, Massachusetts-based start-up company, TriNetX, Inc. TriNetX facilitates collaboration between pharmaceutical companies and academic healthcare providers through the creation of a global, federated data network that connects academic and industry clinical researchers in real time to the patient populations they are attempting to study. The TriNetX application accesses a site’s i2b2 database and displays aggregate query results in an advanced, flexible manner.

20 TriNetX application offers an alternative query tool with enhanced data visualization: Google-like query interface; graphic result display

21 TriNetX application offers an alternative query tool with enhanced data visualization: interactive display capability

22 Cohort definition via i2b2 can be used to predict accrual for proposed clinical trials

23 Problem confronting clinical trials research: studies that fail to accrue. An Institute of Medicine report [1] on cancer cooperative group trials found that 40% were never completed because of failure to achieve minimum accrual goals: “The ultimate inefficiency is a clinical trial that is never completed because of insufficient patient accrual, and this happens far too often.” These non-accruing trials are often kept open for many months before closure, consuming personnel resources in their setup and operation at a significant cost to institutions, without providing any return in definitive research findings. Furthermore, while many of these trials register zero patients, others accrue some patients, resulting in thousands of patients nationwide who are recruited to unproductive research studies. References: [1] Nass SJ, Moses HL, Mendelsohn J, editors. Committee on Cancer Clinical Trials and the NCI Cooperative Group Program, Board on Health Care Services. A National Cancer Clinical Trials System for the 21st Century: Reinvigorating the NCI Cooperative Group Program. Washington DC: National Academies Press. [2] Cheng S, Dietrich M, Finnigan S, Sandler A, Crites J, Ferranti L, Wu A, Dilts D. A sense of urgency: Evaluating the link between clinical trial development time and the accrual performance of CTEP-sponsored studies. ASCO Annual Meeting Proceedings. J of Clinical Oncology, 2009.

24 Study design The overall objective of this study was to evaluate whether accrual for proposed cancer clinical trials could be predicted by performing cohort queries that are based on the trial’s eligibility criteria on recent patient data in Jefferson’s i2b2 research data mart (RDM), created from de-identified integrated hospital clinical, tumor registry, and specimen data. To determine the ability of the i2b2 RDM to predict accrual for prospective trials, we retrospectively used the RDM to obtain patient populations for two years prior to recent trials and compared these cohort sizes to the actual accrual observed after the trial was opened. We considered 90 interventional cancer trials opened at KCC in the years 2008, 2009, and 2010, since these have been open for at least two years and their accrual performance could be evaluated.

25 Study methodology
o We constructed RDM cohort queries corresponding to the trial eligibility criteria for the two years prior to each trial’s opening (e.g., we considered TJUH patient populations from 2007 and 2008 for trials opened in 2009).
o We computed an annual cohort size by averaging the cohort counts for those two years.
o We then compared our RDM annual cohort size for the 2 years preceding a trial’s opening to the annual target goal for that trial and the trial’s actual accrual performance.
Since we initially assumed that 50% of eligible participants would enroll in a study, the RDM cohort would have to be at least twice the accrual goal for a prediction of “successful” trial accrual. We defined a trial’s actual accrual performance as “successful” if it accrued at least 80% of its target enrollment.
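The decision rule on this slide can be written down in a few lines. The following is a minimal sketch of that rule as stated (average the two yearly cohort counts, assume 50% enrollment, call a trial successful at 80% of target); the function and parameter names are illustrative, not taken from the study.

```python
def annual_cohort_size(yearly_counts: list[int]) -> float:
    """Average the eligible-patient counts from the years before opening."""
    return sum(yearly_counts) / len(yearly_counts)


def predict_success(
    yearly_counts: list[int], annual_goal: int, enrollment_rate: float = 0.5
) -> bool:
    """Predict success when expected enrollees cover the annual accrual goal.

    With enrollment_rate = 0.5 this is the same as requiring the annual
    cohort to be at least twice the goal.
    """
    return annual_cohort_size(yearly_counts) * enrollment_rate >= annual_goal


def actual_success(accrued: int, target: int, threshold: float = 0.8) -> bool:
    """A trial counts as successful if it reached 80% of its target."""
    return accrued >= threshold * target
```

For a trial opened in 2009 with a hypothetical 30 eligible patients in 2007 and 50 in 2008, the annual cohort is 40, so the rule predicts success for an annual goal of 20 but not for a goal of 21.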

26 Results. To assess the predictive precision of our proposed project, a contingency table was produced for the 90 trials analyzed. A trial was denoted as potentially successful in meeting its annual target accrual (“PREDICTED SUCCESS” row) if the retrospective i2b2 cohort analysis indicated sufficient patients for the trial. A trial was denoted as actually successful in meeting its annual target accrual (“ACTUAL SUCCESS” column) if the trial satisfactorily approached the protocol’s stated target annual accrual. Contingency table comparing i2b2 accrual predictions with actual accrual success, assuming only 50% of potential participants identified by i2b2 are enrolled. Our methodology has 0.969 (= 31/32 trials) accuracy (95% C.I. (0.908, 1)) for predicting successful accrual (i.e. specificity) and 0.397 (= 23/58 trials) accuracy (95% C.I. (0.271, 0.522)) for predicting failed accrual (i.e. sensitivity). The positive predictive value, or precision rate, is 0.958 (= 23/24 trials) (95% C.I. (0.878, 1)).

27 Results. Our results show that the methodology has an excellent positive predictive value (95.8%: 23 of the 24 trials predicted to fail actually failed) but low sensitivity for failed accrual (39.7%: it flagged only 23 of the 58 trials that failed). In other words: if the methodology predicts "failed accrual," then we should trust this prediction and should not proceed to open the trial with its current eligibility criteria; however, a prediction of accrual success using this method is no guarantee that target goals will be met.
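The quoted rates follow from the four cell counts implied by the fractions in the results (23/24, 23/58, 31/32 of 90 trials), with failed accrual as the positive class. The slides do not say how the confidence intervals were computed; the sketch below assumes a normal-approximation (Wald) interval clipped to [0, 1], which happens to reproduce the quoted bounds. The names used are illustrative.

```python
from math import sqrt

# Cell counts implied by the reported fractions (positive class = failed accrual).
TP = 23  # predicted failure, trial failed
FP = 1   # predicted failure, trial succeeded (24 predicted failures - 23)
FN = 35  # predicted success, trial failed (58 failed trials - 23)
TN = 31  # predicted success, trial succeeded


def wald_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using a clipped normal approximation."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


sensitivity = wald_interval(TP, TP + FN)  # share of failed trials that were flagged
specificity = wald_interval(TN, TN + FP)  # share of successful trials predicted to succeed
ppv = wald_interval(TP, TP + FP)          # share of predicted failures that did fail

if __name__ == "__main__":
    for name, (p, lo, hi) in [
        ("sensitivity", sensitivity),
        ("specificity", specificity),
        ("PPV", ppv),
    ]:
        print(f"{name}: {p:.3f} (95% C.I. {lo:.3f}, {hi:.3f})")
```

With only 24 and 32 trials in two of the denominators, an exact or Wilson interval would arguably be a better choice than the Wald approximation; it is used here only because it matches the reported figures.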

28 How can this methodology be useful? A benefit of analyzing potential trial accrual during the protocol design phase is that it offers an opportunity to “tweak” eligibility rules when insufficient patient cohorts are found. A change in participation criteria that does not impact significantly on the scientific objectives of the trial may provide a sufficiently large potential patient pool. Not opening the 23 trials that were correctly predicted to fail to accrue over the 3 years studied would have prevented the waste of about $200,000 in trial startup costs alone, and the participation of 57 patients in studies which did not contribute to advancing science or clinical care.

Selected areas of research using RDM: Hallgeir Rui, MD, PhD: Molecular Cancer Epidemiology, cancer pharmacogenetics, individualised cancer risk assessment and prognostication. Raphael E. Bonita, MD: Jefferson Heart Institute, correlation of troponin levels and heart failure in transplant patients. Hushan Yang, PhD: Molecular Cancer Epidemiology. Jordan Winter, MD: Surgery, Whipple procedure survival study. Scott Waldman, MD, PhD: Pharmacology and experimental therapeutics. Ron Myers, PhD: Gene environmental risk assessment. Stephen Peiper, MD: Biomarker discovery using Next Generation Sequencing.