(C) 2010 Copyright - Loyola University Chicago Stritch School of Medicine, All Rights Reserved
© Copyright 2013 – Loyola University Chicago Stritch School of Medicine – All Rights Reserved

Loyola University Chicago Health Sciences Division
Stritch School of Medicine (SSOM)
The Clinical Research Database (CRDB)
January 8, 2014

Speaker:
Ron Price
Associate Dean, Office of Information Systems
Loyola University Chicago Stritch School of Medicine, Maywood, Illinois
and
Associate Vice President, Informatics and Systems Development
Loyola University Chicago Health Sciences Division

What is the Clinical Research Database (CRDB)?
- A large-scale, de-identified clinical data warehouse structured to support a wide range of clinical analytics
- Runs on Hadoop
- CRDB data are accessible via a web-based front end for casual users (e.g., faculty, housestaff and students) and via a wide range of tools for advanced users (e.g., analysts and bioinformatics staff)
- Initial target data loads for the CRDB are from Epic (1/1/2007-9/30/2013)

Why use Hadoop?
- Developed largely at Yahoo in the mid-2000s and used extensively by "big-data" internet companies (and the NSA) to process large amounts (petabytes) of structured and unstructured data
- Hadoop is a data management/processing framework that distributes data storage and processing over clusters of inexpensive computers
- Hadoop's strengths are its ability to scale and to efficiently handle unstructured data (e.g., text reports, images, BLOBs, etc.)
- SSOM's Hadoop environment
  - Development and Production environments
  - Production environment: 12-node cluster (2 namenodes, 1 MySQL server and 9 datanodes)
  - 178 TB of storage (the current core Epic EMR is 4 TB)

Why use Hadoop?
- Hadoop's strengths are its ability to scale and to efficiently handle unstructured data (e.g., text reports, images, BLOBs, etc.)
- “Of the 1.2 billion clinical documents produced in the United States each year, approximately 60 percent contain valuable information trapped in unstructured documents that are unavailable for clinical use, quality measurement and data mining.”*
- Some estimates put this number closer to 80%
* Health Management Technology – June 2012

Why not just use Epic?
- Epic is LUMC's EMR; however, most data originate in local ancillary systems (e.g., Clinical Labs, RIS/PACS, EPS, etc.) and are stored there in their native (e.g., granular or structured) formats
- Epic is optimized for healthcare operations, not for research or population studies
- Activity related to large-scale analytics impacts system performance
- The "10,000 table" issue (actually 11,964 tables!)
- Systems supporting research and population studies need
  - Flexibility to handle "foreign" (e.g., external, multi-center) data
  - Flexibility to handle unstructured data
  - Ability to scale to "big data" levels

CRDB Version 1.0 (July 2013)
- Current data
  - De-identified, with keys held in the Epic Clarity data warehouse
  - Data source: Epic Clarity (updated nightly)
  - Data period: 1/1/2007 through 9/30/2013
  - CRDB updated quarterly (next update mid-March 2014)
  - Data tables
    - Demographics
    - Encounters (Inpatient, Outpatient, ED, Obs and home health)
    - Procedures and clinical lab values
    - Flowsheet measures (vitals, physical findings, etc.)
    - Medications
    - Payor information at encounter level
- CRDB application is widely available on the portal
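
The slides do not describe how de-identification is implemented. As one possible illustration of "de-identified with keys held" in the source warehouse, the sketch below replaces a patient identifier with a random surrogate ID and keeps the mapping in a separate key table. Everything here is an assumption made for illustration: the field names (`mrn`, `name`, `patient_id`), the sample records and the `deidentify` helper are hypothetical and are not taken from the CRDB or from Epic Clarity.

```python
import secrets


def deidentify(records, key_table):
    """Return copies of the records with direct identifiers removed.

    key_table maps MRN -> surrogate ID and is updated in place. In this sketch
    it stands in for the re-identification keys that stay in the source system.
    """
    research_rows = []
    for record in records:
        mrn = record["mrn"]
        if mrn not in key_table:
            # Random surrogate, not derived from the MRN, so it cannot be reversed.
            key_table[mrn] = secrets.token_hex(8)
        row = {k: v for k, v in record.items() if k not in ("mrn", "name")}
        row["patient_id"] = key_table[mrn]
        research_rows.append(row)
    return research_rows


key_table = {}  # kept with the source data, never copied to the research store
source_records = [
    {"mrn": "000123", "name": "Jane Doe", "encounter_type": "ED", "year": 2012},
    {"mrn": "000123", "name": "Jane Doe", "encounter_type": "Inpatient", "year": 2013},
    {"mrn": "000456", "name": "John Roe", "encounter_type": "Outpatient", "year": 2013},
]
research_rows = deidentify(source_records, key_table)

for row in research_rows:
    print(row)
```

Because the same patient always receives the same surrogate ID, encounters can still be linked within the research data, while re-identification requires access to the key table.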

Demonstration of the CRDB

CRDB Version: Future
- Website development activities
  - Request for expedited IRB
  - Refinement of "groupings" for ICD9s, CPTs and providers
- Capture of additional data (current calendar year)
  - Microbiology results and other report text blobs
- End-user Query Tool: additional query parameters and analysis modules
  - Enhanced logic functions (January 2014)
  - CPTs (March 2014)
  - Labs (June 2014)
  - Flowsheet measures (August 2014)
  - Units (October 2014)

Current Usage (July 2013 – Dec 2013)
- Unique CRDB users: 213
- Query Tool CRDB cohort identifications: 302
- CRDB data extracts (since August)
  - 5 large clinical extracts for a recent PCORI grant
  - Large data extract for the Chicago Health Atlas project
  - 2 QI projects
  - 6 medical student/resident clinical research projects

Questions and Answers