Collaborative Research Groups and the DRN OC Keith Marsolo PCORnet DRN OC
Introduction The purpose of this webinar is to define PCORnet, specifically in terms of the Distributed Research Network. To complete the picture of what types of questions are currently answerable, we will discuss: The data in the PCORnet Common Data Model (availability and completeness) Tools that exist to query the data We will also discuss: Plans and processes for additions to the CDM as part of version 3.2 Interaction of CRGs with the Front Door (FD) for query requests
PCORnet at a glance (as of February 2017) More than 110 million patients with an ambulatory visit, inpatient admission, or ED visit in the past 5 years More than 55 million patients with an ambulatory visit, inpatient admission, or ED visit in the past year Data standardized to the PCORnet Common Data Model at ~80 DataMarts Data are routinely updated on a quarterly basis Recency and completeness of data varies by DataMart Data generally available from 2010 onward (varies by DataMart and domain)
PCORnet is… A Distributed Research Network Network partners maintain possession of their own data Network partners provide aggregate results, not patient-level data, to the Coordinating Center for non-study queries Predominantly EHR data Standardized across diverse health systems Biased toward patients who seek care Highly heterogeneous Well-suited for counts & summary statistics
PCORnet may… (for a subset of DataMarts) Capture claims/insurance data (~10 DataMarts) Reflect integrated EHR and claims data A small number of partners have partially integrated EHR and claims data
PCORnet is not well-suited for… Population-based estimates of incidence and prevalence Selected sample of patients who seek at least some of their care from participating sites. Complete longitudinal data capture This is highly variable across partners and not well-described Researchers should be prepared to account for loss to follow-up and varying completion rates Identifying the absence of disease, exposures, or events
PCORnet Common Data Model (CDM)
PCORnet CDM A Common Data Model (CDM) is a way of organizing data into a standard structure. The CDM makes it easier for PCORnet networks to share information with each other by setting common definitions and organizing data so that: PCORnet can analyze data more quickly. Different database (RDBMS) platforms can be used by networks to organize their data (data also standardized in SAS to facilitate data analysis). Networks can analyze their data more easily and efficiently. Network currently operates on version 3.0 Version 3.1 was released in November and will be characterized in August The approach PCORnet is using for its CDM mirrors the approaches used by other large national research consortia.
PCORnet Common Data Model v3.0 15 PCORnet tables
Key Points About Each Table
Across all Tables Percent of DataMarts that populated the table does not mean that all of the fields in the table are populated (Appendix slides report missingness rates) There is significant diversity in how well-populated each field is within a table Keep in mind that all IDs (provider ID, patient ID, facility ID, etc.) are pseudoidentifiers National Provider Identifiers (NPIs) are not captured Statistics reflect data as of February 2017
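As an illustration of why a populated table does not imply populated fields, a per-field missingness rate can be sketched as below. This is a minimal sketch, not the Coordinating Center's characterization code: the rows, the field names, and the rule that `None` or an empty string counts as missing are all assumptions made for the example.

```python
def missingness_rates(rows, fields):
    """Return the fraction of rows in which each field is missing.

    Assumption: a value is "missing" when it is None or an empty string.
    Real DataMarts may also use explicit unknown/no-information codes.
    """
    total = len(rows)
    rates = {}
    for field in fields:
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


# Hypothetical demographic rows, for illustration only.
demographic = [
    {"PATID": "p1", "SEX": "F", "RACE": "05", "HISPANIC": ""},
    {"PATID": "p2", "SEX": "M", "RACE": None, "HISPANIC": "N"},
    {"PATID": "p3", "SEX": "F", "RACE": None, "HISPANIC": None},
    {"PATID": "p4", "SEX": "M", "RACE": "03", "HISPANIC": "Y"},
]

rates = missingness_rates(demographic, ["SEX", "RACE", "HISPANIC"])
```

Here the table is fully "populated" (four rows), yet half of the Race and Hispanic values are missing.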
Demographics Table All DataMarts have populated this table Hispanic and Race are highly missing, but are likely better populated after 2011 because of Meaningful Use standards* * Requirements were finalized Sept 2010, but different providers/professionals complied with MU Stage 1 at different times Key Fields Birth Date Sex Race Hispanic
Enrollment Table All DataMarts have populated this table Most DataMarts define enrollment by the minimum and maximum encounter date for each patient; therefore this table may not be useful in the typical sense of an “enrollment period” Key Fields Start & End Date Basis (insurance, geography, algorithmic, encounter-based)
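The encounter-based definition can be sketched as follows. This is a hypothetical illustration of taking the minimum and maximum encounter date per patient, not any DataMart's actual ETL; the function name and the input shape (patient ID, encounter date) are assumptions.

```python
from datetime import date


def encounter_based_enrollment(encounters):
    """Map each patient ID to (first encounter date, last encounter date)."""
    spans = {}
    for patid, encounter_date in encounters:
        if patid in spans:
            start, end = spans[patid]
            spans[patid] = (min(start, encounter_date), max(end, encounter_date))
        else:
            spans[patid] = (encounter_date, encounter_date)
    return spans


# Hypothetical encounters, for illustration only.
encounters = [
    ("p1", date(2012, 3, 1)),
    ("p1", date(2016, 11, 20)),
    ("p2", date(2015, 6, 5)),
    ("p1", date(2010, 1, 15)),
]

spans = encounter_based_enrollment(encounters)
```

A patient with a single visit (p2) gets a zero-length span, and a patient with visits years apart (p1) appears continuously "enrolled" even if care was received elsewhere in between, which is why these spans should not be read as insurance-style enrollment periods.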
Encounter Table All DataMarts have populated this table Encounter types of primary interest are Ambulatory, ED and Inpatient 90% of DataMarts have all 3 of these encounter types Provider & Facility IDs (these are pseudoidentifiers; not NPIs) DRG is highly missing Key Fields Admit & Discharge Date Discharge Disposition (discharged alive or expired) Encounter type Discharge Status (e.g. nursing home, expired, hospice) Provider & Facility ID Diagnosis Related Group (DRG)
Diagnosis Table All DataMarts have populated this table Diagnosis coded as ICD-9 ICD-10 SNOMED (infrequently used) Principal flag Only relevant for inpatient encounters Well-populated by approximately 75% of DataMarts Key Fields Encounter type Principal diagnosis flag Diagnosis code
Procedures Table All DataMarts have populated this table Procedures coded as: ICD-9, ICD-10, CPT/HCPCS, LOINC, NDC, Revenue NDC/LOINC/Revenue are allowable but generally not included May include orders for lab tests Injectable/infused outpatient medications are sometimes captured in this table Key Fields Encounter type Procedure code Procedure source (billing, order, claim)
Vitals Table Approximately 95% of DataMarts have populated this table Smoking & Tobacco – likely better populated after 2011 due to Meaningful Use standards* * Requirements were finalized Sept 2010, but different providers/professionals complied with MU Stage 1 at different times Key Fields Height BMI Weight Smoking Diastolic & systolic blood pressure Tobacco
Medications – Prescribing Table Approximately 90% of DataMarts have populated this table Prescription orders are coded as RxCUIs, and DataMarts map their source data to these codes Multiple RxCUIs may exist for a given medication order. Preferred hierarchy is provided in the Implementation Guidance Dose information is not explicitly collected but is captured as part of the preferred RxCUI The reliability of mapping at this level is uncertain Prescription end date, frequency, and days supply are highly missing Includes orders for outpatient medication dispensing. May include orders for inpatient medication administration. Off-the-shelf tool to query this table is anticipated May-August 2017 Key Fields RxCUI Quantity Order date Refills Prescription start & end date Days supply Frequency
Medications – Dispensing Table Approximately 50% of DataMarts have populated this table Only captures outpatient medications* Likely incomplete data capture (DataMarts may only receive certain dispensing feeds) Dose information is not explicitly collected but is captured as part of the NDC Off-the-shelf tool to query this table is anticipated May-August 2017 Key Fields NDC Days supply Dispense date Amount *The PCORnet CDM does not currently capture inpatient medication administration
Lab_Result Table Approximately 90% of DataMarts have populated this table The following labs were prioritized for population, although others may be included: A1C, Creatinine, Troponin (Trop T quan, Trop T qual, Trop I), INR, HGB, LDL, Creatine kinase (CK, CK_MB, CK_MBI) Labs are coded as LOINC codes, and DataMarts map their source data to these codes A study interested in lab values should plan to do a study-specific data characterization first to better understand the data for the specific lab Key Fields LOINC Specimen Date Result Result date Reference range
Condition Table Approximately 70% of DataMarts have populated this table There is no standardization for the condition variable No off-the-shelf tool to query this table Key Fields Condition Onset & resolve date Report date Status (e.g. active) Source (e.g. patient-reported)
PRO_CM Table Approximately 10% of DataMarts have populated this table Primarily PROMIS items (small subset) No off-the-shelf tool to query this table
Death Table Approximately 80% of DataMarts have populated this table Source: deaths reported in EHR and additional sources that DataMarts have access to (e.g. NDI, state death files) No off-the-shelf tool to query this table Key Fields Death date Source
Death Cause Table Approximately 30% of DataMarts have populated this table Cause of death coded as: ICD-9 ICD-10 No off-the-shelf tool to query this table Key Fields Cause of death Source
Additional Tables Harvest Contains information about the specific PCORnet DataMart implementation PCORnet Trial Populated for patients in clinical trials
Querying the Data in the PCORnet CDM
Overview of Query Process Code is developed centrally by the Coordinating Center (CC) Query package and results are sent through a secure user portal (PopMedNet) Querying steps: CC sends query to participating DataMarts DataMarts run the query DataMarts upload aggregate results to PopMedNet CC compiles results from DataMarts and sends to requestor Currently only CDRNs are queried via PCORnet query tool
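The querying steps above can be sketched as a toy simulation. This is only an illustration of the distributed pattern (the query runs locally, and only aggregates are returned and compiled); it does not use PopMedNet or the PCORnet Query Tool, and every function name, site name, and record in it is hypothetical.

```python
def run_local_query(patients, min_age):
    """Runs at a DataMart: patient-level rows stay local, only a count leaves."""
    return {"n_patients": sum(1 for p in patients if p["age"] >= min_age)}


def compile_results(responses):
    """Runs at the Coordinating Center: combine the aggregate responses."""
    per_site = {site: r["n_patients"] for site, r in responses.items()}
    return {"per_site": per_site, "total": sum(per_site.values())}


# Hypothetical DataMarts with patient-level data that never leaves the site.
datamarts = {
    "site_a": [{"age": 70}, {"age": 34}, {"age": 66}],
    "site_b": [{"age": 81}, {"age": 12}],
}

# Steps 1-3: the same query is sent to each DataMart, run locally, and
# only the aggregate result is returned.
responses = {
    site: run_local_query(rows, min_age=65) for site, rows in datamarts.items()
}

# Step 4: the Coordinating Center compiles results for the requestor.
summary = compile_results(responses)
```

The Coordinating Center only ever sees the per-site counts in `responses`, which mirrors the point that network partners provide aggregate results rather than patient-level data for non-study queries.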
PCORnet Query Types: Menu-Driven Queries Simple interface for query creation incorporated into the PCORnet Query Tool Query securely distributed to sites for execution against the local database (RDBMS) in PCORnet CDM format Query generates aggregate data for local review and secure return Asynchronous by design
PCORnet Query Types: SAS Queries SAS code package designed to execute against SAS datasets in PCORnet CDM format Distributed via the PCORnet Query Tool Downloaded and executed locally Response uploaded and returned via Query Tool
Issues to Consider for Distributed Querying Complexities to consider when developing a stable, distributed querying infrastructure Common Data Model conformance Local system implementation variability Source data size Programming efficiency Transparency Local system implementation variability Software and hardware Computing environments IT environments and configuration Local expertise
Next Steps – CDM v3.2 The DRN OC will develop the next version of the CDM. This expansion will include: additional elements to existing tables (e.g., more demographics, vitals, lab results) 1-2 new tables (may be optional); potentially an observation table This expansion needs to be informed by the CRGs All CRGs submit a request to DRN OC for elements/tables they need by May 31, 2017 with: A request form template for addition of data elements Detailed description of new tables/domains – data elements, potential source, any terminology/vocabulary standards, availability across partners, etc.
Next Steps – CDM v3.2 Form available on iMeet: https://pcornet.imeetcentral.com/p/aQAAAAAC-fU_
Next Steps – CDM v3.2 Provide as much detail as possible: Will help with assessment Will assist with future studies that seek to use these elements DRN OC will review all requests and identify overlap and opportunities for harmonization Draft CDM specification will be released & follow existing CDM processes Final draft specification will be submitted to the PCORnet Executive Committee for review and approval
Beyond CDM v3.2 Will not be able to incorporate everything into CDM v3.2 as a required element Observation table would provide a place for CRGs / studies to store elements of interest Studies could provide funding for partners to obtain these additional elements Other alternative will be to collect data as part of prospective clinical trials & merge with CDM
Next Steps – CRG FD requests Multi-phased approach Phase I: Table 1 queries for each CRG using PROCEDURES and DIAGNOSES Phase II: After CDM expansion and additional tool development, we can reassess additional Table 1 queries
Missingness Rates extra