EXPLORING DATA AND COHORT DISCOVERY IN THE SYNTHETIC DERIVATIVE.

1 EXPLORING DATA AND COHORT DISCOVERY IN THE SYNTHETIC DERIVATIVE

2

3 Feasibility & Hypothesis Testing: The RecordCounter
The Synthetic Derivative Record Counter (RecordCounter) provides exploratory data figures and counts to members of the VU research community for research planning and feasibility assessment.
- Available to anyone with a VUNET ID
- The user enters basic medical data, such as ICD-9 codes or text keywords (e.g., lung cancer), along with demographic information, and then searches the Synthetic Derivative database to determine the approximate number of records that meet those criteria
- Users can start investigating immediately

4 Secondary Use of Clinical Data: What is the Synthetic Derivative (SD)?
- Rich, multi-source database of de-identified clinical and demographic data
- User interface tool that can be used for access and analysis
- Services are available to help deliver results for non-standard queries (temporal queries, controls matching, etc.)
- Contains ~2.3 million records; ~1 million have detailed longitudinal data, averaging 100k bytes in size and 27 codes per record
- Records are updated over time and are current through 6/30/2014, soon to be updated to 10/31/2014

5 The RecordCounter vs. the SD Counts
- The RecordCounter: users enter search criteria to return exploratory counts. The results are not exact and are meant for a high-level assessment of the available data.
- The SD: users enter search criteria to return exact counts and the associated longitudinal data for review.

6 What is the Research Derivative (RD)?
- Fully identified repository of integrated clinical data with tight IRB/DUA access requirements
- Contains ~2.3 million records
- Updated regularly; typically about 4 weeks behind the present date
- No tool supports the Research Derivative; all access to the data must go through programming support
The Synthetic Derivative has proven transformative, but it cannot support:
1. Seasonality studies
2. Outbreaks and other date-specific studies (catastrophes, etc.)
3. Finding a specific patient (e.g., to contact)

7 What is BioVU?
BioVU is the Vanderbilt biorepository of DNA extracted from discarded blood collected during routine clinical testing and linked to information in the Synthetic Derivative.
- Current sample number: 212,059
- Adult samples: 168,014
- Pediatric samples: 23,280

8 Resources for EMR-based research at VUMC
- The Synthetic Derivative: a de-identified and continuously updated version of the EMR (~2.3 M records)
- BioVU: >212,000 DNA samples available; expansion efforts underway
- Redeposited genotypes: >13,000 subjects with GWAS data; >60,000 subjects with any genotyping; >8,000,000,000 genotypes

9 Diagram of the three resources:
- Record Counter (Feasibility/Hypothesis)
- Synthetic Derivative (De-ID EMR Information)
- BioVU = SD + Genotyping Data

10
1) Self-service tools available at no or low cost for researchers.
2) Customized tools and data extraction services, provided under a fee-for-service agreement in which researchers sponsor ORI programmers, when existing self-service tools are not adequate to fulfill complex use cases.

11 Scientific Portfolio

12 Synthetic Derivative Data Types
- Documents, such as clinical notes, discharge summaries, history and physicals, problem lists, surgical reports, operative notes, progress notes, and letters
- Diagnostic codes, procedural codes
- Reports (pathology, ECGs, echocardiograms)
- Lab values and vital signs
- Medications
- TraceMaster (ECGs)
- Tumor registry

13 Technology + policy
De-identification
- Derivation of a 128-character identifier (RUI) from the MRN, generated by the Secure Hash Algorithm (SHA-512)
- HIPAA identifiers removed using a combination of custom techniques and established de-identification software
Date shift
- Our algorithm shifts the dates within a record by a time period (up to 364 days backwards) that is consistent within each record but differs across records
Restricted access & continuous oversight
- Access restricted to VU; not a public resource
- IRB approval for study (non-human)
- Data Use Agreement
- Audit logs of all searches and data exports
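The two de-identification steps on this slide can be illustrated with a minimal sketch. It is not the SD's actual implementation: the function names, the optional `salt` parameter, the sample MRN, and the way the per-record offset is derived from the hash are all assumptions made for illustration. What comes from the slide is only that SHA-512 yields a 128-character identifier and that every date in a record moves backwards by the same amount, up to 364 days.

```python
import hashlib
from datetime import date, timedelta


def derive_rui(mrn: str, salt: str = "") -> str:
    """Return the SHA-512 hex digest of salt + MRN (128 hex characters).

    The salt is a hypothetical addition; the slide only names SHA-512.
    """
    return hashlib.sha512((salt + mrn).encode("utf-8")).hexdigest()


def shift_days_for(rui: str) -> int:
    """Map a record identifier to a fixed offset of 1 to 364 days.

    Deriving the offset from the identifier is an assumption; it simply
    guarantees the same record always gets the same shift.
    """
    return int(rui[:8], 16) % 364 + 1


def shift_date(original: date, rui: str) -> date:
    """Move a date backwards by the record's fixed offset."""
    return original - timedelta(days=shift_days_for(rui))


# Hypothetical record: two events one week apart.
rui = derive_rui("0012345", salt="example-secret")
admit = date(2014, 3, 1)
discharge = date(2014, 3, 8)
shifted_admit = shift_date(admit, rui)
shifted_discharge = shift_date(discharge, rui)

# Intervals inside the record survive the shift.
print(shifted_discharge - shifted_admit == discharge - admit)
```

Because the offset is constant within a record, intervals between events are preserved, while absolute calendar dates are not. That is why the SD cannot support seasonality or other date-specific studies.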

14 Creating Phenotypes
- Definition of phenotype for cases and controls is critical; may require consultation with experts
- A basic understanding of data elements, and of the uses and limitations of particular data points, is important
- Reviewing records manually to make a case determination (or even to calculate the positive predictive value, PPV, of a search methodology) will be somewhat time consuming

15 The problem with ICD9 codes
ICD9 codes give both false negatives and false positives.
False negatives:
- Outpatient billing is limited to 4 diagnoses per visit
- Outpatient billing is done by physicians (e.g., it takes too long to find the unknown ICD9)
- Inpatient billing is done by professional coders, who omit codes that don't pay well and can only code problems explicitly mentioned in the documentation
False positives:
- Diagnoses evolve over time: physicians may initially bill for suspected diagnoses that are later determined to be incorrect
- Billing the wrong code (perhaps it is easier to find for a busy clinician)
- Physicians may bill for a different condition if it pays for a given treatment. Example: anti-TNF biologics (e.g., infliximab) were originally not covered for psoriatic arthritis, so rheumatologists would code the patient as having rheumatoid arthritis

16 Phenotyping Approach: Algorithm Development
Flowchart steps:
- Identify phenotype of interest
- Case & control algorithm development and refinement
- Manual review; assess precision (branch labels: ≥95%, <95%)
- Deploy in BioVU
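The review gate in this flowchart can be sketched in a few lines. The sketch assumes that the ≥95% branch leads to deployment and the <95% branch loops back to algorithm refinement; the flattened slide suggests this but does not state it. The function names and the sample review results are hypothetical.

```python
def precision(review_results: list[bool]) -> float:
    """Fraction of algorithm-selected cases that manual review confirmed.

    Each entry is True if the reviewer agreed the record is a true case.
    """
    if not review_results:
        raise ValueError("no reviewed records")
    return sum(review_results) / len(review_results)


def next_step(review_results: list[bool], threshold: float = 0.95) -> str:
    """Return 'deploy' when precision meets the threshold, else 'refine'."""
    return "deploy" if precision(review_results) >= threshold else "refine"


# Hypothetical review of 20 records flagged by a case algorithm.
first_pass = [True] * 18 + [False] * 2   # 18 of 20 confirmed
second_pass = [True] * 19 + [False] * 1  # 19 of 20 confirmed

print(next_step(first_pass))
print(next_step(second_pass))
```

Precision here is the same quantity as the PPV of the search methodology: confirmed cases divided by all records the algorithm flagged.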

17 Phenotype Algorithm Development
- Definition of phenotype for cases and controls is critical; may require consultation with experts
- A basic understanding of data elements, and of the uses and limitations of particular data points, is important
- Reviewing records manually to make a case determination (or even to calculate the PPV of a search methodology) will be somewhat time consuming

18 Once you have logged in… Your Dashboard
- A welcome and announcement section gives the investigator any immediate information or help when accessing the SD
- Projects and sets are found on the left-hand side
- On the dashboard, add project teams to sets you have created
- Overall SD/BioVU population demographics give up-to-date population details of the resource

19 Drag and Drop Search for Clinical Features
- Same interface as the Record Counter
- Can create complex logic statements with OR, AND, and NOT
- Can limit the search to look only at subjects in BioVU
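The OR, AND, and NOT logic behind such a search can be pictured as set operations over subject identifiers. The sketch below is only an analogy and does not show how the SD interface works internally; every criterion name and subject ID in it is made up.

```python
# Hypothetical sets of subject IDs matching each search criterion.
has_icd9_code = {"A", "B", "C"}
has_keyword = {"C", "D"}
age_over_50 = {"A", "C", "D", "E"}
excluded_condition = {"D"}
in_biovu = {"A", "D"}

# (ICD9 code OR keyword) AND age over 50
candidates = (has_icd9_code | has_keyword) & age_over_50

# ... AND NOT excluded condition
cohort = candidates - excluded_condition

# Limit the result to subjects who also have a BioVU sample.
biovu_cohort = cohort & in_biovu

print(sorted(cohort))
print(sorted(biovu_cohort))
```

Set union, intersection, and difference correspond to OR, AND, and AND NOT respectively, and the BioVU restriction is just one more intersection.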

20 User-friendly Record Review Interface
- Subjects listed on the left-hand side
- Filter and search functionality
- Status designation

21 Data Visualization Features
In the Summary tab and in the Vitals view, the new SD has data visualization features that give a reviewer a quick view of a subject's longitudinal data.

22 Easy Search and Filtering for Document Review

23 Export Data
- Detailed data to text files
- Demographics and annotations to REDCap

24 New Directions…
- Plasma in BioVU: a pilot project is underway to establish a program to bank plasma in the areas of biomarker discovery (heart failure), antibody therapy (breast cancer), and medication adherence (resistant hypertension).
- PathLink: a tissue repository that will collect and store leftover tissues obtained during the course of standard medical care. Tissue samples and data will be linked to other clinical databases and BioVU.
- ImageVU: linking images such as MRIs and PET scans to the RD and SD.
- Additional data sources…

25 SD Access Protocol
- Researcher requests IRB exemption
- Signs DUA
- Enters StarBRITE to complete electronic application (IRB status is in StarBRITE)
- SD staff verify; access granted
- Researcher accesses SD

26 Questions or Comments?
SD Help Sessions are held the second and fourth Wednesday of each month at 1 pm. All are welcome.
Time: 1:00-2:00 PM
Location (2nd Wed): 2525 West End, 600 conference room
Location (4th Wed): Light Hall, Room 437
If you have any questions or feedback about the SD, please email Jacqueline.Kirby@Vanderbilt.edu

