Making Data from NIAAA Funded Grants

Presentation transcript:

Making Data from NIAAA Funded Grants Available to the Research Community
Greg Farber
Office of Technology Development and Coordination
National Institute of Mental Health

Why Do We Care About Making Data Available?
Understanding the biological basis of human disease is a very hard problem, and we are NOT making progress quickly enough.
There are two reasons why this is hard:
- the underlying biology is complex
- individual variation is surprisingly large
For “simple diseases” (genomic diseases with high penetrance, many diseases caused by an infectious agent), big data is not terribly useful. In these cases, we “just” need to deal with the biology.
Most of what we deal with today are complex diseases where the individual variation and the environment are important components. In these cases, we need to both understand the biology and have data from many individuals to understand individual variation and the number of “sub-groups” in a population.

Many Components of NIH are Trying to Make Data from Human Subjects Available
- All of Us Research Program (formerly Precision Medicine)
- NHLBI makes data from its clinical trials available via BioLINCC.
- NIDA also makes data from the clinical trials it has funded available through the NIDA Data Share web site.
- The NIH Genomic Data Sharing policy expects NIH funded investigators to submit data to an appropriate repository.
- The new 21st Century Cures Act seems to give the NIH director the authority to require data sharing.

NIMH Data Archive
NIMH has created a data infrastructure to hold data from experiments involving human subjects.
That infrastructure now holds data from nearly 600 NIH funded awards as well as data supported by other funding agencies.
Data types include:
- Clinical assessments
- Imaging and other “complex” data (eye tracking, EEG, PET…)
- Genomics data in the area of autism
The data infrastructure has matured to the point where it is now possible to expand to areas like substance use.

Non-NIH Groups Using the NDA to Store Data

A Brief History
The National Database for Autism Research (NDAR) was started in late 2006, and the first data was received in 2008.
NIMH recently decided to expand NDAR to include data from:
- Clinical Trials (NOT-MH-14-005)
- The Research Domain Criteria (RDoC) Initiative (NOT-MH-15-012)
- The Adolescent Brain Cognitive Development Study
All of the data is part of a single database (NIMH Data Archive, NDA) with branded web locations (https://data-archive.nimh.nih.gov/).

NIH/NIMH Data Archives Staff

NDA Overview
- NDA is a federal data repository.
- The NDA only contains data from human subjects.
- We have the ability to manage data with different types of consent, but NIMH sites contain data that is broadly consented for use by the research community.
- NIMH data are available to the research community through a not-too-difficult application process that involves a data access committee (we currently support 4 independent DACs).
- Summary data are available to everyone with a browser.
- The data types include demographic data, clinical assessments, imaging, –omic data, and other complex data types (EEG…).
- We currently share data from nearly 130,000 subjects with the research community.
- ~800TB of imaging, –omic, and other complex experimental data is secured in the Amazon cloud.

NDA Implementation
NDA has deep federation with the following data repositories. This federation allows NDA to query data in those repositories and to return data to the user from multiple repositories simultaneously.
- Autism Tissue Program
- Autism Genetic Resource Exchange
- Interactive Autism Network
- Simons Foundation Autism Research Initiative
- Ontario Brain Institute
NDA has two key features to allow data standardization and aggregation: data dictionaries and the Global Unique Identifier (GUID).
Generally, NIMH funded investigators are expected to share their data via NDA. Investigators with funding from other sources are also welcome to deposit their data.
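
The federation idea on this slide (one query, answers from several repositories at once) can be sketched in a few lines. This is only an illustration under stated assumptions: the repository names, the record fields, and the functions `make_repository` and `federated_query` are hypothetical and do not describe how NDA actually talks to its partner repositories.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
SearchFn = Callable[[str], List[Record]]


def make_repository(rows: List[Record]) -> SearchFn:
    """Return a search function over in-memory rows (a stand-in for a remote repository)."""
    def search(guid: str) -> List[Record]:
        return [row for row in rows if row["guid"] == guid]
    return search


def federated_query(repositories: Dict[str, SearchFn], guid: str) -> List[Record]:
    """Ask every repository about one subject and tag each hit with its source."""
    results: List[Record] = []
    for name, search in repositories.items():
        for row in search(guid):
            results.append({**row, "source": name})
    return results


# Hypothetical repositories and records, for illustration only.
repositories = {
    "repo_a": make_repository([{"guid": "GUID_0001", "measure": "assessment_x"}]),
    "repo_b": make_repository([
        {"guid": "GUID_0001", "measure": "genotype_y"},
        {"guid": "GUID_0002", "measure": "genotype_y"},
    ]),
}

hits = federated_query(repositories, "GUID_0001")
for hit in hits:
    print(hit["source"], hit["measure"])
```

The sketch only works because both toy repositories key their records on the same subject identifier, which is the role the slide assigns to the GUID.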

NDA Structure It is best to think of NDA as a large (~130,000 research participants x ~130,000 data dictionary elements), sparse, two dimensional matrix.
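
To make the "sparse matrix" picture concrete, here is a minimal sketch that stores only the cells that actually hold a value. The class `SparseSubjectMatrix`, the subject identifiers, and the element names are assumptions made for illustration. The slide does not say how NDA stores its data.

```python
from collections import defaultdict
from typing import Dict, Optional


class SparseSubjectMatrix:
    """Subjects x data elements, keeping only the cells that were filled in."""

    def __init__(self) -> None:
        # Outer key: subject identifier; inner key: data dictionary element.
        self._cells: Dict[str, Dict[str, str]] = defaultdict(dict)

    def set(self, subject: str, element: str, value: str) -> None:
        self._cells[subject][element] = value

    def get(self, subject: str, element: str) -> Optional[str]:
        # Missing cells return None; nothing is stored for them.
        return self._cells.get(subject, {}).get(element)

    def filled_cells(self) -> int:
        return sum(len(row) for row in self._cells.values())

    def density(self, n_subjects: int, n_elements: int) -> float:
        """Fraction of the full n_subjects x n_elements grid that holds a value."""
        return self.filled_cells() / (n_subjects * n_elements)


# Hypothetical subjects and elements.
matrix = SparseSubjectMatrix()
matrix.set("GUID_0001", "age_months", "212")
matrix.set("GUID_0001", "score_total", "17")
matrix.set("GUID_0002", "score_total", "23")

print(matrix.get("GUID_0002", "age_months"))  # this cell was never filled
print(matrix.density(130_000, 130_000))
```

Any one subject answers only a small share of the ~130,000 possible elements, so storing filled cells rather than the full grid keeps the structure small.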

Data Dictionary – The First Building Block
The NDA data dictionary is one of the key building blocks for this repository. It provides a flexible and extensible framework for data definition by the research community.
- 1500+ data collection instruments, freely available to anyone
- 130,000+ unique data elements (“questions”)
- Data collection instruments are defined by the research community with assistance from NDA staff:
  - Clinical
  - Genomics/Proteomics
  - MRI Modalities
  - Other complex data (EEG, Eye Tracking)
- Accommodates any data type and data structure
- Curated by NDA Staff
- Allows investigators to quickly perform quality control tests of their data without submitting data anywhere.
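
The last point, checking data locally against a data dictionary before anything is submitted, might look roughly like the sketch below. The dictionary layout, the element names (`subject_id`, `age_months`, `sex`, `score_total`), and the function `validate_record` are all hypothetical. The actual NDA dictionary format and validation tool are not described on the slide.

```python
from typing import Any, Dict, List

# Hypothetical data dictionary: element name -> rules for that element.
DATA_DICTIONARY: Dict[str, Dict[str, Any]] = {
    "subject_id": {"type": str, "required": True},
    "age_months": {"type": int, "required": True, "min": 0, "max": 1440},
    "sex": {"type": str, "required": True, "allowed": {"M", "F"}},
    "score_total": {"type": int, "required": False, "min": 0, "max": 100},
}


def validate_record(record: Dict[str, Any], dictionary: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    errors: List[str] = []
    for name, rule in dictionary.items():
        value = record.get(name)
        if value is None:
            if rule.get("required", False):
                errors.append(f"{name}: missing required value")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: {value} is below the minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: {value} is above the maximum {rule['max']}")
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: {value!r} is not an allowed value")
    # Elements the dictionary does not define are also reported.
    for name in record:
        if name not in dictionary:
            errors.append(f"{name}: not defined in the data dictionary")
    return errors


good = {"subject_id": "S1", "age_months": 212, "sex": "F", "score_total": 17}
bad = {"subject_id": "S2", "age_months": -3, "sex": "X", "extra": 1}

print(validate_record(good, DATA_DICTIONARY))
for problem in validate_record(bad, DATA_DICTIONARY):
    print(problem)
```

Because the check needs only the dictionary and the record, it can run entirely on the investigator's machine, which matches the slide's point about testing without submitting data anywhere.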

Data Dictionary List (1500+ Measures)

Inside a Data Dictionary

Data Inspection – Available to All

Global Unique Identifier – the Other Building Block
- The NDA GUID software allows any researcher to generate a unique identifier using some information from a birth certificate.
- If the same information is entered in different laboratories, the same GUID will be generated.
- This strategy allows NDA to aggregate data on the same subject collected in multiple laboratories without holding any of the personally identifiable information about that subject.
- NDA also assigns unique identifiers that do not allow data aggregation (pseudo-GUID) in cases where the GUID could not be generated.
- The GUID is now being used in other research communities (see http://www.youtube.com/watch?v=Tb6euCVoous).
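
The key property described here (the same inputs produce the same identifier in every laboratory, and the repository never receives the inputs themselves) can be illustrated with a one-way hash. This is not the NDA GUID algorithm, which the slide does not specify. The choice of fields, the normalization rules, the `DEMO_` prefix, and the function names are assumptions for illustration only.

```python
import hashlib
import unicodedata


def normalize(value: str) -> str:
    """Drop case, spacing, punctuation, and accents so trivial entry differences do not matter."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def illustrative_guid(first_name: str, last_name: str, date_of_birth: str, city_of_birth: str) -> str:
    """Derive a deterministic identifier from identifying fields (hypothetical field choice)."""
    fields = [normalize(f) for f in (first_name, last_name, date_of_birth, city_of_birth)]
    digest = hashlib.sha256("|".join(fields).encode("utf-8")).hexdigest()
    return "DEMO_" + digest[:12].upper()


# Two laboratories enter the same (made-up) person with different formatting.
lab_one = illustrative_guid("Ann", "Example", "1990-01-02", "Springfield")
lab_two = illustrative_guid(" ann ", "EXAMPLE", "1990-01-02", "springfield")
other = illustrative_guid("Bob", "Example", "1990-01-02", "Springfield")

print(lab_one == lab_two)
print(lab_one == other)
```

Only the derived identifier would be shared in this sketch. A plain hash over a few low-entropy fields can be guessed by brute force, so a production system needs stronger protection than is shown here.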

General Query – IAN Example – GUID Works

Query for Data by Laboratory/Award

Example of Information from a Particular Lab

Retrieve Data Associated with a Paper

A “Study” – Data Associated with a Publication

By Concept/Phenotype
Results in 1,061 subjects being discovered.

Existing Substance Use Data
Not surprisingly, a number of NIMH funded clinical trials and some clinical research studies have data related to substance use.
- Addiction Severity Index, N=65
- Fagerstrom, N=1,034
- Peer Substance Use, N=1,028
- Peer Tolerance of Substance Use, N=1,029
- Smoking History Questionnaire, N=194
- Substance Abuse Disorders Log, N=196
- Substance Use Monthly Form, N=404
- Substance Use Questionnaire, N=2,055
- Substance Use Survey, N=322
We expect that some NIAAA funded researchers will have collected data related to mental health.

How Do NIMH Users Deposit Data?
- At the start of the award, a data submission agreement is signed and the data archive creates the data dictionaries that the user will need to submit data.
- Every 6 months, data are submitted to the data archive. At that time, data are checked by the validation tool to make sure they conform to the data dictionary.
- Submitting data is separate from releasing data to the research community (sharing). Sharing happens once a paper is published or one year after the completion of the grant.
- We have created a cost estimator and we ask our awardees to request data sharing costs when they submit applications. The costs are generally modest.

Researchers ARE using the NIMH Data Archive
[Charts: Number of Registered Users; Number of Collections]

Advantage to NIAAA and Funded Investigators
NIAAA can use the existing infrastructure to get more out of data.
- Secondary data analysis
- Combination of multiple related data sets
- Increasing confidence in conclusions using data measured by a different group
The quality of the data will increase.
The research community can work with NIAAA to establish minimal common data elements, which will enhance the ability to merge/compare data from different laboratories.