Published by Ramona Dumitrescu. Modified over 5 years ago.
1
Working Group 6
Criteria for Repository Inclusion: Standards, Interoperability, Sustainability, etc.
First teleconference/web session, Dec 11, 2015
2
Agenda
1. Introductions
2. Brief review of bioCADDIE
3. Goals for this group
   Timeline
   Discuss and check for understanding
4. Repositories/sources already indexed
5. The list of potential repositories/sources
6. Reactions, comments
3
WG6 – Current Membership
George Alter - ICPSR, University of Michigan
Dianne Babski - National Library of Medicine
Tanya Barrett - NCBI (GEO, BioSample, BioProject), GA4GH
Kei Cheung - Yale University
Tim Clark - Harvard Medical School, FORCE11 Data Citation Implementation Group
Larry Clarke - National Cancer Institute (NCI)
Ian Fore - NIH
Marek Grabowski - University of Virginia
Jeffrey Grethe - University of California San Diego
Chelsea Ju - University of California Los Angeles
Chirag Lakhani - Harvard Medical School
Matthew McAuliffe - Center for Information Technology NIH
Neil McKenna - Baylor College of Medicine
Lucila Ohno-Machado - University of California San Diego
Thomas Radman - NIH
Jim Rehg - Georgia Institute of Technology
Susanna-Assunta Sansone - University of Oxford and Nature Publishing Group
Alisa Surkis - New York University School of Medicine
Griffin Weber - Harvard Medical School
Justin Wood - University of California Los Angeles
John Yates - The Scripps Research Institute
Wenchao Yu - University of California Los Angeles
2/17/2019
Supported by the NIH grant #xxxxxxxxx to the University of California, San Diego
4
Introductory Logistics
bioCADDIE Web Site
  White paper: under Resources
  Working Groups menu: Working Group 6
  Or Google “biocaddie wg6”!
5
bioCADDIE – Working Groups location on the web site
6
User Interface Prototype
UI webpage address: datamed.biocaddie.org
User name: biocaddie
Password: biocaddie
7
WG6 - Goals

GOAL
  Obtain consensus from multiple NIH representatives on which data sets NIH wants to see indexed by bioCADDIE
  Determine which features (criteria) these sets have in common for future selection of data sets
  Determine process of review of criteria for newly proposed datasets

ACTIVITIES/RESPONSIBILITIES
  Assemble an authoritative group of a minimum of 4 NIH officers and 4 bioCADDIE executive committee members to discuss criteria to select repositories using the DDI prototype.
  Decide which repositories will be used for the prototype

DELIVERABLES
  Recommended metadata for data inclusion
  Contact information for NIH-selected data sets
  Standard for persistence and preservation
  Investigation of access requirements
8
Balancing Act for Criteria
What researchers/repositories can provide?
Which criteria program officers will endorse?
Metadata quality
9
Criteria for Inclusion in the Initial Prototype
Key data resources used by the community
  Aligns with concept of the Commons
  Pilots
    Cancer Genomics Cloud Pilots and Genomic Data Commons
    Human Microbiome Project
    Model organism databases
Facilitate development of indexing methodology
Ensure broad coverage of types of data
  Examples not yet represented
    Clinical data, Imaging data
“The long tail”?
  The hardest to find
  The Variety component of big data

There is good convergence between the bioCADDIE emphasis on indexing highly accessed datasets and the idea of the Commons. The Cancer Genomics Cloud Pilots and the Genomic Data Commons both seek to make available data that are highly used by cancer researchers. These will not be in the credits-model-funded cloud, but they are part of a broader Commons.
10
NIH representation on WG6
Represent NIH repositories
  Intramural and extramural
  Facilitate collaborations
Represent NIH program staff
  Criteria they are willing to endorse in their programs
11
Background relevant to WG6
Metadata specification
  Can repositories provide this?
Identifiers
  A non-prescriptive approach
  Does it work for repositories?
Core Development work
  Pipeline for reading data from repositories
Material on all the above on website
  How do we supplement this?
12
Repositories already in progress
Source | Status
PDB, GEO | Stable
BioProject, ArrayExpress, dbGaP, GEMMA | Ongoing
Library of Integrated Network-based Cellular Signatures (LINCS) program | Reviewing API details for structure
Inter-university Consortium for Political and Social Research (ICPSR) | Reviewing sample file for structure
Jeff to speak to
13
Future
The next 10 repositories
In Google Docs
The overall list of repositories
14
Thank you! Questions?