Published by Tracy Earl Carter. Modified over 9 years ago.
1
George A. Komatsoulis, Ph.D.
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
U.S. Department of Health and Human Services
3
The Commons (diagram labels): Digital Objects (with identifiers); Search (Indexed Metadata and API); Computing Platform; Open APIs; Software Encapsulation
4
The Commons (diagram labels): Digital Objects (with identifiers); Search (Indexed Metadata and API); Computing Platform; Commons Federation (Infrastructure); BD2K Centers; DDICC (Search); Existing Resources; Indexes; Methods; Content
5
Diagram labels: Commons Federation (Infrastructure); BD2K Centers; DDICC (Search); Existing Resources; Indexes; Methods; Content; Works In; Searches
6
Commons Federation (Infrastructure)
7
Commons
  Implemented as a federation of ‘conformant’ cloud providers and HPC environments
  Funded primarily by providing credits to investigators
8
Cost effective - Only pay for IT support used
Drives competition - Better services at lower cost
Supports data sharing by driving science into the Commons
Facilitates public-private partnership
Scalable to most categories of data expected in the next 5 years
9
Novelty:
  Never been tried, so we don’t have data about likelihood of success
Cost Models:
  Predicated on stable or declining prices among providers
  True for the last several years, but we can’t guarantee that it will continue, particularly if there is significant consolidation in industry
Service Providers:
  Predicated on service providers willing to make the investment to become conformant
  Market research suggests 3-5 providers within 2-3 months of program launch
Persistence:
  The model is ‘Pay As You Go’, which means if you stop paying it stops going
  Giving investigators an unprecedented level of control over what lives (or dies) in the Commons
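The ‘Pay As You Go’ persistence point can be made concrete with a small calculation. This is only a sketch: the function name, the credit balance, and the per-terabyte price are hypothetical illustrations, not figures from the presentation or from any provider.

```python
def months_of_persistence(credits: float, stored_tb: float,
                          price_per_tb_month: float) -> float:
    """Months a fixed credit balance keeps a dataset stored.

    All inputs are hypothetical; the slides do not give prices or credit sizes.
    """
    if stored_tb <= 0 or price_per_tb_month <= 0:
        raise ValueError("stored_tb and price_per_tb_month must be positive")
    return credits / (stored_tb * price_per_tb_month)


# Illustrative numbers only: 1200 credits, 10 TB stored.
baseline = months_of_persistence(1200.0, 10.0, 20.0)      # assumed price: 20 per TB-month
after_increase = months_of_persistence(1200.0, 10.0, 25.0)  # same credits, higher price
print(baseline, after_increase)
```

With the same credits, a higher price shortens how long the data persists, which is why the cost model depends on stable or declining prices.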
10
Minimum set of requirements for
  Business relationships (reseller, investigators)
  Interfaces (upload, download, manage, compute)
  Capacity (storage, compute)
  Networking and Connectivity
  Information Assurance
  Authentication and authorization
Conformance is likely to be established by reviewed self-certification in the pilot phase
A conformant cloud ≠ an IaaS provider
11
Likely to evolve into multiple ‘Levels of Compliance’ corresponding to increasing degrees of making data/software meet ‘FAIR’ criteria.
Some of our current thinking for basic compliance
  Objects are physically or logically available in the Commons
  Objects are indexed with a usable identifier
  Objects have basic search metadata attached to index entries
  Objects have clear access rules
  Objects have basic semantic metadata available
Higher levels could include
  Objects indexed with standards-based identifiers (ORCID, DOI, etc.)
  Objects are open to the public (or as open as reasonable given data type)
  Objects conform to agreed-upon standards (CDISC, DICOM, etc.)
  Data objects are accessible via standard APIs
  Software is encapsulated (containers, other technology) for easier usage
We want and need your feedback on these matters!
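One way to read the basic and higher-level criteria above is as a checklist evaluated per object. The sketch below assumes that reading; the class, field names, and summary format are hypothetical and are not part of any Commons specification.

```python
from dataclasses import dataclass


@dataclass
class CommonsObject:
    """Hypothetical record: one boolean per criterion listed on the slide."""
    # Basic compliance criteria
    available_in_commons: bool      # physically or logically available
    has_usable_identifier: bool     # indexed with a usable identifier
    has_search_metadata: bool       # basic search metadata on index entries
    has_clear_access_rules: bool
    has_semantic_metadata: bool     # basic semantic metadata available
    # Possible higher-level criteria
    standards_based_identifier: bool = False   # e.g. DOI
    open_to_public: bool = False
    conforms_to_standards: bool = False        # e.g. CDISC, DICOM
    standard_api_access: bool = False          # data objects
    software_encapsulated: bool = False        # software objects


BASIC = ["available_in_commons", "has_usable_identifier", "has_search_metadata",
         "has_clear_access_rules", "has_semantic_metadata"]
HIGHER = ["standards_based_identifier", "open_to_public", "conforms_to_standards",
          "standard_api_access", "software_encapsulated"]


def compliance_summary(obj: CommonsObject) -> dict:
    """Report which basic criteria are missing and which higher ones are met."""
    missing_basic = [name for name in BASIC if not getattr(obj, name)]
    return {
        "basic_compliant": not missing_basic,
        "missing_basic": missing_basic,
        "higher_criteria_met": [name for name in HIGHER if getattr(obj, name)],
    }


example = CommonsObject(True, True, True, True, False,
                        standards_based_identifier=True)
print(compliance_summary(example))
```

Keeping the criteria as plain lists makes it easy to regroup them if the ‘Levels of Compliance’ end up being drawn differently.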
12
Phase 0: Build the plumbing
Phase 1: Pilot the model on a small number of investigators experienced with cloud computing, probably within the context of BD2K awards
Phase 2: Open the Commons credit process to grantees from a subset of NIH Institutes and Centers
Phase 3: Open the process to all NIH grantees
15
Approved March 23, 2015
“In light of the advances made in security protocols for cloud computing in the past several years and given the expansion in the volume and complexity of genomic data generated by the research community, the National Institutes of Health (NIH) is now allowing investigators to request permission to transfer controlled-access genomic and associated phenotypic data obtained from NIH-designated data repositories under the auspices of the NIH Genomic Data Sharing (GDS) Policy to public or private cloud systems for data storage and analysis.”
Responsibility for ensuring the security and integrity of the data remains with the institution.
17
Timeline axis: 1960, 1970, 1980, 1990, 2000, 2010, 2020
18
Sensor Stream = 500 EB/day; stores 69 TB/day
Collection = 14 EB/day; stores 1 PB/day
Total Data = 14 PB; stores an average of 3.3 TB/day for 10 years!
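The rates above can be checked with simple arithmetic. The sketch assumes decimal units (1 PB = 1,000 TB, 1 EB = 1,000,000 TB) and 365-day years; the helper names are illustrative. At 3.3 TB/day, ten years comes to roughly 12 PB, the same order as the 14 PB total quoted; the slide does not say whether the difference comes from rounding or a longer time span.

```python
TB_PER_PB = 1_000        # decimal units assumed
TB_PER_EB = 1_000_000
DAYS_PER_YEAR = 365


def total_pb(tb_per_day: float, years: float) -> float:
    """Total volume in PB accumulated at a constant daily rate."""
    return tb_per_day * DAYS_PER_YEAR * years / TB_PER_PB


def stored_fraction(stored_tb_per_day: float, generated_eb_per_day: float) -> float:
    """Fraction of the generated data stream that is actually kept."""
    return stored_tb_per_day / (generated_eb_per_day * TB_PER_EB)


print(f"3.3 TB/day for 10 years: {total_pb(3.3, 10):.1f} PB")
print(f"69 TB/day kept of 500 EB/day: {stored_fraction(69, 500):.2e}")
print(f"1 PB/day kept of 14 EB/day: {stored_fraction(1_000, 14):.2e}")
```

In both streaming cases only a tiny fraction of the generated data is stored, which is the contrast the slide draws with a repository that keeps everything it receives.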
20
NIH Office of ADDS: Vivien Bonazzi, Ph.D.; Philip Bourne, Ph.D.; Michelle Dunn, Ph.D.; Mark Guyer, Ph.D.; Jennie Larkin, Ph.D.; Leigh Finnegan; Beth Russell
NCBI: Dennis Benson, Ph.D.; Alan Graeff; David Lipman, MD; Jim Ostell, Ph.D.; Don Preuss; Steve Sherry