Data the NIH What is Happening & What is Coming A Conversation Philip E. Bourne, PhD, FACMI Associate Director for Data Science National Institutes.

Slides:



Advertisements
Similar presentations
Data, Data Everywhere, But Not a Byte to Eat Michael F. Huerta, Ph.D. Associate Director, National Library of Medicine Director, Office of Health Information.
Advertisements

ACCELERATING CLINICAL AND TRANSLATIONAL RESEARCH Anantha Shekhar MD, PhD Indiana University School of Medicine Indiana Clinical and.
Transformed (transforming) Health Care System Connie White Delaney, PhD, RN, FAAN, FACMI School of Nursing Professor & Dean Academic Health Center Director,
George A. Komatsoulis, Ph.D. National Center for Biotechnology Information National Library of Medicine National Institutes of Health U.S. Department of.
© 2007 IBM Corporation © 2009 IBM Corporation 1 Tran Viet Huan, PhD CTO, IBM Vietnam IBM Research Global Technology Outlook.
NIH Behavioral and Social Sciences Research: A Snapshot from OBSSR Deborah H. Olster, PhD Deputy Director Office of Behavioral and Social Sciences Research.
Vivien Bonazzi Ph.D. Program Director: Computational Biology (NHGRI) Co Chair Software Methods & Systems (BD2K) Biomedical Big Data Initiative (BD2K)
Science as an Open Enterprise: Open Data for Open Science Professor Brian Collins CB, FREng UCL, June 2012 Emerging conclusions from a Royal Society Policy.
The NIH Roadmap for Medical Research
1 Building National Cyberinfrastructure Alan Blatecky Office of Cyberinfrastructure EPSCoR Meeting May 21,
Directorate of e-Government1 e- Government Strategy for Kenya. By Peter Gakunu, Cabinet Office 23 rd March 2004.
Bill Newhouse Program Lead National Initiative for Cybersecurity Education Cybersecurity R&D Coordination National Institute of Standards and Technology.
Scientific Data as Research Infrastructure: The Biomedical Sciences Strategies for Economic Sustainability of Publicly Funded Data Repositories Board on.
Johns Hopkins Technology Transfer 1 Translational Biomedical Research: Moving Discovery from Academic Centers to the Community Translational Biomedical.
Science for Global Health: Fostering International Collaboration Norka Ruiz Bravo,PhD Special Advisor to the Director National Institutes of Health U.S.
Technology Update Barrington School Committee December 15, 2011.
Presenter: Karla Strieb Assistant Executive Director Transforming Research Libraries June 3, 2010 Supporting E-science: Progress at Research Institutions.
Overview: FY12 Strategic Communications Plan Meredith Fisher Director, Administration and Communication.
Recognition Of Team Science Faculty Appointments, Promotions and Titles at The Geisel School of Medicine at Dartmouth “Recognition by peers as an investigator.
Data! Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health.
National Institutes of Health U.S. Department of Health and Human Services Scale, Diversity, and Complexity: Tackling Data Challenges in the Superfund.
Managing Data: The Long View FORCE15 – 12 January 2015 Amy Friedlander, Ph.D.
SCIENCE, RESEARCH DATA, AND PUBLISHING Stewart Wills Editorial Director, Web & New Media, Science 26 February 2013.
National Science Foundation 1 Evaluating the EHR Portfolio Judith A. Ramaley Assistant Director Education and Human Resources.
Big Data to Knowledge (BD2K) Jennie Larkin, Ph.D. NIH RDA P5 March 10,2015.
Data Science at the NIH Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health.
NIH Big Data to Knowledge (BD2K) March 4, 2014 Peter Lyster National Institute of General Medical Sciences (NIGMS) NIH.
NIH Activities Related to Big Data Jerry Sheehan Assistant Director for Policy Development National Library of Medicine Board on Research Data and Information.
U.S. Department of the Interior U.S. Geological Survey A vision for a global community Linda Gundersen Director Science Quality and Integrity US Geological.
-- Don Preuss NCBI/NLM/NIH
8 October 2009Microbial Research Commons1 Toward a biomedical research commons: A view from NLM-NIH Jerry Sheehan Assistant Director for Policy Development.
Nature Reviews/2012. Next-Generation Sequencing (NGS): Data Generation NGS will generate more broadly applicable data for various novel functional assays.
National Center for Research Resources NATIONAL INSTITUTES OF HEALTH T r a n s l a t I n g r e s e a r c h f r o m b a s i c d i s c o v e r y t o i m.
Symposium on Global Scientific Data Infrastructures Panel Two: Stakeholder Communities in the DWF Ann Wolpert, Massachusetts Institute of Technology Board.
Funding: Staffing for Research Computing What staffing models does your institution use for research computing? How does your institution pay for the staffing.
ARL Workshop on New Collaborative Relationships: The Role of Academic Libraries in the Digital Data Universe September 26-27, 2006 ARL Prue.
| nectar.org.au NECTAR TRAINING Module 2 Virtual Laboratories and eResearch Tools.
Why Write A Grant? Elaine M. Hylek, MD, MPH Professor of Medicine Associate Director, Education and Training Division BU CTSI Section of General Internal.
MD2K is an NIH Big Data to Knowledge (BD2K) Center of Excellence. Visit MD2K: Center of Excellence for Mobile Sensor Data-to-Knowledge.
System Development & Operations NSF DataNet site visit to MIT February 8, /8/20101NSF Site Visit to MIT DataSpace DataSpace.
MPS Workshop 1: Gauging the Impact of Requirements for Public Access to Data November 19, 2015 Jennie Larkin, Ph.D. Office of the Associate Director for.
1 Why is Digital Curation Important for Workforce and Economic Development? Alan Blatecky Office of Cyberinfrastructure Symposium on Digital Curation in.
Biomedical Research, Health Disparities and the Digital Divide: Programs at the NCRR Susan R. Kayar, Ph.D. Division of Research Infrastructure.
NIH BioCADDIE / Force11 Data Citation Pilot Kickoff Meeting Nine Zero Hotel, Boston MA, 3 February 2016 Introduction: Tim Clark, Maryann Martone and Joan.
NIH: DATA SCIENCE & BD2K Jennie Larkin, PhD Senior Advisor, Extramural Programs and Strategic Planning Office of the Associate Director for Data Science,
High Risk 1. Ensure productive use of GRID computing through participation of biologists to shape the development of the GRID. 2. Develop user-friendly.
Research Data Lifecycle Management Workshop Report Curt Hillegas 9/8/2011.
Data NIH Philip E. Bourne, PhD Associate Director for Data Science National Institutes of Health Big Data Symposium, Lincoln,
The Vision for the NIH Philip E. Bourne, PhD, FACMI Associate Director for Data Science National Institutes of Health Bio-IT World, Boston April.
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No EUDAT Aalto Data.
David M. Murray, Ph.D. Associate Director for Prevention Director, Office of Disease Prevention Multilevel Intervention Research Methodology September.
National Institutes of Health U.S. Department of Health and Human Services Planning for a Team Science Evaluation ∞ NIEHS: Children’s Health Exposure Analysis.
Science for Global Health: Fostering International Research Collaboration James Herrington, PhD, MPH Director Division of International Relations Fogarty.
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No Herbadrop.
The NIH Data Commons: A Cloud-based Training Environment Philip E. Bourne, Ph.D. FACMI Associate Director for Data Science National Institutes of Health.
Enhancements to Galaxy for delivering on NIH Commons
NIH – A Vision Through 2020 Philip E. Bourne, PhD, FACMI Associate Director for Data Science
To develop the scientific evidence base that will lessen the burden of cancer in the United States and around the world. NCI Mission Key message:
Jennie Larkin, PhD Senior Advisor
Tools and Services Workshop
Joslynn Lee – Data Science Educator
NLM: Meeting Challenges & Seizing Opportunities in & with Big Data
Evaluation of NCI Research Resources
Computer Science Department, University of Missouri, Columbia
Making “Open Data” Work: Challenges for Data Integration in Genomics Research
UQ Resources Forum University of Queensland’s Sustainable Minerals Institute Professor Neville Plint Director.
Preprints and Other Interim Research Products NIH perspectives
Update from the National Institutes of Health (NIH)
Digital Policy -Transformation Towards Society 5.0-
Computer Services Business challenge
Presentation transcript:

Data the NIH What is Happening & What is Coming A Conversation Philip E. Bourne, PhD, FACMI Associate Director for Data Science National Institutes of Health March 31, 2015

This is Just the Beginning  Evidence: –Google car –3D printers –Waze –Robotics –Sensors From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee

Addressing the Opportunities & Challenges 6/12 Findings: Sharing data & software through catalogs Support methods and applications development Need more training Need campus-wide IT strategy Hire CSIO Continued support throughout the lifecycle 2/14 3/14

What Have I Learned Thus Far? ….  Working with the full spectrum of data types is challenging – “Xtreme translation”  A large ship takes a long time to stop and turn, but a great crew helps  That crew is in places I was not used to  There are complexities I could not have imagined going in based on the funding ecosystem

What Have I Learned Thus Far?  Policies take time when they come from the bottom up, but they may work are i.e. implemented and adhered to  Policies from the top down can be problematic  What you set out to do is often not what you end up doing e.g. precision medicine, “NLM rethink”  This is just the beginning …

Additional NIH Disruptors …

Early Findings  Bad News –We do not yet have a data sustainability plan –Global policies define the why but not the how –We do not know how all the data we currently have are used –We need to ramp up training programs in data science  Good news –Genuine willingness across the IC’s to address the problems –Global communities are emerging and should be nurtured –We are beginning to define & quantify the issues e.g. reproducibility –Disruptors accelerate change

Office of Biomedical Data Science Mission Statement To foster an open ecosystem that enables biomedical research to be conducted as a digital enterprise that enhances health, lengthens life and reduces illness and disability & to train the next generation of data scientists Goals expanded from recommendations in the June 2012 DIWG and BRWWG reports.

The BD2K Program is Central to the Mission Planned – Black; Available- Green

Elements of The Digital Enterprise Communities Policies Infrastructure Intersection: Sustainability Efficiency Collaboration Training

Elements of The Digital Enterprise Communities Policies Infrastructure Intersection: Sustainability Efficiency Collaboration Training Virtuous Research Cycle

Consider an example…

 Big Data: The study involved MRI images & GWAS data from over 30,000 people  Collaboration: Data came from many different sights affiliated with the ENIGMA consortium  Methods: To homogenize data from different sites, the group designed standardized protocols for image analysis, quality assessment, genetic imputation, and association  Found five novel genetic variants  Results provided insight into the variability of brain development, and may be applied to study of neuropsychiatric dysfunction

Policies: Now & Forthcoming  Data Sharing –Genomic data sharing announced –Data sharing plans on all research awards –Data sharing plan enforcement Machine readable plan Repository requirements to include grant numbers

Policies - Forthcoming  Data Citation –Goal: legitimize data as a form of scholarship –Process: Machine readable standard for data citation (done) Endorsement of data citation for inclusion in NIH bib sketch, grants, reports, etc. Example formats for human readable data citations Slowly work into NLM/NCBI workflow  dbGaP in the cloud (soon!)

BD2K Center BD2K Center BD2K Center BD2K Center BD2K Center BD2K Center DDICC Software Standards Infrastructure - The Commons Labs

The Commons Digital Objects (with UIDs) Search (indexed metadata) Search (indexed metadata) Computing Platform The Commons Vivien Bonazzi George Komatsoulis

The Commons: Compute Platforms The Commons Conceptual Framework The Commons Conceptual Framework Public Cloud Platforms Public Cloud Platforms Super Computing (HPC) Platforms Other Platforms ? Other Platforms ?  Google, AWS (Amazon)  Microsoft (Azure), IBM, other?  In house compute solutions  Private clouds, HPC –Pharma –The Broad –Bionimbus  Traditionally low access by NIH

The Commons: Business Model [George Komatsoulis]

NIH … Turning Discovery Into Health