Building a Global Research Data Community

Building a Global Research Data Community. Dr. Francine Berman, Chair, Research Data Alliance / US; Hamilton Distinguished Professor in Computer Science, Rensselaer Polytechnic Institute

Research Data Driving Solutions to Complex Science and Societal Challenges.
How does disease spread in large urban populations? Disease spread models couple medical data with population data, environmental data, organizational data, …
What is the likely impact of a large-scale earthquake? Earthquake simulations integrate fault geometry data with subsurface data, structure data, population data, physical infrastructure data, …
How can we increase wheat yields? Productivity analysis leverages interoperability of germplasm data with genetic and phenotypic data, statistical data, bibliographic data, …

Sharing and Exchange of Data Leading to an Acceleration of Innovation

Data Sharing in Astronomy: More life in the Universe? June 2013: Announcement of the discovery of 3 “habitable” planets around the star Gliese 667C, ~129 trillion miles (~22 light years) away.* Gliese 667C is 1/3 the size of the Sun. The habitable planets (liquid water could exist, etc.) are more massive than Earth. The international research team includes researchers from Germany, U.K., U.S., Finland, and Chile. *Paper published in Astronomy and Astrophysics: http://www.eso.org/public/archives/releases/sciencepapers/eso1328/eso1328a.pdf

Data-Driven Discovery. “We identified 3 strong signals in the star before, but it was possible that smaller planets were hidden in the data. We re-examined the existing data, added some new observations and applied two different data analysis methods especially designed to deal with multi-planet signal detection.” Guillem Anglada-Escudé, University of Göttingen, Germany

The Data “Backstory”. Data mostly from the ESO Science Archive Facility. Data sets include HARPS-TERRA Doppler measurements, HIRES, and PFS. Observations based in part on data from the W. M. Keck Observatory, the Magellan team (Doppler measurements), and the SIMBAD database (CDS in France).
Data infrastructure needed:
Data management, hosting and preservation infrastructure.
Data analysis tools.
Standards for astronomy data and metadata.
Organizational policy and adoption in the use of standards.
Community practice and technological infrastructure enabling data sharing between researchers.

Data-Driven Research Requires an Ecosystem. Impact is dependent on effective development and integration of all components.
Data-driven Innovation: Results, Publications, Apps, …
Data-driven Research
Technical Infrastructure: Tools, Systems, Frameworks, Databases, Data Portals, Data Centers, …
Social Infrastructure: Organizational Policy, Institutional Practice, Standards Bodies, Legal and Regulatory Frameworks, Economic Models, Community Organizations, …

Technical Infrastructure, Social Infrastructure Provide the Foundation for Data-Driven Research: Data Access, Data Sharing, Data Visualization, Data Analysis, Data Services, Data Mining, Data Sharing Practice, Data Management, Digital Object Identifiers, Common Metadata Standards, Data Citation Standards, Data Access and Distribution Policy, Tools and infrastructure that promote Discoverability, Data Preservation, Data Storage, Sustainable Economic Model

Data Sharing Pre-Supposes Data Stewardship, including supporting infrastructure and economic models.
Costs / components of data infrastructure include:
Resources and resource refresh
Maintenance and upkeep
Software tools and packages
Utilities (power, cooling)
Space
Networking
Security and failover systems
People (expertise, help, infrastructure management, development)
Training, documentation
Monitoring, auditing
Reporting costs
Costs of compliance with regulation, policy, etc.
…
SDSC Data Storage Growth ‘97-’09: Most valuable data replicated. As research collections increase, storage capacity must stay ahead of demand. Information courtesy of Richard Moore, SDSC

Data Economics: Who Pays the Bill for Public Access to Research Data? Science article: Science Magazine, August 9, 2013

Potential Economic Solutions to Support Public Access Research Data in All Sectors. Multiple approaches can provide stewardship options:
PRIVATE SECTOR: Facilitate private sector stewardship of public access research data.
ACADEMIC SECTOR: Use public sector investment to jumpstart sustainable academic sector stewardship solutions.
PUBLIC SECTOR: Create and clarify public sector stewardship commitments for public access research data.
RESEARCH COMMUNITY: Encourage research culture change to take advantage of what works in the private sector.

Data-Driven Research Requires an Ecosystem. Impact is dependent on effective development and integration of all components.
Data-driven Innovation: Results, Publications, Apps, …
Data-driven Research
Technical Infrastructure: Tools, Systems, Frameworks, Databases, Data Portals, Data Centers, …
Social Infrastructure: Organizational Policy, Institutional Practice, Standards Bodies, Legal and Regulatory Frameworks, Economic Models, Community Organizations, …

The Power of Many: Community effort needed to move efforts beyond individuals / institutions. “Just do it”: focused efforts help communities drive tangible progress.
Creation and adoption of data sharing policies has accelerated research innovation.
Development of public access to a shared data collection is enabling new results for Alzheimer’s.
Development and adoption of parallel communication protocols through the MPI Forum drove a generation of advances.
Now 25 years old, the Internet Engineering Task Force, with its mission “to make the Internet work better”, has produced key specifications of Internet community standards that support innovation.

Development of Data Sharing Infrastructure has emerged as a Global Priority

The Research Data Alliance (RDA): Global community-driven organization launched in March 2013 to accelerate data-driven innovation. RDA's focus is on building the social, organizational and technical infrastructure to reduce barriers to data sharing and exchange, and to accelerate the development of coordinated global data infrastructure.

Goal of RDA Infrastructure: Support Data Sharing and Interoperability Across Cultures, Scales, Technologies. Common metadata standards; interoperability / integration framework; data access and preservation policy and practice; harmonized standards; common economic model for sustaining data; digital object identifiers; tools for data discoverability; etc. …

RDA Launch, March 2013, Gothenburg, Sweden. Over 200 participants; 31 countries; 5 continents; > 6,400 tweets; public, private, academic sectors; high-profile government and science speakers. Left photo courtesy of Leif Laaksonen

CREATE → ADOPT → USE. RDA Members come together as:
Working Groups: 12-18 month efforts to build, adopt, and use specific pieces of infrastructure.
Interest Groups: longer-lived discussion fora that spawn Working Groups as specific pieces of infrastructure to build are identified.
RDA efforts focus on the development and use of data sharing infrastructure:
Focused pieces of adopted code, policy, infrastructure, standards, or best practices that enable data sharing.
“Harvestable” efforts for which 12-18 months of work can eliminate a roadblock.
Efforts that have substantive applicability to groups within the data community, but may not apply to everyone.
Efforts for which working scientists and researchers can start today.

Current RDA Interest Groups. RDA Interest Groups focus on what infrastructure is needed for data sharing and exchange: Marine Data Harmonization; Data Citation; Preservation e-Infrastructure; Agricultural Data Interoperability; Publishing Data; Big Data Analytics; Repository Audit and Certification; Brokering; Structural Biology; Data in Context; Community Capability Model; Defining Urban Data Exchange for Science; UPC Code for Data; Digital Practices in History and Ethnography; Engagement Group; Legal Interoperability (joint with CODATA); …

RDA Digital History and Ethnography Interest Group What factors increase your risk of getting asthma? What factors increase your ability to get better? How do organizations respond to asthma? How do communities respond to asthma? Information courtesy of Kim Fortun

Asthma as a socio-cultural-health issue. How is asthma contracted, experienced and cared for in neighborhoods, cities, and countries? Relevant data: health, environmental, population, socio-cultural, historical data, etc. in the form of images, video, oral interviews, surveys, HIPAA-compliant data, etc. Information courtesy of Kim Fortun

RDA Working Groups: Creating a Pipeline of Impact-focused Deliverables. PID Information Types; Data Type Registries; Data Foundation and Terminology; Practical Policy; Metadata Standards; Standardization of Data Categories; …

RDA Persistent Identifier (PID) Information Types Working Group.
Focus: Harmonization of basic information types associated with persistent identifiers. Agreement about the information associated with PIDs allows programmers to implement the same API independent of the PID type being used.
Impact: Facilitates type harmonization and interoperability between data sharing tools in different infrastructures and domains.
Deliverables: Community-vetted list of common information types; framework to introduce more types; mechanisms to develop profiles, collections and typed references; prototype API service for requesting PID information.
Adopters: Infrastructure supported by CNRI, California Digital Library. API will be used in practice by DKRZ, Data Conservancy, DARIAH (research data infrastructure in arts and humanities), EUDAT (EU Data Infrastructure collaborative).
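
The idea of "the same API independent of the PID type" can be illustrated with a small Python sketch. This is not the Working Group's prototype API: the record fields (location, checksum, media type), the class names, and the in-memory lookup tables are all hypothetical, chosen only to show how an agreed set of information types lets calling code ignore which PID system sits underneath.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class PidRecord:
    """Hypothetical set of common information types attached to any PID."""
    identifier: str
    location: str    # where the referenced object can be retrieved
    checksum: str    # fixity information
    media_type: str  # format of the referenced object


class PidResolver(Protocol):
    """Any PID system that can return the agreed information types."""
    def resolve(self, identifier: str) -> PidRecord: ...


class InMemoryHandleResolver:
    """Stand-in for a Handle-style system; field names are invented."""
    def __init__(self, table: Dict[str, Dict[str, str]]) -> None:
        self._table = table

    def resolve(self, identifier: str) -> PidRecord:
        raw = self._table[identifier]
        return PidRecord(identifier, raw["URL"], raw["CHECKSUM"], raw["FORMAT"])


class InMemoryDoiResolver:
    """Stand-in for a DOI-style system with differently named fields."""
    def __init__(self, table: Dict[str, Dict[str, str]]) -> None:
        self._table = table

    def resolve(self, identifier: str) -> PidRecord:
        raw = self._table[identifier]
        return PidRecord(identifier, raw["url"], raw["sha256"], raw["mime"])


def describe(resolver: PidResolver, identifier: str) -> str:
    """Client code: works with any resolver, whatever the PID type."""
    record = resolver.resolve(identifier)
    return f"{record.identifier} -> {record.location} ({record.media_type})"
```

Each resolver maps its own native field names onto the shared record, so `describe` never needs to know whether it was handed a Handle-style or DOI-style identifier.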

Current Status: RDA Community has more than 850 participants from 50+ countries: Albania, Australia, Austria, Bangladesh, Belgium, Bulgaria, Brazil, Canada, China, Congo, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Iceland, India, Iran, Ireland, Italy, Japan, Kyrgyzstan, Kuwait, Netherlands, New Zealand, Norway, Palestine, Poland, Portugal, Russia, Serbia, Singapore, South Africa, South Korea, Spain, Sweden, Switzerland, Taiwan, Turkey, United Arab Emirates, United Kingdom, United States, …

RDA Plenary 2: September 16-18 in Washington, DC, at the National Academies and Washington Marriott. Working meeting for RDA Interest Groups and Working Groups; plenary speakers and panels; RDA business meeting and community events; “neutral space” to convene communities. RDA Plenary 3 in Dublin, Ireland, March 26-28, 2014, hosted by Australia and Ireland.

RDA: How to Get Involved (rd-alliance.org).
Join RDA to participate as an individual member. Register at rd-alliance.org. Membership is free.
Join as an Organizational Member (nominal fee) or an Organizational Affiliate (jointly sponsored efforts).
Initiate or join an Interest Group (community members exploring infrastructure in topical areas).
Propose or join a Working Group (focused 12-18 month efforts with measurable outcomes that accelerate data sharing and exchange).
Come to an RDA Plenary. Registration at rd-alliance.org.

Building a Global Research Data Community Requires a Multi-Pronged Approach and Strategic Coordination. Broad coordination across technical infrastructure (data centers, repositories, archives; data tools, systems, standards), social infrastructure (practice and policy; community organizations), and data-driven research.

Thank You! enquiries@rd-alliance.org bermaf@rpi.edu