Fran Berman National and International Efforts in Research Data Access and Sharing Dr. Francine Berman Chair, Research Data Alliance / US Edward P. Hamilton.

Presentation transcript:

Fran Berman National and International Efforts in Research Data Access and Sharing Dr. Francine Berman Chair, Research Data Alliance / US Edward P. Hamilton Distinguished Professor in Computer Science, RPI

Research Data Driving Solutions to Complex Scientific and Societal Challenges
Who is most at risk to contract asthma?
How can we increase wheat yields?
How accurate is the Standard Model of Physics?
How can we best address energy needs and sustain the environment?
Images: Lucas Taylor; Ceinturion, Wikipedia

Data Infrastructure Needed to Explore Solutions
Data Use and Re-use
Data Discovery and Data Sharing
Research Dissemination and Reproducibility
Data Access (now) and Preservation (later)
Data discoverability tools
Data access via portals, science gateways, etc.
Database and data collection systems
Data services to support use and re-use
Data analysis algorithms
Data-driven models and simulations
Data visualization tools
Semantic frameworks
Data management systems
Data storage
…

Social, Organizational, and Human Infrastructure Equally Important
Policy
Sustainable Economics
Common Standards
Community Practice
Social and Organizational Infrastructure
Human Infrastructure / Workforce
Data-focused Curriculum and Training
Data Scientists
McKinsey Global Institute 2011 Report; Traffic Image: Mike Gonzalez

Today’s Presentation: Emerging Efforts in the Development of Effective Research Data Infrastructure
Global Data Infrastructure: How do we accelerate open access data sharing and exchange?
National Data Infrastructure: How do we support stewardship and preservation of publicly accessible research data?

Data-Sharing Driving Discovery Across Sectors and Communities

World-wide Efforts Focusing on Infrastructure to Support Research Data Sharing, Access, Use
Science, Humanities, Arts Communities
E-Infrastructure professionals, data analysts, data center staff, …
Data Scientists
Libraries, Archives, Repositories, Museums

Research Data Alliance Created to Accelerate Development of Research Data Sharing Infrastructure Worldwide
• RDA is an emerging, global community-driven organization created to accelerate the development of research data-sharing infrastructure world-wide.
• RDA community efforts focus on building social, organizational and technical infrastructure to
  – reduce barriers to data sharing and exchange
  – accelerate the development of coordinated global data infrastructure

RDA Approach: CREATE → ADOPT → USE
RDA Members come together as
Working Groups – month efforts to build, adopt, and use specific pieces of infrastructure
Interest Groups – longer-lived discussion forums that spawn Working Groups as specific pieces of needed infrastructure are identified.
Working Group efforts focus on the development and use of data sharing infrastructure
Code, policy, infrastructure, standards, or best practices that are adopted and used by communities to enable data sharing
“Harvestable” efforts for which months of work can eliminate a roadblock
Efforts that have substantive applicability to groups within the data community, but may not apply to everyone
Efforts for which working scientists and researchers can start today

The RDA Community Today: Over 1600 members from 70+ countries (as of 15/3/14)
Map labels: Austral-Pacific 4%, Asia 4%, Africa 2%, South America 1%
Map courtesy traveltip.org

Community Growth
First RDA organizational telecon: August 2012
Global Data Planning Meeting: October 2012
RDA Launch / First Plenary, March 2013 (Gothenburg, Sweden)
RDA Second Plenary, September 2013 (Washington, DC)
RDA Third Plenary, March 2014 (Dublin, Ireland)
RDA Fourth Plenary, September 2014 (Amsterdam)
First Working Groups and Interest Groups
240 participants
First “neutral space” community meeting (Data Citation Summit)
First Org. Partner Meet-up
First BOFs
380 participants from 22 countries
First Organizational Assembly
6 co-located events
14 BOF, 12 Working Groups, 22 Interest Groups
497 participants
First Working Group exchange meeting

RDA Interest (IG) and Working Groups (WG) by Focus (as of 15/3/14)
Domain Science-focused: Toxicogenomics Interoperability IG; Structural Biology IG; Biodiversity Data Integration IG; Agricultural Data Interoperability IG; Digital History and Ethnography IG; Defining Urban Data Exchange for Science IG; Marine Data Harmonization IG; Materials Data Management IG
Data Stewardship-focused: Research Data Provenance IG; Certification of Digital Repositories IG; Preservation e-infrastructure; Long-tail of Research Data IG; Publishing Data IG; Domain Repositories IG; Global Registry of Trusted Data Repositories and Services IG
Base Infrastructure-focused: Data Foundations and Terminology WG; Metadata Standards WG; Practical Policy WG; PID Information Types WG; Data Type Registries WG; Metadata IG; Big Data Analytics IG; Data Brokering IG
Reference and Sharing-focused: Data Citation IG; Data Categories and Codes WG; Legal Interoperability IG
Community Needs-focused: Community Capability Model IG; Engagement IG; Clouds in Developing Countries IG

First RDA Infrastructure Deliverables coming this Fall
Data Type Registries WG
Deliverables: System of data type registries, formal model for describing types, working model of a registry.
Initial Adopters and Users: CNRI, International DOI Foundation, Deep Carbon Observatory
Practical Code Policies
Deliverables: Survey of policies in production use, testbed of machine actionable policies, deployment of 5 policy sets, policy starter kits
Initial Adopters and Users: RENCI, DataNet Federation Consortium, CESNET, Odum Institute, EUDAT
Persistent Identifier Information Types
Deliverables: Minimal set of PID types, API
Initial Adopters and Users: Data Conservancy, DKRZ
Language Codes
Deliverables: Operationalization of ISO language categories for repositories.
Initial Adopters and Users: Language Archive, Paradisec
Data Foundations and Terminology
Deliverables: Common vocabulary for data terms, formal definitions and open registry for data terms
Initial Adopters and Users: EUDAT, DKRZ, Deep Carbon Observatory, CLARIN, EPOS
Metadata Standards
Deliverables: Use cases and prototype directory of current metadata standards starting from DCC directory
Initial Adopters and Users: JISC, DataOne

RDA/US: Collaborate Globally, Contribute Locally
RDA/US Goals:
• Contribute to RDA “international” efforts and leadership
• Bring US efforts to broader RDA community
• Build the RDA community within the US
• Leverage and implement RDA deliverables in the US to amplify impact
• Collaborate closely with other RDA “regions” on key programs and initiatives
NSF-supported RDA/US initiatives:
Outreach (RDA ↔ RDA/US)
RDA Deliverables Amplification
Student / Early Career Engagement
RDA/US Steering Committee: Fran Berman, RPI; Larry Lannom, CNRI; Mark Parsons, RPI; Beth Plale, IU

RDA/US Opportunities for Students and Early Career Professionals
RDA/US Interns
– $5K for summer of work/mentorship with RDA Interest or Working Group
– Interns attend Fall Plenary ($2500 participant support) and present a poster on their project
– Interns attend a kick-off meeting at the beginning of the summer.
RDA/US Fellows
– Fellows engage with an RDA WG/IG and attend 3 Plenaries ($2.5K per Plenary participant costs)
– First Plenary: Identify a group to work with
– Second and Third Plenaries: Present interim and final progress on common efforts

Sustainable Stewardship to Support Data-Driven Innovation
Global Data Infrastructure: How do we accelerate open access data sharing and exchange?
National Data Infrastructure: How do we support stewardship and preservation of publicly accessible research data?

Increasing R&D Agency Requirements for Data Access and Management
Research Data Infrastructure particularly important

Publicly Accessible Data has to Live Somewhere
Public Access, Use, and Re-Use of Data Now and in the Future Presupposes Sustainable Stewardship Today
Stewardship and Preservation are critical: “Homeless” data ceases to exist
Economically sustainable data infrastructure necessary to support
– Federally mandated data management plans
– Public access to research data
– Use and re-use
– Reproducibility
The “bigger”, more long-term, more complex, or more valuable the data is, the greater the importance of sustainable data stewardship and infrastructure

It’s Not Just “Big Data” and It’s Not Just the Cost of Storage. Data Management, Stewardship, and Use Incur Continuing Infrastructure Costs
Most valuable data replicated
As research collections increase, storage capacity must stay ahead of demand
Resources and Resource Refresh
Costs include
Maintenance and upkeep
Software tools and packages
Utilities (power, cooling)
Space
Networking
Security and failover systems
People (expertise, help, infrastructure management, development)
Training, documentation
Monitoring, auditing
Reporting costs
Costs of compliance with regulation, etc.
Figure: SDSC Data Storage Growth ‘97-’09 (information courtesy of Richard Moore, SDSC)

Economics of Public Access: Who Pays the Data Bill? Article: Science Magazine, August 9, Free public access link at

Op-Ed Recommendations: Partner Across Sectors to Distribute the Preservation and Stewardship Responsibilities
Evolve research culture to take advantage of what works in the private sector
Create sustainable university library and repository stewardship solutions
Clarify public sector stewardship commitments: articulate what data will / won’t be supported
Facilitate private sector stewardship of public access research data as a public good
Sectors: Private Sector, Public Sector, Individuals, Academia
Charleston Ballet blog: ; iTunes gift card

Value Proposition: Why Data Infrastructure Is Important
The Research landscape is changing
– Data is accelerating new innovation and discovery
– Greater need for access, ease-of-use, interoperability of data
– Traditional modes of research recognition evolving: new approaches to collaboration / competition, publication, citation, analysis all involve digital data
The Educational landscape is changing
– University curricula becoming more data-driven
– Increasing integration of on-line / on-site options supported by data infrastructure
– More digital monitoring, tracking, accountability needed; more policy and regulation involving digital data
The Workforce is changing
– More data literacy required from everyone
– More data science embedded in everything
– Data scientists increasingly critical for competitiveness and leadership
Image: CAIDA Internet visualization; Article: HBR October 2012

Your part: Things you can do on Monday morning
Small steps:
1. If you don’t have one, create a data management plan for your current project for a reasonable fixed term of time
2. Make your data available to the community (as appropriate) by curating it and ingesting it into a publicly accessible repository
3. Cite and publish your data when you write about your results
4. Join the RDA and get involved in (or start) an Interest Group or Working Group that will help you develop needed data infrastructure.

Thank You!

Infrastructure Investments Often a Hard Sell …
– Quantifying return on investment a challenge
– Hard to “market” compared to more urgent competing priorities
– Business model must be sustainable and address infrastructure refresh and evolution
Stephanie A. Miner, the Syracuse mayor, said [infrastructure is] too often overlooked when politicians want to spend money on economic development. “You don’t cut ribbons for new water mains, but that’s really what matters.” NY Times, February 15, 2014