Making It Happen March 19, 2013 Anita de Waard VP Research Data Collaborations, Elsevier RDS Sustainable Data Preservation and Use.

Presentation transcript:

Making It Happen. March 19, 2013. Anita de Waard, VP Research Data Collaborations, Elsevier. RDS Sustainable Data Preservation and Use.

“What aspects/tools/capabilities/frameworks are related to this idea?”
There are many different research databases, both generic (Dryad, Dataverse, …) and specific (NIF, IEDA, PDB, …).
There are many systems for creating/sharing workflows (Taverna, MyExperiment, Vistrails, Workflow4Ever, etc.).
There are many e-lab notebooks (LabGuru, LabArchives, LaBlog, etc.).
There are scores of projects, committees, standards, bodies, grants, initiatives, and conferences for discussing and connecting all of this (KEfED, Pegasus, PROV, RDA, Science Gateways, Codata, BRDI, Earthcube, etc.).
You can make a living out of this ;-)! (And many of us do…)

…but this is what scientists do: Using antibodies and squishy bits, grad students experiment and enter details into their lab notebooks. The PI then tries to make sense of this and writes a paper. End of story.

Why save research data?
A. Data Preservation:
– Preserve record of scientific process, provenance
– Enable reproducible research
B. Data Use:
– Use results obtained by others
– Do better science!
– Improve interdisciplinary work
C. Sustainable Models:
– Technology transfer; societal/industrial development
– Reward scientists for data creation (credit/attribution)
– Long-term archiving

Where The Data Goes Now: > 50 My Papers; 2 M scientists; 2 M papers/year.
Majority of data (90%?) is stored on local hard drives.
Some data (8%?) stored in large, generic data repositories. Dryad: 7,631 files; Dataverse: 0.6 M; Datacite: 1.5 M.
A small portion of data (1-2%?) stored in small, topic-focused data repositories. MiRB: 25k; PetDB: 1,5 k; TAIR: 72,1 k; PDB: 88,3 k; SedDB: 0.6 k.

Key Needs: INCREASE DATA PRESERVATION. IMPROVE DATA USE. DEVELOP SUSTAINABLE MODELS.
> 50 My Papers; 2 M scientists; 2 M papers/year.
Majority of data (90%?) is stored on local hard drives.
Some data (8%?) stored in large, generic data repositories. Dryad: 7,631 files; Dataverse: 0.6 M; Datacite: 1.5 M.
A small portion of data (1-2%?) stored in small, topic-focused data repositories. MiRB: 25k; PetDB: 1,5 k; TAIR: 72,1 k; PDB: 88,3 k; SedDB: 0.6 k.

Objections (and rebuttals) to data sharing:
Objection: “Our lab notebooks are all on paper – it’s how we do things.” Rebuttal: Graft tools closely onto scientists’ daily practice.
Objection: “I need to see a direct benefit of any effort I put in.” Rebuttal: Create tools that allow better insight into one’s own and others’ results.
Objection: “I don’t really trust anyone else’s data – and don’t think they’ll trust mine.” Rebuttal: Create a social networking context and allow the data owner to provide granular access control.
Objection: “I am afraid other people might scoop my discoveries.” Rebuttal: The reward system moves from a competition to a ‘shared mission’.

From insular ‘CoSI-Factories’…: Prepare, Observe, Analyze, Ponder, Communicate.

…to shared experimental repositories: Across labs and experiments, track reagents and how they are used. (Diagram labels: Prepare, Analyze, Communicate; Observations.)

…to shared experimental repositories: Compare the outcome of interactions with these entities. (Diagram labels: Prepare, Analyze, Communicate; Observations.)

…to shared experimental repositories: Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments. (Diagram labels: Prepare, Analyze, Communicate; Observations; Think.)

Some examples:
Grafting tools on workflow: create tailored metadata collection tools on mini-tablets in labs to replace the paper notebook.
Direct rewards: through a ‘PI-Dashboard’, allow immediate access to and analysis of shared data: new science!
Data sharing rewards: Data Rescue Challenge: collect and reward stories/practices of data preservation/use in Earth/Lunar Science.
Improve data use: with NIF/Eagle-I, add antibodies as key ‘entities’ to the paper and link to the AB repository.

How do we make data use happen:
We are creating repositories of shared experiments: you are part of a greater whole!
Collect and share stories and practices re. data use and sustainable systems: “What gets to them?”
Develop a system of rewards for data sharing: enable demonstrably better science!
Work with grant agencies and repositories (generic/specific, institutional, cross-national) to integrate and annotate existing datasets and enable cross-use.
Collectively pioneer long-term funding options; support/develop ‘shared mission’ funding challenges.