Digital Curation Centre


Sharing Data
Digital Curation Centre
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.

Why make data available?

Selecting data

What must be shared?
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner.” (RCUK Common Principles on Data Policy)
“Where data underpins published research there is much greater expectation that it will be kept” (Ben Ryan, EPSRC)
What counts depends on the data’s value for the purposes it has served or may serve, so consider these as a first step.

What data carries value?
Indicators that data have value:
- Quality of the data and its description: complete, accurate, reliable, valid, representative etc.
- High demand: known users, integration potential, reputation, recommendation, appeal
- Replication difficulty: difficult, costly, or impossible to reproduce
- Low barriers: legal/ethical, copyright, non-restrictive terms and conditions
- Rarity: unique copy or other copies at risk
Which related material does data depend on for its value?

e.g. High Energy Physics community
Levels of data to preserve | Reuse purpose
Additional documentation (e.g. wikis, news forums) | Publication-related information search
Data in a simplified format | Outreach, simple training analyses
Analysis level software and the data format | Full scientific analysis based on existing reconstruction
Reconstruction and simulation software and basic level data | Full potential of the experimental data
Adapted from: DPHEP Study Group: Towards a Global Effort for Sustainable Data Preservation in High Energy Physics, May 2012. http://arxiv.org/abs/1205.4667

What can’t be shared?
- Sensitive personal data
- Data with IP or copyright restrictions
- Data that is too large to deliver over the network
- Physical data
Copyright restrictions can be particularly complicated when data has been aggregated or reused.

But…
- It must be preserved
- It must be visible
- It may be accessible under certain conditions

Ensuring data is reusable

Ensure the data can be found
- Get a persistent identifier, e.g. a DataCite DOI
The likely home of an electronic research data resource with a persistent identifier is a data repository. The exceptions are resources not available over the web (a physical resource, a cumbersomely large dataset, etc.), where the persistent identifier is attached to a metadata record instead.
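As a small illustration of how a persistent identifier is used in practice, the sketch below turns a DOI into a resolvable link. It assumes the DOI is resolved through the public https://doi.org/ resolver; the function name `doi_to_url` and the loose syntax check are illustrative assumptions, not part of any DataCite tooling. The example DOI is the PeerJ preprint cited on the linking and citation slide.

```python
import re

# Loose pattern: a DOI starts with "10.", a registrant code, a slash, then a suffix.
# This is an illustrative check only, not a full validation of DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written before the bare identifier itself.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a https://doi.org/ link for a bare or prefixed DOI."""
    cleaned = doi.strip()
    # Strip a known prefix so only the bare DOI remains.
    for prefix in PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


print(doi_to_url("http://dx.doi.org/10.7287/peerj.preprints.1v1"))
```

Storing the bare DOI and generating the link on demand keeps citations stable even if the preferred resolver address changes.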

Linking and citation
Linking open data to publications increases citations. Want evidence?
- Alter, Pienta, Lyle: 240% (social sciences) *
- Piwowar, Vision: 9% (microarray data) †
- Henneken, Accomazzi: 20% (astronomy) #
* Amy Pienta, George Alter, Jared Lyle (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307
† Piwowar H, Vision TJ (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1. http://dx.doi.org/10.7287/peerj.preprints.1v1
# Edwin Henneken, Alberto Accomazzi (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618

Ensure it can be used appropriately
Funders usually require datasets to carry information on access and any restrictions or conditions that apply.

Data description - metadata
- Citable
- Findable
- Re-usable
Documentation and metadata are essentially descriptive information about the information contained in a dataset. There should be good documentation at the study level, for example a description of the research methodology that created the data; the best metadata the data can have is the publication it supports, or a data paper. There should also be documentation at the file, item and variable level, sufficient for someone reusing the data to understand it. This could mean ensuring that Excel spreadsheets have sensible row and column descriptions, or that a document is included with the dataset which properly explains any abbreviations used.
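One way to make the study-level documentation concrete is to check a metadata record for gaps before deposit. The sketch below is a minimal illustration: the field names in `REQUIRED_FIELDS` and the example record are hypothetical, chosen to mirror the points on this slide, and do not follow any particular metadata standard.

```python
# Hypothetical study-level fields; adapt these to the metadata standard
# your repository or discipline actually uses.
REQUIRED_FIELDS = (
    "title",
    "creators",
    "publication_year",
    "description",
    "methodology",
    "related_publication",
    "access_conditions",
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Invented example record, with two deliberate gaps.
record = {
    "title": "Example survey dataset",
    "creators": ["A. Researcher"],
    "publication_year": 2014,
    "description": "Responses to a hypothetical staff survey.",
    "methodology": "",  # left empty on purpose to show the check
    "access_conditions": "Open, reuse with attribution",
}

print(missing_fields(record))  # ['methodology', 'related_publication']
```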

Ensuring the utility of the data
- The what, why and how of data creation must be understood
- Data dictionaries
- Columns/rows labelled
- Variable ranges defined
Ensuring that other researchers can understand and effectively reuse data that they access online, without the help of the data’s creator, is a more complicated task and requires a greater investment of effort to do successfully.
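The points about data dictionaries, labelled columns and defined variable ranges can be sketched in a few lines. This is a minimal illustration under stated assumptions: the column names, units and ranges in `DATA_DICTIONARY` are invented for the example, and `check_rows` is a hypothetical helper rather than part of any DCC tool.

```python
# Hypothetical data dictionary: one entry per column, with a plain-language
# description, a unit, and the range of values considered valid.
DATA_DICTIONARY = {
    "age": {
        "description": "Age of participant at interview",
        "unit": "years",
        "min": 18,
        "max": 99,
    },
    "temp_c": {
        "description": "Room temperature during session",
        "unit": "degrees Celsius",
        "min": 10.0,
        "max": 35.0,
    },
}


def check_rows(rows):
    """Return (row_index, column, problem) for undocumented or out-of-range values."""
    problems = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            entry = DATA_DICTIONARY.get(column)
            if entry is None:
                # A column with no dictionary entry cannot be understood by a reuser.
                problems.append((index, column, "not in data dictionary"))
            elif not entry["min"] <= value <= entry["max"]:
                problems.append((index, column, "out of range"))
    return problems


rows = [
    {"age": 34, "temp_c": 21.5},
    {"age": 17, "temp_c": 22.0, "grp": 2},  # "grp" is an unexplained abbreviation
]
print(check_rows(rows))
```

Shipping the dictionary alongside the data file gives a reuser the labels and ranges without needing to contact the data’s creator.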

DCC metadata catalogue
The catalogue lists:
- Metadata standards
- Profiles
- Use cases
- Tools
http://www.dcc.ac.uk/drupal/resources/metadata-standards

Now the where…

Options for sharing open data
- Domain repository: best
- General repository (Figshare, Zenodo, Dryad): need to ensure it’s suitable for your use
- Institutional repository: will keep data of low value or data with no other home, but may not ensure targeted visibility
- Journal supplementary material: fulfils journal requirements, but does not offer the repository functionality or longevity required by funders
- Departmental web page: may target domain academics, allow manipulation of data and support living datasets, but offers neither citability nor longevity, and does not engender user trust that the resource is the same as the one described in publications

Depositing in multiple locations
- A single location may not provide all that is needed
- Niche disciplinary repositories may not offer guaranteed longevity
- The institutional repository (IR) may hold prestige datasets
- Is it the repository of last resort?

Repository selection

A conversation with peers
- There may be an accepted repository used by peers or required by funders
- Multidisciplinary studies may not have an obvious home
- Data types and volumes will impact on the decision

Journal’s guidance
Journal of Open Psychology Data

Finding external repositories
- General directories: Re3data.org
- Domain specific directories: e.g. life sciences, Biosharing.org
- Data journal recommendations: Edinburgh research data blog, Sources of dataset peer review
- Funding body recommendations: e.g. Wellcome Trust, Data repositories and database sources
Data journals require that the data described in articles be freely available and usually mandate where it should be deposited; this can help to identify community-accepted repositories. Some journals may also offer recommendations for appropriate places to deposit research data.

Finding a repository
- Lists over 1300 data repositories
- Icons for ease of assessment
- Supported by DataCite

DCC guidelines - repositories
- Is the repository reputable?
- Will it accept the data you want to deposit?
- Will data be safe in legal terms?
- Will the repository sustain data value?
- Will the repository support analysis and track data usage?
- Will the data be citable? Is a DOI provided?
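The questions above can be treated as a simple checklist when comparing candidate repositories. The sketch below is only an illustration: the short keys in `CHECKLIST`, the `unanswered` helper and the example answers are all hypothetical, and a real assessment would need evidence behind each yes or no.

```python
# The six questions from the slide, keyed by short hypothetical labels.
CHECKLIST = {
    "reputable": "Is the repository reputable?",
    "accepts_data": "Will it accept the data you want to deposit?",
    "legally_safe": "Will data be safe in legal terms?",
    "sustains_value": "Will the repository sustain data value?",
    "supports_analysis": "Will the repository support analysis and track data usage?",
    "citable": "Will the data be citable? Is a DOI provided?",
}


def unanswered(answers: dict) -> list:
    """Return the checklist questions not yet answered 'yes' for a repository."""
    return [question for key, question in CHECKLIST.items() if not answers.get(key, False)]


# Invented answers for an imaginary candidate repository.
candidate = {
    "reputable": True,
    "accepts_data": True,
    "legally_safe": True,
    "sustains_value": True,
    "supports_analysis": False,
    "citable": True,
}

for question in unanswered(candidate):
    print("Still to resolve:", question)
```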

Any questions?
Images
Handles: https://www.clarin.eu/sites/default/files/handles.png
Cherries from: http://www.tomrand.net
Whispers from: http://www.manningcommunitynews.com/
Open access badges Jim Spencer from www.nature.com
Closed data from: http://www.gsma.com