Preserving and Sharing Data: Best Practices & Requirements for Selecting a Data Sharing Repository

Preserving and Sharing Data: Best Practices & Requirements for Selecting a Data Sharing Repository © 2014 by the Rector and Visitors of the University of Virginia. This work is made available under the terms of the Creative Commons Attribution-ShareAlike 4.0 International license Bill Corey Data Management Consulting Group University of Virginia Library

3 What does “sharing data” mean? Data sharing is the practice of making data used for scholarly research available to other investigators. Replication has a long history in science. The motto of The Royal Society is “Nullius in verba”, translated as “Take no man's word for it.” Many funding agencies, institutions, and publishers have policies regarding data sharing because transparency and openness are considered by many to be part of the scientific method.

4 Reasons to share my data… Receive scholarly credit Link to research products, create fuller picture Preserving data for future use Meet requirements (of funders, publishers) Some data are unique and cannot be replicated Other researchers can replicate and verify results Facilitate new discoveries Reduce duplication of data collection efforts Ease access to new researchers and citizen-scientists

5 What does “preserve data” mean? Data preservation is the practice of managing data so it can be accessed and re-used by future researchers. Usually this term is applied to digital files (digital data preservation), but it is also important for analog objects. “The goal of digital preservation is the accurate rendering of authenticated content over time.” Preserving data helps to provide future access to the scientific, historical, and cultural heritage record of our world.

6 Selecting a data sharing repository Questions to consider when selecting a repository or archive Does your publisher specify a location for the data supporting an article? Does your institution have specific requirements? Does your discipline recommend a specific repository or archive? Does your funder identify a specific location or facility?

7 Selecting a data sharing repository Best Practices Choose early Metadata Persistent Identifiers Data embargo Data access

8 Selecting a data sharing repository Requirements What kinds of data do they accept? Do they require specific file types? Do they have a rights or policy statement? Do they provide long-term preservation?

9 Information potential users of your data will be looking for When preparing to deposit your data in a repository, be sure to include this information in your documentation. Ownership of the data Permissions required for using the data Restrictions on reuse Documentation that makes it understandable and useful Data access – openly accessible, restricted access, or embargoed
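
The items on this slide can be captured in a small machine-readable record that travels with the deposit. The following is a minimal sketch, not a repository-mandated format: the class name, field names, and sample values are all hypothetical and chosen only to mirror the slide.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DepositDocumentation:
    """Hypothetical record of what potential users of the data will look for."""
    owner: str                  # ownership of the data
    permissions_required: str   # permissions required for using the data
    reuse_restrictions: str     # restrictions on reuse
    documentation_files: list   # files that make the data understandable
    access_level: str           # "open", "restricted", or "embargoed"

    def __post_init__(self):
        # Reject access levels other than the three named on the slide.
        allowed = {"open", "restricted", "embargoed"}
        if self.access_level not in allowed:
            raise ValueError(f"access_level must be one of {sorted(allowed)}")


# All values below are illustrative placeholders.
record = DepositDocumentation(
    owner="Example Lab (hypothetical)",
    permissions_required="None; please cite the dataset",
    reuse_restrictions="No redistribution of raw interview audio",
    documentation_files=["README.txt", "codebook.pdf"],
    access_level="embargoed",
)

# Serialize the record so it can be saved alongside the data files.
readme_json = json.dumps(asdict(record), indent=2)
print(readme_json)
```

Saving a record like this next to the data makes it harder to forget one of the five items when filling in a repository's own deposit form.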

10 Data Registries Finding a repository can be confusing and time-consuming. International registry of data repositories: Re3data Re3data merged with Databib in 2015 under the DataCite umbrella. All datasets will be provided with DOIs.

11 Advantages of Data Repository Registries Registries provide information about the repositories that can be difficult to find. Do they provide Persistent Identifiers for finding and citing the data? Do they have access controls? What are their Terms of Use & Licenses? Are there guidelines for depositing data? Do they provide data preservation services? Do they have professional backup & documentation? What Repository Standards do they use? info:doi/ /journal.pone

12 Data Registries exercise Go to the registry: Re3data Using the subject search, find your discipline (or any discipline you are familiar with) and see what options are available. Some repositories accept data deposits, and some are simply data providers. What are your options?

13 Re3data icons What do the icons mean?

14 re3data.org re3data is a global registry for research data repositories. Researchers can use it to select the appropriate repositories for the permanent storage and sharing of their data. Repositories can be searched by name, country, subject, discipline, and content type. Clicking on the name will take you to a page with a description of the repository, including a URL. Record example

15 re3data.org The Institutions tab lists the Responsible Institutions, including their responsibilities (funding, technical, or general), their URL, and contact information. The Terms tab includes links to Copyright policies, types of data access, and data deposit information. The Standards tab provides information about the repository, the persistent identifier (PI) system used, whether it is certified, and a general description. Record example

16 Placing data in multiple locations Having multiple locations to place your data, called data redundancy, is important when you are conducting your research so that you don’t lose anything. Is it important when you are identifying a home for your research data after you have completed your project? That will depend on your data. Is it something that can be collected again, or is it from a one-time event? Will other researchers be interested in using your data? It doesn’t hurt to place your data in multiple locations. This helps to ensure that it will be available to future users, and it increases the likelihood that others will be able to find it.
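
When the same data lives in several places, it helps to confirm that the copies really are identical. The sketch below is one possible way to do that with SHA-256 checksums from the Python standard library; the function names and the sample file names are hypothetical, and the temporary files only stand in for real copies.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """Return True when every listed copy has the same digest."""
    digests = {sha256_of(p) for p in paths}
    return len(digests) == 1


# Demonstration with two stand-in copies (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    local_copy = Path(tmp) / "local_copy.csv"
    repository_copy = Path(tmp) / "repository_copy.csv"
    local_copy.write_bytes(b"id,value\n1,42\n")
    repository_copy.write_bytes(b"id,value\n1,42\n")
    identical = copies_match([local_copy, repository_copy])

    # Simulate silent corruption or an unrecorded edit in one copy.
    repository_copy.write_bytes(b"id,value\n1,43\n")
    after_change = copies_match([local_copy, repository_copy])

print(identical, after_change)
```

Recording the digest at deposit time also gives future users a way to check that the file they downloaded is the one that was deposited.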

17 Consider placing your data in your institutional repository or archive Libra UVa Institutional Repository Opened in 2011 Theses and dissertations Articles Conference papers, posters Article preprints Book chapters in edited collections Datasets

18 Provides/accepts data in several formats (ASCII, txt, SAS, SPSS, and Stata) Assigns a DOI (digital object identifier), links to persistent URL; facilitates data citation and location Creates variable-level DDI XML markup, makes data documentation machine-readable, allows more precise searching (e.g., by variable) Allows for online analysis of datasets Maintains secure data enclave, for archiving restricted data; accessible with permissions Data deposit form: ICPSR becomes responsible for the management, cataloging and updating of data
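
A DOI becomes a persistent link when it is appended to the public resolver at https://doi.org/. The following is a minimal sketch of that conversion; the function name is hypothetical, and the DOI used in the example (10.1234/example) is a made-up placeholder, not a real dataset.

```python
def doi_to_url(doi):
    """Turn a DOI written in common forms into a resolver URL."""
    cleaned = doi.strip()
    # Strip the usual prefixes people paste along with a DOI.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI begins with the directory indicator "10.".
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only.
citation_link = doi_to_url("doi:10.1234/example")
print(citation_link)
```

Citing the resolver URL rather than the repository's own page address means the citation keeps working if the repository reorganizes its website.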

19 openICPSR ICPSR is creating a new repository that they will call openICPSR. It will be separate from the institutionally-funded ICPSR data repository, will require a fee to deposit the data, and will preserve the data for 10 years. In the Self-Deposit Package, the data depositor is responsible for the documentation & anonymization of the data.

20 Code & Multi-discipline Data Archives & Repositories Open Science Framework: Scientists can use OSF for free to archive, share, find, and register research materials and data. figshare: “figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner.” GitHub: Share and archive code and collaborate with anyone. runmycode: Online repository allowing people to share and download computer code and data associated with scientific publications. ResearchGate: Online research social networking. Collaborate, share publications and data.

21 What if I can’t find a home for my data? Roche DG, Lanfear R, Binning SA, Haff TM, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e doi: /journal.pbio Maybe your data doesn’t ‘fit’ in a discipline-specific repository, and you don’t feel comfortable putting it on a social network such as figshare or ResearchGate. You can host it on a website, but there is an easier and better alternative: The Dataverse Network

22 The Dataverse Network

23 DVN hosts multiple, individually-branded Dataverses Researchers control the design, content, dissemination of their Dataverse, and can embed it in their own webpage DVN assigns handles (persistent id) and Universal Numerical Fingerprint (data fixity/verification) Extracts metadata for discovery, imports/exports metadata in multiple XML formats (DDI, Dublin Core, FGDC); data across DVNs searchable within one DVN Accepts data in multiple formats (Stata, SPSS, CSV), converts to preservation format Data can be subset, recoded, analyzed online Data sharing on DVN:
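
To give a sense of what exporting metadata as Dublin Core XML involves, here is a minimal sketch using only the Python standard library. It is not the Dataverse Network's actual export code or schema: the `metadata` wrapper element, the function name, and every sample value (including the handle) are assumptions made for illustration. Only the Dublin Core element namespace and the element names `title`, `creator`, `identifier`, and `format` come from the Dublin Core standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Serialize a dict of Dublin Core element names to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; the handle below is a placeholder.
xml_text = to_dublin_core({
    "title": "Example survey data (hypothetical)",
    "creator": "Doe, Jane",
    "identifier": "hdl:0000/EXAMPLE",
    "format": "text/csv",
})
print(xml_text)
```

Because the elements follow a shared standard, a record like this can be harvested and searched by other systems without knowing anything about the repository that produced it.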

24 Creating your own Dataverse Specific instructions can be found at:

25 Example of a Dataverse

26 Example of a Dataverse record

27 Data Sharing Archives & Repositories Examples

28 Questions? Bill Corey Data Management Consultant Research Data Services University of Virginia Library Thanks for attending today's workshop: Preserving and Sharing Data: Best Practices & Requirements for Selecting a Data Sharing Repository Please contact me if you have any questions, or need assistance in locating a data repository or preparing your data for sharing and archiving.

29 We’re available to help The UVa Library Research Data Services provides consulting and training services to UVA researchers and graduate students in all aspects of research data management. We can help you navigate and negotiate through the tricky issues and many approvals in order to responsibly share your research data. Contact us at Photo credit uploads/2013/10/data-mining-300x154.jpg