Update on Data Publishing With Dataverse

Slides:

Advertisements

Similar presentations

PROMOTION ORDERING DIFM Promotions February 29, 2008.

Advertisements

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

You have been given a mission and a code. Use the code to complete the mission and you will save the world from obliteration…

Advanced Piloting Cruise Plot.

Chapter 1 The Study of Body Function Image PowerPoint

Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 5 Author: Julia Richards and R. Scott Hawley.

1 Copyright © 2013 Elsevier Inc. All rights reserved. Appendix 01.

1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.

By D. Fisher Geometric Transformations. Reflection, Rotation, or Translation 1.

11 Application of CSF4 in Avian Flu Grid: Meta-scheduler CSF4. Lab of Grid Computing and Network Security Jilin University, Changchun, China Hongliang.

Mirror Mirror on the wall does your repository reflect it all? Peter West and Timothy Miles-Board EPrints Services University of Southampton Southampton,

Suzanne Bell and Nathan Sarr University of Rochester River Campus Libraries Re-engineering the Institutional Repository to Engage Users.

National Diet Library Digital Archive Portal - PORTA - Gateway to digital information in Japan April 3, 2008 Hideki Takeuchi Planning.

Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination.

Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13

Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13

Title Subtitle.

DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.

FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.

Around the World AdditionSubtraction MultiplicationDivision AdditionSubtraction MultiplicationDivision.

Pure Silver Reusing and Repurposing Bibliographic Data in a Current Research Information System and Institutional Repository 15 September.

©2011 Quest Software, Inc. All rights reserved.. Andrei Polevoi, Tatiana Golubovich Program Management Group ActiveRoles Add-on Manager Overview.

How To Use OPAC.

Niagara Portal Introduction January 2007 Scott Muench - Technical Sales Manager.

1 of Audience Survey Results Larry D. Gustke, Ph.D. – October 5, 2013.

1 The phone in the cloud Utilizing resources hosted anywhere Claes Nilsson.

ETS4 - What's new? - How to start? - Any questions?

ABC Technology Project

Creating a WordPress Website Oklahoma Conference of The UMC Department of Communications 1.

In The Name Of Allah, The Most Beneficent, The Most Merciful

CAR Training Module PRODUCT REGISTRATION and MANAGEMENT Module 2 - Register a New Document - Without Alternate Formats (Run as a PowerPoint show)

“Start-to-End” Simulations Imaging of Single Molecules at the European XFEL Igor Zagorodnov S2E Meeting DESY 10. February 2014.

©2007 First Wave Consulting, LLC A better way to do business. Period This is definitely NOT your father’s standard operating procedure.

Squares and Square Root WALK. Solve each problem REVIEW:

Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.

Mobility Tool Fremtidens afrapportering 2013 – Erasmus Mobilitet / IP 2014 – Erasmus+ aktioner.

© 2012 National Heart Foundation of Australia. Slide 2.

Helping Journals to Upgrade Data Publications for Reusable Research Sonia Barbosa (Project Manager) Eleni Castro (Project Coordinator) Institute for Quantitative.

Chapter 5 Test Review Sections 5-1 through 5-4.

SIMOCODE-DP Software.

GG Consulting, LLC I-SUITE. Source: TEA SHARS Frequently asked questions 2.

Welcome! A powerpoint guide to IOP’s Electronic Journals Contents journals.iop.org 2 journals.iop.org Journals list 3Journals list Journal homepages 4Journal.

Addition 1’s to 20.

25 seconds left…...

Services Course Windows Live SkyDrive Participant Guide.

1 Atlantic Annual Viewing Trends Adults 35-54, Total TV, By Daypart Average Minute Audience (000) Average Weekly Reach (%) Average Weekly Hours Viewed.

DIKLA GRUTMAN 2014 Databases- presentation and training.

We will resume in: 25 Minutes.

Metadata and presentation issues of Korean E-Resources relating to access and discovery ERMB Workshop presented by Erica Chang March 25, 2014 Philadelphia,

A SMALL TRUTH TO MAKE LIFE 100%

1 Unit 1 Kinematics Chapter 1 Day

1 PART 1 ILLUSTRATION OF DOCUMENTS  Brief introduction to the documents contained in the envelope  Detailed clarification of the documents content.

- 1 - Defense Security Service Background: During the Fall of 2012 Defense Security Service will be integrating ISFD with the Identity Management (IdM)

Multi Contents Registration Manual National Institute of Informatics

How Cells Obtain Energy from Food

Excel Lesson 17 Importing and Exporting Data Microsoft Office 2010 Advanced Cable / Morrison 1.

OJS + Dataverse: Integrated & Transparent Publishing Workflows

Towards an Integrated Transparent Journal Publishing Workflow

DATAVERSE FOR JOURNALS Mercè Crosas, Ph.D. Director of Data Science IQSS, Harvard Society for Scholarly Publishing 37 th Meeting,

Integrating Automated Data Deposit into the Journal Publishing Workflow Eleni Castro, Research Coordinator > IQSS Harvard Digital Publishing Collaborative,

Data Publishing with Dataverse Mercè Crosas, Ph.D. Director of Data Science Institute for Quantitative Social Science, Harvard University.

Data Citation in The Dataverse Network ® Micah Altman, Institute for Quantitative Social Science, Harvard University Prepared for the Board on Research.

Data Citation Implementation Pilot Workshop

Data Citation Dataverse Mercè Crosas Chief Data Science and Technology Officer, IQSS, Harvard Workshop: Data Citation.

An Overview of Data-PASS Shared Catalog

Dataverse for citing and sharing research data

Presentation transcript:

Update on Data Publishing With Dataverse Intro to Dataverse and Data Science Team Options (3) for Journals to publish data associated with their articles OJS integration project Rigorous Data Publishing Workflows with 4.0 (Versioning) 4.0 Deaccession only once published w/ a reason (cannot delete). Always a landing page with a DOI Publishing Privacy Sensitive Data with Dataverse Eleni Castro, Research Coordinator Institute for Quantitative Social Science (IQSS) Harvard University DataCite Annual 2014 Nancy, France August 25, 2014

Introduction to Dataverse Software framework for publishing, citing and preserving research data (open source on github for others to install) Provides incentives for researchers to share: Recognition & credit via data citations Control over data & branding Fulfill Data Management Plan requirements Harvard Dataverse (open to all; repository instance at Harvard) currently has: 761 Dataverses > 1 Million Downloads 54,828 Datasets EZID DOI (2013) 748,554 Files

Who’s Using Dataverse? Worldwide Dataverse Installations Institutions can setup & host their own Dataverse installation (e.g., Odum, OCUL, DANS, Fudan, etc) and within them can support datasets from a variety of users (across all research domains): Researchers, Projects, Departments, Journals, etc.

Journals Publishing Data w/ Dataverse Option A. Journals include Dataverse as a Recommended Repository Option B. Authors Contribute Directly to a Journal Dataverse Option B. Step 3: This can be done at the same time as the journal article is being reviewed. Option C. Seamless Integration btw Journal + Dataverse (e.g., OJS)

OJS-Dataverse Integration OJS Journal Journal Dataverse Citation to Data Citation to Article Details/Updates: 2 Year Project 2012-2014 Integrating w/ PKP’s Open Journal Systems (Data Deposit API). Pilot with ~ 50 journals + expanding outreach (100s) . OJS’ Dataverse plugin now available with latest OJS release. Future: Embed Dataverse widgets into journal article. http://projects.iq.harvard.edu/ojs-dvn

OJS Plugin: Journal Data Policies Boilerplate Templates Including Guidelines for: Authors (w/ data citation) Reviewers Boilerplate policies for authors to deposit and cite data in OJS Dataverse plugin. Includes boiler plate for Reviewers and copyeditors. Read full Data Policies / Guidelines Template: http://bit.ly/1xkLjoZ

OJS Plugin: Author Manuscript + Data Submission Joint Declaration of Data Citation principles #3 In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited Option to: (A) deposit into Dataverse AND/OR; (B) if data is already in a repository can include the data citation (w/ persistent URL/identifier).

OJS Plugin: Editor Reviews Article + Data Joint Declaration of Data Citation principles #3 In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited

Data Published in Dataverse w/ OJS Plugin In OJS: In Dataverse: 2 Options in OJS: 1) Dataset Published (with DOI) at Article Approval. 2) Dataset Published when Journal Issue is Released.

OJS Plugin: Article Published w/ Data Citation Now that the Data Citation is listed on the same page as the article it helps it go one step closer to data citation principle #1 Importance: Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications

Towards An Integrated Publishing Lifecycle See: Data Citation Principle #1 Importance This is really a reference implementation that we hope will inspire other repositories and journal management / publishing systems to use our open source code to expand and extend the possibilities of data publishing. Its one step in the right direction to bringing data closer to the same level of importance as the published article. Image Credit: Mercè Crosas

Publishing in 4.0 (Late Fall 2014)

Rigorous Data Publishing Workflows Upload Draft Dataset See Data Citation Principle #7 Specificity & Verifiability Published Dataset v1 Publish Version 1 Authors, Title, Year, DOI, Repository, V1 Published Dataset v1.1 Publish Version 1.1: small metadata change; citation doesn’t change. Compliant w/ Joint Declaration of Data Citation Principles and based on the paper by Altman and King, 2007 were pushing for machine readable information to be added to data citations (UNF, persistent identifier). Principle #7 Specificity & Verifiability Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verfiying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited Publish Version 2: File change (automatic); big metadata change (e.g., author, title). Published Dataset v2 Authors, Title, Year, DOI, Repository, UNF, V2 See: Altman, M., & King, G. (2007) doi:10.1045/march2007-altman

Dataset Versioning (1) If you don’t add or change any files and havent changed any default metadata fields (eg, author, title) you will get the option to publish the data citation as a minor or major version

Dataset Versioning (2) If you add or change files the version will go up a major version (ex. V1 to V2)

Dataset Versioning (3) Ex. Added files to a Dataset so it bumped up to a major version change. If you add or change files the version will go up a major version (ex. V1 to V2). And here is a screenshot of how you would see the version differences when changing/adding files.

Dataset Versioning (4) Ex. Added small metadata change to a Dataset so it bumped up to a minor version change. If you add or change files the version will go up a major version (ex. V1 to V2)

Deaccession Data in 4.0 Before a Dataset is published the DOI is private (reserved). Only when published is it made public & searchable. In accordance w/ Data Citation Principle #6 Persistence: A Published Dataset cannot be deleted; only deaccessioned, with a reason. You can Deaccession (in 4.0): a version(s) of a Dataset, or an entire Dataset. And now that we support more granular levels of versioning we are also working on improving dataset deaccession. Prior to 4.0 you could delete a published a dataset which goes completely against data citation best practices of Persistence (#6)

Deaccession Workflow (Step 1) Ex. This file was added in v2 and has identifiable information. To try out deaccessioning, go to a dataset you’ve already published (or add a new one and publish it), click on Edit Dataset, then Deaccession Dataset. If you have multiple versions of a dataset, you can select here which versions you want to deaccession or choose to deaccession the entire dataset.an entire dataset.

Deaccession Workflow (Step 2) You are then presented with the option to deaccession the entire dataset or just certain version(s) with a drop down list of "Reasons for deaccession". Are there any important reasons to deaccession that we might be missing here?

Deaccession Workflow (Step 3) Screenshot 3: Once you select a reason you can enter additional information. This is particularly important if you select Other. If the Dataset has moved then a URL may also be entered (we validate URLs).

Deaccession Workflow (Step 4) Deaccession Landing Page Data Citation Principle #6 Persistence A persistent landing page, that includes very limited metadata (see below), will always be accessible to the public if they use the DOI provided in the citation for that dataset. In accordance with Data Citation Principle #6 Persistence: Unique identifiers, and metadata describing the data, and its disposition, should persist -- even beyond the lifespan of the data they describe

Data Publishing After 4.0 (2015) Publishing Privacy Sensitive Data Secure Dataverse DataTags (demo) (based on Privacy Laws and DUAs) In the current version of Dataverse, identifiable data cannot be deposited safely. Based on the data publication need for sufficient information to understand and reuse the data (metadata, documentation, code) after 4.0 users will be able to safely upload datasets that have identifiable information, while a minimal risk version of the data is automatically rendered (synthetic data, differential privacy, statistical disclosure control (SDC), etc) and can be made available for the public. Full interview Integration with ORCID (API): create ORCID account, connect all Dataverse datasets to ORCID account. (Note: 4.0 will already allow for authors to enter ID.)

Thank you! Contact: ecastro@fas.harvard.edu More information: http://datascience.iq.harvard.edu Twitter: @thedataorg