Active, actionable DMPs

Presentation transcript:

Active, actionable DMPs
Sarah Jones | Digital Curation Centre | sarah.jones@glasgow.ac.uk
Stephanie Simms | California Digital Library | stephanie.simms@ucop.edu
Daniel Mietchen | National Institutes of Health | daniel.mietchen@nih.gov
Tomasz Miksa | TU Wien | tmiksa@sba-research.org
#ActiveDMPs
Thanks to Jamie for the invite to speak here today. I’m going to reflect on some work that we’ve been doing at DCC in collaboration with our partners at UC3, as well as Daniel Mietchen from NIH and Tomasz Miksa from TU Wien. DCC and UC3 have joined up over the last year to converge on a single codebase (DMPRoadmap) to operate both of our tools, DMPonline and DMPTool. These are the two main DMP platforms worldwide and have thousands of users, so they provide a good basis for implementing and road testing active DMPs. We’ve all been thinking about how we can make the most of DMPs so that information is shared between systems, and the experience of producing DMPs becomes much more valuable and returns benefits to all involved.

By Josiah Martin, public domain
So I want to begin by thinking about the current system and how it is flawed. Most researchers see the DMP as yet another hurdle to overcome, an obstacle in the race to get grant funding. In the UK we have a very compliance-heavy landscape, and it’s not always clear to what extent DMPs are being monitored, which encourages researchers to just pay lip service to the requirements.

DMP on periphery
Planning & administration | Create, analyse, manage data | Publishing & reuse
Often done at grant stage and not looked at again
Opportunities to (re)use information being missed
Disconnected & unlinked
When we think about how DMPs fit into the overall research lifecycle and the various systems and processes being followed, all too often they’re on the periphery, something that is done at application stage and then not looked at again. Although funders and others talk about DMPs being actively updated, there’s little to encourage researchers to actually engage with this.
Fortunately there is a lot of interest in changing and improving things. Funders are aware that they gather a lot of valuable information in DMPs and are missing opportunities to use this. It could help with data discovery, for example, to promote reuse of the published outputs. Some funders (Wellcome / NSF) are considering pilots in this area.
Another key issue at the moment is that DMPs are disconnected from other systems. A lot more information could be exchanged and reused.

Planning & administration | Create, analyse, manage data | Publishing & reuse | Identifiers
Where we hope to get to eventually is to connect up DMPs with the wide range of systems researchers are using in the course of their work, so they act as more of a hub. This includes:
- Research Information Management systems like Pure or Symplectic Elements, so administrative data from research offices can be pushed into DMPs and vice versa
- Funder application systems like Je-S and the EC’s Participant Portal. While it’s very difficult to integrate with these, APIs could hopefully exchange some information and make the process more streamlined for researchers
- Lab notebook systems like Jupyter
- Storage and data management platforms like Dataverse and OSF
- Journals like RIO and BMC Research Notes for publishing DMPs
- Repositories (e.g. Zenodo and figshare) to deposit DMPs alongside datasets and other outputs
Identifiers also play a key role to track assertions about people, organisations, grants etc., to trigger notifications and to automate reporting activities. There are many different identifier systems that could be leveraged in DMPs, e.g. FundRef, RRIDs, DOIs, ORCIDs… We also need to think about how we should exchange the information across systems, e.g. with APIs, and by defining an open format, protocol or standard.

Building on these thoughts, I wanted to share a couple of slides that Tomasz Miksa presented at IDCC a few weeks ago. He put forward a vision for DMPs, where all the information you need is in one place and it interacts with various tools, standards and systems.

Tomasz has started using the DCC Checklist for a DMP to map the sections of a DMP to existing tools and standards.

Kevin Dooley / Flickr, CC BY
If you’re doing work on machine-actionable DMPs or think you have something to bring, we encourage you to join up and collaborate with us. We’re engaging in a number of international fora, e.g. RDA and FORCE11, to ensure there’s consensus on the way forward. We already see a lot of benefits from pooling our knowledge and bringing together people from different backgrounds and countries, so do join in and help us solve this together.

Utopia workshop
47 participants from 16 countries
Funders, developers, librarians, service providers, researchers
Understand research workflows
Develop use cases for maDMPs
Set priorities for future work
www.dcc.ac.uk/events/workshops/postcard-future-tools-and-services-perfect-dmp-world
With those thoughts of collaboration in mind, we recently held a workshop at IDCC to gather thoughts on where we should be heading. We dubbed this our utopia workshop. We didn’t want people to be constrained by what’s feasible now or in the short term. We really wanted people to consider what would be optimum for all stakeholders involved. We were hugely oversubscribed so couldn’t accommodate everyone at the workshop. We ended up with 47 people in total from 16 different countries, so it was a very international mix. We also had representation from various stakeholder groups… We asked people to map out their activities and research workflows to see where things connect, to develop specific use cases and to set priorities for future work.

So everyone got very hands-on. You can see here one of the examples of visualising workflows and connections.

Use cases and prioritisation
Interoperability with research systems
Institutional perspective
Repository use cases
Evaluation & monitoring
Utilising PIDs
And some of the use cases and prioritisation:
- Interoperability across research systems was the most popular topic, with a few groups discussing connections between different platforms
- A number addressed institutional use cases, considering how universities could support researchers with DMPs
- The repository use cases again considered integrations. You can see in the photo the desired two-way exchange of info (DMPs alerting repos to data, and repos pushing info on datasets and DOIs back into DMPs)
- Evaluation and monitoring was a topic of interest, particularly among funders
- And the PID group also generated a lot of tangible starting points

maDMP priority areas
Common standards and protocols
Funder integration
Share/publish/deposit DMPs
Utilise PIDs for automatic reporting
Capacity planning (institutional & data centre)
Automated compliance checks
We’re writing up full notes from the workshop which we expect to release to participants in the next week and then publish in RIO Journal as a white paper. To give you a sneak preview, a number of priority areas came out across the groups:
- All stakeholders expressed a need for common standards as a foundation to enable information to flow between plans and systems. It’s a top priority to define a minimum data model with a core set of elements for DMPs. It could potentially be based on an existing template structure or DMPRoadmap themes
- Given that funders often drive demand for DMPs, there’s a lot of demand to integrate with their systems on some level. Basic interoperation to support grant submission, monitoring and reporting would help
- An increasing number of researchers are publishing DMPs and we want to support open practices. Connections with journals or repositories would help here
- PIDs could be used in several ways to pull in relevant data and also help automate reporting
- Harvesting information on data volumes was critical in an institutional context and for domain data centres, so they can do capacity planning exercises and ensure support is costed in
- Some automation of compliance checks is also desired, e.g. checking that data has been deposited in the named repository

The problem of freetext
Templates typically ask very broad questions, even when dropdown options are feasible (e.g. metadata standards, file formats, data volumes, repositories, licences…)
So I now want to tease out some of the issues in more detail and give you a sense of some work that is underway, other things that are planned, and activities we hope to do. One of the primary issues we face in terms of machine-actionability is the format that DMPs are in. They are typically free-form text documents, as funders and others tend to ask very broad questions, even when a limited range of options could be provided.

Plugins to give structured response
http://rd-alliance.github.io/metadata-directory
One thing we plan to do is draw in external resources such as the RDA Metadata Standards Directory and BioSharing. This will have a number of benefits: it will help direct researchers to relevant options, as they’re not always aware of good practice, and it will mean that some more structured responses are provided for future action. There are a number of databases, e.g. re3data, or wizards like the EUDAT licensing tool that could be used in this way.
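
To make the idea of a structured response concrete, here is a minimal sketch of how a free-text answer might be matched against a controlled list. It is an illustration only, not part of DMPRoadmap: the short vocabulary and the function name `suggest_standard` are assumptions, and a real plugin would draw its entries from an external registry rather than a hard-coded list.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary for illustration; a real plugin would
# load entries from an external registry instead of hard-coding them.
METADATA_STANDARDS = ["Dublin Core", "DDI", "Darwin Core", "DataCite Metadata Schema"]


def suggest_standard(free_text, vocabulary=METADATA_STANDARDS, cutoff=0.6):
    """Return the vocabulary entry closest to a free-text answer, or None."""
    # Compare case-insensitively but return the canonical spelling.
    lowered = {term.lower(): term for term in vocabulary}
    matches = get_close_matches(
        free_text.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None
```

A researcher typing "Darwin-Core" would be offered the canonical "Darwin Core", while an answer with no close match returns None and can stay as free text.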

Define a minimum data model
DMPRoadmap themes
Mappings
Common format?
From Flickr by Steve Johnson, CC BY 2.0
One of the key priorities coming out of the workshop is to define a minimum data model for DMPs. The DCC did some work a few years back to analyse requirements and good practice to derive a basic checklist for a DMP and common themes. We’ve since revised these themes in collaboration with UC3 to test that they meet the US and UK landscapes. Currently the themes are used to associate questions and guidance within the tool, but we want to explore the potential for basic tagging (e.g. for text mining sections of DMPs). We are also working with repositories to test mapping themes to other vocabularies, e.g. with the DDI working group for social science repos. These themes may also form the basis of a common format.
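
As a purely hypothetical sketch of what a minimum data model could look like once serialised for exchange between systems, the example below defines a handful of core elements and round-trips them through JSON. None of the class or field names come from DMPRoadmap or from any agreed standard; they are assumptions chosen to mirror the kinds of information mentioned in this talk (identifiers, formats, volumes, repositories, licences).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    # All field names are illustrative assumptions, not an agreed schema.
    title: str
    file_format: Optional[str] = None
    estimated_volume_gb: Optional[float] = None
    repository: Optional[str] = None
    licence: Optional[str] = None


@dataclass
class MinimalDMP:
    title: str
    researcher_orcid: Optional[str] = None
    funder_id: Optional[str] = None
    grant_id: Optional[str] = None
    datasets: List[Dataset] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the plan so another system could consume it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "MinimalDMP":
        """Rebuild a plan from the JSON produced by to_json()."""
        raw = json.loads(text)
        datasets = [Dataset(**item) for item in raw.pop("datasets", [])]
        return cls(datasets=datasets, **raw)
```

The point of the sketch is only that a small, typed core makes plans exchangeable; which elements belong in that core is exactly what still needs to be agreed.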

New text editor
Substance Forms: http://substance.io
We are introducing a new text editor into the DMPRoadmap platform (the codebase from which we will operate DMPonline and DMPTool). Substance Forms is a JavaScript library which will provide a simpler, cleaner way to write plans and annotate text with comments. In future, we hope the editor can also help us mine the text for expected terms, e.g. repository names.

Repository use cases
I. Repository recommender service via re3data.org
II. Text mine to ping repositories when mentioned in a DMP
III. Use DMP as metadata to facilitate deposit process
IV. Deposit DMPs with data
Ainsley Seago. CC BY 4.0
To move on to some of the specific use cases, there are a number related to repos…
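
Use case II could start from something as simple as the sketch below, which scans the text of a plan for known repository names. This is an assumption about one possible approach, not a description of planned DMPRoadmap functionality: the name list is a hard-coded stand-in for a registry lookup, and `find_repository_mentions` is a hypothetical helper.

```python
import re

# Hypothetical list of names; in practice this might come from a registry
# such as re3data.org rather than being hard-coded.
KNOWN_REPOSITORIES = ["Zenodo", "figshare", "Dataverse", "OSF"]


def find_repository_mentions(dmp_text, names=KNOWN_REPOSITORIES):
    """Return the known repository names mentioned in a plan's free text."""
    found = []
    for name in names:
        # Whole-word, case-insensitive match so "OSF" does not match inside
        # a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, dmp_text, flags=re.IGNORECASE):
            found.append(name)
    return found
```

Each name found could then trigger a notification to the repository concerned, which is the "ping" described on the slide.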

Institutional use cases
Connect researchers to relevant services & support
Gather information to forecast demand and do capacity planning
Embed DMP in research process (domain workflows, ethics, admin systems)
William Murphy. CC BY-SA 4.0
Institutions were keen to use the DMP process to connect researchers up with relevant services and support. They see the DMP as a good awareness-raising and training tool. Capacity planning is also a key use case for universities, specifically to ensure costs are written into proposals. And there’s a desire to link up across the various systems in use, as noted in the introductory slides. Some institutions like Purdue University are keen to be an institutional pilot and map out all the information flows across different research systems, so we can have a picture of a whole university as an example.

Persistent identifiers (PIDs)
Assign DOIs to releases of DMP versions
Leverage other PIDs to populate DMP over time:
- Researcher IDs (ORCIDs)
- Funder IDs (FundRef)
- Grant IDs
- Research Resource IDs (RRIDs): antibodies, organisms, cell lines
Also enables compliance monitoring
Add IDs to the DMP; forward recognition = living, updatable document. Make the point that there are many kinds of IDs and that we can leverage them in various actionable ways. This can be about assigning DOIs to DMPs and/or using DOIs in DMPs. The goal is to enable tracking of how commitments made to funders are followed through into actions to publish/deposit outputs. Also note Stephanie’s presentation from PIDapalooza: http://pidapalooza.org
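
One small, practical consequence of using PIDs is that they can be validated before they are stored in a plan. As an illustration, the sketch below checks the final character of an ORCID iD using the ISO 7064 MOD 11-2 checksum that ORCID documents for its identifiers. The function names are assumptions for this example, and the check only confirms the iD is well formed, not that it is registered.

```python
import re


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed ORCID iD with a valid checksum."""
    compact = orcid.replace("-", "").upper()
    # 15 digits followed by a digit or X as the check character.
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```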

ORCID integration
We have done an ORCID integration so users can connect up their DMP profile. We hope to do more with this, e.g. login with ORCID and use it to bring in other relevant info.

Utilising EC grant IDs in plans
Harvest grant IDs from OpenAIRE API
Provide look up when entering project details
Enables join up of DMP with other outputs
Grant IDs can also be pulled into DMPs. For the EC, for example, we could use the OpenAIRE API to harvest these and provide a look up when researchers are creating their plan at month 6. This also allows us to connect up the DMP with other outputs to assist in reporting.

Evaluation & monitoring
Automated compliance checks: did researchers do what they said they would?
Quality or validation checks
Closed questions / range of defined options
Training and evaluation rubrics
Evaluate FAIRness of data and repository…
In terms of evaluation and monitoring, funders are particularly interested in automated compliance checks, and also support for plan review. More closed questions or predefined options would help in assessment, as well as basic tools like evaluation rubrics and training to support project officers. The other aspect mentioned in terms of H2020 was an evaluation of the FAIRness of data and repositories, which is a nice handover to Ingrid who’s up next…
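
A very rough sketch of the "did researchers do what they said they would?" check follows. It assumes, hypothetically, that a plan can be reduced to a mapping of dataset titles to named repositories and that deposit records with a title and repository can be harvested from somewhere; neither structure nor the function name `check_deposits` comes from an existing tool.

```python
def check_deposits(planned, deposits):
    """Compare repositories named in a plan with actual deposit records.

    planned:  dict mapping dataset title -> repository named in the DMP
    deposits: list of dicts with "title" and "repository" keys
    Both structures are assumptions made for this illustration.
    """
    by_title = {record["title"]: record for record in deposits}
    report = {}
    for title, repository in planned.items():
        record = by_title.get(title)
        if record is None:
            report[title] = "missing"
        elif record["repository"].lower() == repository.lower():
            report[title] = "compliant"
        else:
            report[title] = "deposited elsewhere"
    return report
```

In practice matching on titles would be fragile; matching on DOIs or other PIDs recorded in the plan would be more reliable, which is one more argument for the identifier work described earlier.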

Next steps: maDMP pilots
From Flickr by Allen, CC BY 2.0
Our next steps are to develop some of the use cases and test these out in the roadmap platform. Everyone from the workshop wanted to be a pilot user, so we have lots of volunteers. The only limiting factor will be our bandwidth to run them all!

Summary
Think of DMPs as key elements of a networked data management ecosystem:
- connected via a shared vocabulary
- actionable by humans and software
- versioned
- public
From Flickr by highwaysengland, CC BY 2.0
Summary of the overall vision. Quick note about promoting greater openness and public sharing of DMPs:
- DMPTool Public DMPs list
- DMP Collection in RIO Journal
- Depositing DMPs in Zenodo, Dataverse and other IRs

Join us for more!
Thurs 6th April, 9:30-11:00, @RDA Plenary in BCN, Active DMP IG session
We also plan to continue this work at the Barcelona plenary session in April, so join us there!