Active, actionable DMPs


1 Active, actionable DMPs
Sarah Jones | Digital Curation Centre
Stephanie Simms | California Digital Library
Daniel Mietchen | National Institutes of Health
Tomasz Miksa | TU Wien
Thanks to Jamie for the invite to speak here today. I’m going to reflect on some work that we’ve been doing at the DCC in collaboration with our partners at UC3, as well as Daniel Mietchen from NIH and Tomasz Miksa from TU Wien. Over the last year, DCC and UC3 have converged on a single codebase (DMPRoadmap) to operate both of our tools, DMPonline and DMPTool. These are the two main DMP platforms worldwide and have thousands of users, so they provide a good basis for implementing and road-testing active DMPs. We’ve all been thinking about how to make the most of DMPs, so that information is shared between systems and the experience of producing a DMP becomes much more valuable and returns benefits to all involved. #ActiveDMPs

2 By Josiah Martin public domain
So I want to begin by thinking about the current system and how it is flawed. Most researchers see the DMP as yet another hurdle to overcome, an obstacle in the race to get grant funding. In the UK we have a very compliance-heavy landscape, and it’s not always clear to what extent DMPs are being monitored, which encourages researchers to just pay lip service to the requirements.

3 Often done at grant stage and not looked at again
Planning & administration | Create, analyse, manage data | Publishing & reuse
DMP on periphery: often done at grant stage and not looked at again; opportunities to (re)use information being missed; disconnected & unlinked.
When we think about how DMPs fit into the overall research lifecycle and the various systems and processes being followed, all too often they’re on the periphery, something that is done at application stage and then not looked at again. Although funders and others talk about DMPs being actively updated, there’s little to encourage researchers to actually engage with this.
Fortunately there is a lot of interest in changing and improving things. Funders are aware that they gather a lot of valuable information in DMPs and are missing opportunities to use it. It could help with data discovery, for example, to promote reuse of the published outputs. Some funders (Wellcome / NSF) are considering pilots in this area.
Another key issue at the moment is that DMPs are disconnected from other systems. A lot more information could be exchanged and reused.

4 Planning & administration | Create, analyse, manage data | Publishing & reuse
Where we hope to get to eventually is to connect up DMPs with the wide range of systems researchers are using in the course of their work, so they act as more of a hub. This includes:
Research Information Management systems like Pure or Symplectic Elements, so administrative data from research offices can be pushed into DMPs and vice versa.
Funder application systems like Je-S and the EC’s Participant Portal. While it’s very difficult to integrate with these, APIs could hopefully exchange some information and make the process more streamlined for researchers.
Lab notebook systems like Jupyter.
Storage and data management platforms like Dataverse and OSF.
Journals like RIO and BMC Research Notes for publishing DMPs.
Repositories (e.g. Zenodo and figshare) to deposit DMPs alongside datasets and other outputs.
Identifiers also play a key role: to track assertions about people, organisations, grants etc., to trigger notifications and to automate reporting activities. There are many different identifier systems that could be leveraged in DMPs, e.g. FundRef, RRIDs, DOIs, ORCIDs… We also need to think about how we should exchange the information across systems, e.g. with APIs and by defining an open format, protocol or standard.

5 Building on these thoughts, I wanted to share a couple of slides that Tomasz Miksa presented at IDCC a few weeks ago. He put forward a vision for DMPs, where all the information you need is in one place and it interacts with various tools, standards and systems.

6 Tomasz has started using the DCC Checklist for a DMP to map the sections of a DMP to existing tools and standards.

7 Kevin Dooley/ Flickr CC BY
If you’re doing work on machine-actionable DMPs or think you have something to bring, we encourage you to join up and collaborate with us. We’re engaging in a number of international fora e.g. RDA and FORCE11 to ensure there’s consensus on the way forward. We already see a lot of benefits from pooling our knowledge and bringing together people from different backgrounds and countries, so do join in and help us solve this together.

8 Utopia workshop
47 participants from 16 countries: funders, developers, librarians, service providers, researchers.
Aims: understand research workflows; develop use cases for maDMPs (machine-actionable DMPs); set priorities for future work.
With those thoughts of collaboration in mind, we recently held a workshop at IDCC to gather thoughts on where we should be heading. We dubbed this our utopia workshop. We didn’t want people to be constrained by what’s feasible now or in the short term. We really wanted people to consider what would be optimum for all stakeholders involved. We were hugely oversubscribed, so couldn’t accommodate everyone at the workshop. We ended up with 47 people in total from 16 different countries, so it was a very international mix. We also had representation from various stakeholder groups… We asked people to map out their activities and research workflows to see where things connect, to develop specific use cases and to set priorities for future work.

9 So everyone got very hands-on
You can see here one of the examples of visualising workflows and connections.

10 Use cases and prioritisation
Interoperability with research systems | Institutional perspective | Repository use cases | Evaluation & monitoring | Utilising PIDs
And some of the use cases and prioritisation:
Interoperability across research systems was the most popular topic, with a few groups discussing connections between different platforms.
A number of groups addressed institutional use cases, considering how universities could support researchers with DMPs.
The repository use cases again considered integrations. You can see in the photo the desired two-way exchange of information (DMPs alerting repositories to data, and repositories pushing information on datasets and DOIs back into DMPs).
Evaluation and monitoring was a topic of interest, particularly among funders.
And the PID group also generated a lot of tangible starting points.

11 maDMP priority areas
Common standards and protocols | Funder integration | Share/publish/deposit DMPs | Utilise PIDs for automatic reporting | Capacity planning (institutional & data centre) | Automated compliance checks
We’re writing up full notes from the workshop, which we expect to release to participants in the next week and then publish in RIO Journal as a white paper. To give you a sneak preview, a number of priority areas came out across the groups:
All stakeholders expressed a need for common standards as a foundation to enable information to flow between plans and systems. It’s a top priority to define a minimum data model with a core set of elements for DMPs. It could potentially be based on an existing template structure or the DMPRoadmap themes.
Given that funders often drive demand for DMPs, there’s a lot of demand to integrate with their systems on some level. Basic interoperation to support grant submission, monitoring and reporting would help.
An increasing number of researchers are publishing DMPs and we want to support open practices. Connections with journals or repositories would help here.
PIDs could be used in several ways to pull in relevant data and also help automate reporting.
Harvesting information on data volumes was critical in an institutional context and for domain data centres, so they can do capacity planning exercises and ensure support is costed in.
Some automation of compliance checks is also desired, e.g. checking that data has been deposited in the named repository.

12 The problem of freetext
So I now want to tease into some of the issues in more detail and give you a sense of some work that is underway, other things that are planned, and activities we hope to do. One of the primary issues we face in terms of machine-actionability is the format that DMPs are in. They are typically free-form text documents, because funders and others tend to ask very broad questions in their templates, even when a limited range of dropdown options could be provided (e.g. metadata standards, file formats, data volumes, repositories, licences…).

13 Plugins to give structured response
One thing we plan to do is draw in external resources such as the RDA Metadata Standards Directory and BioSharing. This will have a number of benefits: it will help direct researchers to relevant options, as they’re not always aware of good practice, and it will mean that more structured responses are provided for future action. There are a number of databases, e.g. re3data, or wizards like the EUDAT licensing tool, that could be used in this way.

14 Define a minimum data model
DMPRoadmap themes | Mappings | Common format?
One of the key priorities coming out of the workshop is to define a minimum data model for DMPs.
The DCC did some work a few years back to analyse requirements and good practice to derive a basic checklist for a DMP and common themes. We’ve since revised these themes in collaboration with UC3 to test that they meet the US and UK landscapes. Currently the themes are used to associate questions and guidance within the tool, but we want to explore the potential for basic tagging (e.g. for text mining sections of DMPs).
We are also working with repositories to test mapping themes to other vocabularies, e.g. with the DDI working group for social science repositories.
These themes may also form the basis of a common format.
From Flickr by Steve Johnson, CC BY 2.0
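As a rough illustration of what a minimum data model might look like, here is a small Python sketch. The field names are hypothetical and chosen only to echo the structured items mentioned in this talk (formats, volumes, repositories, licences, identifiers); they are not the DMPRoadmap themes or any agreed standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Dataset:
    # Hypothetical core elements for one planned dataset.
    title: str
    file_format: str
    expected_volume_gb: float
    repository: str
    licence: str
    metadata_standard: str = ""


@dataclass
class MinimalDMP:
    # Hypothetical plan-level elements; identifiers are kept as plain strings.
    title: str
    contact_orcid: str
    funder_id: str
    grant_id: str
    datasets: List[Dataset] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the plan to JSON so another system could read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "MinimalDMP":
        """Rebuild a plan from the JSON produced by to_json."""
        raw = json.loads(text)
        datasets = [Dataset(**d) for d in raw.pop("datasets", [])]
        return cls(datasets=datasets, **raw)


def total_volume_gb(dmp: MinimalDMP) -> float:
    """Sum expected data volumes, e.g. as an input to capacity planning."""
    return sum(d.expected_volume_gb for d in dmp.datasets)
```

Even a model this small would let an institution total up expected data volumes across plans, which free-text answers do not allow.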

15 New text editor Substance Forms: http://substance.io
We are introducing a new text editor into the DMPRoadmap platform (the codebase from which we will operate DMPonline and DMPTool). Substance Forms is a JavaScript library which will provide a simpler, cleaner way to write plans and annotate text with comments. In future, we hope the editor can also help us mine the text for expected terms, e.g. repository names.

16 Repository use cases
I. Repository recommender service via re3data.org
II. Text mine to ping repositories when mentioned in a DMP
III. Use DMP as metadata to facilitate deposit process
IV. Deposit DMPs with data
To move on to some of the specific use cases, there are a number related to repositories…
Ainsley Seago. CC BY 4.0
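Use case II (text mining a DMP for repository mentions) could be sketched along the following lines. This is a minimal illustration, not how DMPRoadmap does or will do it: the repository list is a hand-picked sample of names mentioned in this talk, whereas a real service would presumably draw on a registry such as re3data, and the function name is hypothetical.

```python
import re

# Illustrative sample only; a real list would come from a registry.
KNOWN_REPOSITORIES = ["Zenodo", "figshare", "Dataverse", "OSF"]


def find_repository_mentions(dmp_text, known=KNOWN_REPOSITORIES):
    """Return the known repository names that appear in free-text DMP content.

    Matching is case-insensitive and on whole words, so "zenodo" matches
    "Zenodo" but a name embedded inside a longer word does not match.
    """
    found = []
    for name in known:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, dmp_text, flags=re.IGNORECASE):
            found.append(name)
    return found
```

Each match could then trigger a notification to the named repository, which is the "ping" described on the slide.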

17 Institutional use cases
Connect researchers to relevant services & support | Gather information to forecast demand and do capacity planning | Embed DMP in research process (domain workflows, ethics, admin systems)
Institutions were keen to use the DMP process to connect researchers up with relevant services and support. They see the DMP as a good awareness-raising and training tool.
Capacity planning is also a key use case for universities, specifically to ensure costs are written into proposals.
And there’s a desire to link up across the various systems in use, as noted in the introductory slides. Some institutions, like Purdue University, are keen to be an institutional pilot and map out all the information flows across different research systems, so we can have a picture of a whole university as an example.
William Murphy. CC BY-SA 4.0

18 Persistent identifiers (PIDs)
Assign DOIs to releases of DMP versions.
Leverage other PIDs to populate DMP over time: researcher IDs (ORCIDs); funder IDs (FundRef); grant IDs; Research Resource IDs (RRIDs) for antibodies, organisms, cell lines.
Also enables compliance monitoring.
Add IDs to the DMP, forward recognition = living, updatable document.
Make the point that there are many kinds of IDs and that we can leverage them in various actionable ways. This can be about assigning DOIs to DMPs and/or using DOIs in DMPs. The goal is to enable tracking of how commitments made to funders are followed through into actions to publish or deposit outputs. Also note Stephanie’s presentation from PIDapalooza.
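To make the idea of leveraging PIDs a little more concrete, here is a hedged Python sketch that pulls ORCID iDs and DOIs out of free-text plan content. The function names are hypothetical. The ORCID check digit follows the ISO 7064 MOD 11-2 scheme that ORCID documents; the DOI pattern is a common heuristic, not a full parser, and the DOI in the usage example is a made-up placeholder.

```python
import re

# 16 characters in four hyphenated groups; the last may be the check character X.
ORCID_RE = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{3}[\dX]\b")
# Heuristic only: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def orcid_checksum_ok(orcid):
    """Validate the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected


def extract_pids(text):
    """Return checksum-valid ORCID iDs and candidate DOIs found in the text."""
    orcids = [m for m in ORCID_RE.findall(text) if orcid_checksum_ok(m)]
    # Trailing sentence punctuation is not part of the DOI.
    dois = [d.rstrip(".,;)") for d in DOI_RE.findall(text)]
    return {"orcid": orcids, "doi": dois}


# Usage with ORCID's documented example iD and a placeholder DOI.
sample = "Contact: 0000-0002-1825-0097. Data: https://doi.org/10.1234/example.dmp.v1."
pids = extract_pids(sample)
```

Identifiers recovered this way could be used to look up related records and populate the plan over time, as the slide suggests.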

19 ORCID integration
We have done an ORCID integration so users can connect up their DMP profile. We hope to do more with this, e.g. login with ORCID and using it to bring in other relevant information.

20 Utilising EC grant IDs in plans
Harvest grant IDs from OpenAIRE API | Provide look up when entering project details | Enables join up of DMP with other outputs
Grant IDs can also be pulled into DMPs. For the EC, for example, we could use the OpenAIRE API to harvest these and provide a look-up when researchers are creating their plan at month 6. This also allows us to connect up the DMP with other outputs to assist in reporting.
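The look-up step could work roughly as in the sketch below. It assumes the grant records have already been harvested into a local list; the harvesting call itself is deliberately left out, and the record shape (`grant_id`, `acronym`, `title`) and the sample values are hypothetical placeholders rather than real OpenAIRE output.

```python
def lookup_grants(query, harvested):
    """Return harvested grant records whose ID, acronym or title contains the query.

    `harvested` is a list of dicts with the hypothetical keys
    "grant_id", "acronym" and "title". Matching is case-insensitive.
    """
    q = query.strip().lower()
    if not q:
        return []
    return [
        g for g in harvested
        if q in g["grant_id"].lower()
        or q in g["acronym"].lower()
        or q in g["title"].lower()
    ]


# Placeholder records standing in for a harvested list.
HARVESTED = [
    {"grant_id": "000001", "acronym": "EXAMPLE", "title": "An example project"},
    {"grant_id": "000002", "acronym": "SAMPLE", "title": "A sample project on data reuse"},
]
```

A researcher typing part of an acronym or grant number into the project details form would then be offered the matching records, and the selected grant ID could be stored in the plan.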

21 Evaluation & monitoring
Automated compliance checks: did researchers do what they said they would?
Quality or validation checks: closed questions / range of defined options; training and evaluation rubrics; evaluate FAIRness of data and repository…
In terms of evaluation and monitoring, funders are particularly interested in automated compliance checks, and also support for plan review. More closed questions or predefined options would help in assessment, as well as basic tools like evaluation rubrics and training to support project officers. The other aspect mentioned in terms of H2020 was an evaluation of the FAIRness of data and repositories, which is a nice handover to Ingrid, who’s up next…
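A very simple form of automated compliance check ("did researchers do what they said they would?") might look like the sketch below. It assumes the repository for each dataset is available as structured data from the plan and that deposit records can be obtained from repositories; both data shapes and the function name are hypothetical.

```python
def check_deposit_compliance(planned, deposits):
    """Compare repositories named in a DMP with where data was actually deposited.

    planned:  dict mapping dataset title -> repository named in the plan.
    deposits: list of dicts with hypothetical keys "title" and "repository".
    Returns a dict mapping each planned dataset title to one of
    "compliant", "deposited elsewhere" or "missing".
    """
    deposited = {}
    for record in deposits:
        deposited.setdefault(record["title"], set()).add(record["repository"].lower())

    report = {}
    for title, repository in planned.items():
        actual = deposited.get(title)
        if not actual:
            report[title] = "missing"
        elif repository.lower() in actual:
            report[title] = "compliant"
        else:
            report[title] = "deposited elsewhere"
    return report
```

Matching datasets by title is fragile; in practice a persistent identifier such as a DOI would be a more reliable key, which is one reason the PID work matters for monitoring.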

22 Next steps: maDMP pilots
Our next steps are to develop some of the use cases and test these out in the DMPRoadmap platform. Everyone from the workshop wanted to be a pilot user, so we have lots of volunteers. The only limiting factor will be our bandwidth to run them all!
From Flickr by Allen, CC BY 2.0

23 Summary
Think of DMPs as key elements of a networked data management ecosystem: connected via a shared vocabulary; actionable by humans and software; versioned; public.
That is the summary of the overall vision. A quick note about promoting greater openness and public sharing of DMPs: the DMPTool Public DMPs list; the DMP Collection in RIO Journal; depositing DMPs in Zenodo, Dataverse and other IRs.
From Flickr by highwaysengland, CC BY 2.0

24 Join us for more!
Thurs 6th April, Plenary in BCN, Active DMP IG session
We also plan to continue this work at the Barcelona plenary session in April. Join us there!

