Using the DMPTool for data management plans Kathleen Fear February 27, 2014
A little about me… Physics to social science – Research of my own in physics, library and information science – Embedded data management research with public health, biomedical research, botany, proteomics, archaeology, social sciences …I just really like data.
What is a DMP? A formal plan outlining how you will handle your data throughout and after your project… …which is now required by many funders… …and which is a good idea anyhow, even if it’s not required.
Goals of this session Learn how the DMPTool can help you generate a DMP Learn the basic components of a DMP Understand how good data management practices translate to a good DMP
dmptool.org
Don’t know your NetID? Look it up at rvice/WhatsMyNetID.jsp
Data Products Describe the kind of data you’re collecting or using, whether it’s digital… …or physical. (or all of the above)
Data Products: What to specify What are your data products, both primary and derived? When will you collect / produce each data product? How much data will you generate?
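For the "how much data" question, a back-of-the-envelope calculation is usually enough. The sketch below is only an illustration: the function names, the `derived_ratio` parameter, and the example numbers are assumptions, not part of the DMPTool or any funder's guidance.

```python
def estimate_total_bytes(files_per_day: int, bytes_per_file: int,
                         collection_days: int, derived_ratio: float = 0.0) -> int:
    """Rough estimate: primary data plus derived data as a fraction of primary."""
    primary = files_per_day * bytes_per_file * collection_days
    derived = int(primary * derived_ratio)
    return primary + derived


def human_readable(n_bytes: int) -> str:
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(n_bytes)
    for unit in units:
        if size < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


# Hypothetical project: 10 files a day, 50 MB each, for 100 days,
# with derived products taking about half as much space as the raw data.
total = estimate_total_bytes(10, 50_000_000, 100, derived_ratio=0.5)
print(human_readable(total))
```

A rough figure like this is what a DMP needs; it also tells you whether a repository's size limits will be a problem.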
Data and Metadata Standards Describing and organizing your data makes your work easier, and provides context for those you share with
Documenting data Data are machine readable, but must also be understandable to humans What information would someone else (or you, long in the future) need to understand the data?
Data and Metadata Standards: What to specify File formats: – Open or proprietary? If you need special software to open a file, how will you ensure its accessibility over time? – Standard or non-standard?
Data and Metadata Standards: What to specify Naming standards: – Can you tell what a file is and what it contains without opening it? How do your files relate to one another?
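One way to make file names self-describing is to settle on a pattern and stick to it. The sketch below assumes a hypothetical convention of `project_content_YYYYMMDD_vNN.ext`; the pattern, the function names, and the example values are illustrative only, and your field or repository may prescribe something different.

```python
import re
from datetime import date

# Hypothetical convention: project_content_YYYYMMDD_vNN.ext
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<content>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project: str, content: str, collected: date,
               version: int, ext: str) -> str:
    """Assemble a file name that follows the convention above."""
    return f"{project}_{content}_{collected:%Y%m%d}_v{version:02d}.{ext}"


def parse_name(filename: str):
    """Return the name's parts as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


name = build_name("soilstudy", "ph-readings", date(2014, 2, 27), 2, "csv")
print(name)
print(parse_name(name))
print(parse_name("final data (2).xlsx"))  # does not follow the convention
```

Because the date and version sort correctly as text, an ordinary directory listing shows how the files relate to one another.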
Data and Metadata Standards: What to specify Metadata: Contextualizing information about an object, physical or digital Some fields have defined standards; some repositories ask for a specific set of metadata
Metadata Where does it go? Lab notebook, Codebook, readme.txt, XML file
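If the metadata goes in a readme.txt, a small script can keep the layout consistent across datasets. This is a minimal sketch: the field names and values below are made up for illustration and do not follow any particular metadata standard.

```python
def render_readme(fields: dict) -> str:
    """Render metadata as plain 'Key: value' lines; lists become bullet items."""
    lines = []
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical dataset description
metadata = {
    "Title": "Soil pH readings, spring field season",
    "Collected by": "Lab field team",
    "Units": "pH (unitless), depth in cm",
    "Files": ["soilstudy_ph-readings_20140227_v02.csv", "codebook.txt"],
}
print(render_readme(metadata))
```

Writing the returned string to a readme.txt stored next to the data keeps the description and the files together.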
A DMP does NOT: Require that you share all data with anyone who wants it “at no more than incremental cost and within a reasonable time” (NSF) “indicate the criteria for deciding who can receive your data” (NIH)
Access and sharing: what to specify What data products will you share freely? When? How? – Data necessary for replication of public results – Other data? What data products won’t you share freely? Why not? How will you resolve ethical or privacy issues? Consider restrictions, embargo, etc. for data that can’t be immediately shared freely
Access and sharing: What to specify Backup: – Where? (and what?) Local (hard drive, dept/local server, personal laptop, flash drive) vs. distant (PDC, hard drive at home) Central (PDC, UR Research) vs. cloud (Amazon, Box, CrashPlan, Google Drive) – How often? – Who’s responsible? Security: Locked cabinets? Password-protected computer? Non-networked storage?
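A backup is only useful if the copy matches the original. One common way to check this, offered here as a sketch and not as a required practice, is to compare checksums. The function names are hypothetical; only Python's standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(original: Path, backup: Path) -> list:
    """List files that are missing from the backup or differ from the original."""
    source = build_manifest(original)
    copy = build_manifest(backup)
    return sorted(name for name in source if copy.get(name) != source[name])
```

Calling `find_mismatches(Path("data"), Path("/backup/data"))` (both paths are placeholders) after each backup run returns an empty list when every file was copied intact.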
Access and sharing: Placing data in a repository Long-term commitment to data preservation Higher visibility for your data Permanent URL / DOI enables data citation Reuse tracking and usage statistics
Access and sharing: Placing data in a repository UR Research: – Example: STOP-ROP Clinical Trial
Library-hosted 2GB soft limit Backed up, secure Free!
Access and sharing: Placing data in a repository UR Research: Repository directories: re3data.org; biosharing.org
Integration with journal submission processes Link to data held elsewhere Not free: $80/submission
Reuse and distribution Who is the audience for your data? What possible uses might someone make of your data? Are there any permissions restrictions necessary?
Plans for archiving and preservation How long should data be retained? Where will the data be placed for long-term preservation? What policies are in place there to guarantee its preservation? How will you ensure accessibility and usability over the long term? – Data transformations? – Archiving associated information?
Revisiting Metadata and Documentation Information about data processing, collection details: the ‘story’ of the data (…but it’s all in the paper!) Are your variable names meaningful? Is it clear how different parts of the dataset relate to each other? Is it in a format others can use?
One size does not fit all… But we’ll cover general guidelines
A little help: UR Data Management website library.rochester.edu/data-management/goals
A little help: consultation Call me! (Or , or drop by.) Carlson 313E DMP consultation & review; trainings; data archiving support; etc.
A request When you get a grant funded, send me your DMP. If you get negative feedback on your DMP and you’re comfortable sharing it, send that to me too.
Questions?