Published by Colleen Fleming. Modified over 9 years ago.
1
Practical Data Management ACRL DCIG Webinar 30 April 2014 Kristin Briney, PhD
2
andrius.v, https://www.flickr.com/photos/banditaz/6823875954 (CC BY-NC-SA)
3
Mr.TinDC, https://www.flickr.com/photos/mr_t_in_dc/5940438148 (CC BY-ND)
4
International Institute of Tropical Agriculture, https://www.flickr.com/photos/iita-media-library/8160877379 (CC BY-NC) Musgo Dumio_Momio, https://www.flickr.com/photos/30976576@N07/2903662286 (CC BY-NC-SA)
5
Jen Doty and Rob O'Reilly, “Learning to Curate @ Emory”. RDAP 2014
6
Data Management Basics Introduction to a few topics in data management – File organization and naming – Documentation – Storage and backups – Future file usability
7
Data Management Basics Introduction to a few topics in data management – File organization and naming – Documentation – Storage and backups – Future file usability Teach & Use
8
For each minute of planning at the beginning of a project, you will save 10 minutes of headache later
9
FILE ORGANIZATION & NAMING Dan Zen, http://www.flickr.com/photos/danzen/5551831155/ (CC BY)
10
File Organization What? – Keeping your files in order
11
File Organization Why? – Easier to find and use data – Tell, at a glance, what is done and what you have yet to do – Can still find and use files in the future
12
File Organization When? – Always! – Get in the habit of putting files in the right place
13
File Organization How? – Any system is better than none – Make your system logical for your data 80/20 Rule – Possibilities By project By analysis type By date …
14
Example Thesis – By chapter By file type (draft, figure, table, etc.) Data – By researcher By analysis type – By date
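A minimal sketch of setting up a folder layout like the one in this example, so that files have a "right place" from the first day of a project. The folder names in `LAYOUT` (for instance `researcherA`) are hypothetical placeholders, not names from the talk; adapt them to whatever scheme is logical for your data.

```python
from pathlib import Path

# Hypothetical layout, loosely following the thesis/data example above:
# thesis organized by chapter, then file type; data by researcher, then analysis type.
LAYOUT = {
    "thesis": ["chapter1/drafts", "chapter1/figures", "chapter1/tables"],
    "data": ["researcherA/UVVis", "researcherA/IR"],
}


def create_layout(root, layout=LAYOUT):
    """Create every folder in the layout under root and return the folder paths."""
    created = []
    for top_level, subfolders in layout.items():
        for subfolder in subfolders:
            folder = Path(root) / top_level / subfolder
            # exist_ok=True makes the function safe to re-run on an existing project
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling `create_layout("my_project")` would create the folders under `my_project` in the current directory; running it again leaves existing folders and files untouched.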
15
File Naming Conventions What? – Consistent naming for files
16
http://retractionwatch.com/2014/01/07/doing-the-right-thing-authors-retract-brain-paper-with-systematic-human-error-in-coding/
17
File Naming Conventions Why? – Make it easier to find files – Avoid duplicates – Make it easier to wrap up a project because you know which files belong to it
18
File Naming Conventions When? – For a group of related files (3 to 1000+) – May need different conventions for different groups
19
File Naming Conventions How? – Pick what is most important for your name Date Site Analysis Sample Short description
20
File Naming Conventions How? – Files should be named consistently – File names should be descriptive but short (<25 characters) – Use underscores instead of spaces – Avoid these characters: " / \ : * ? ' [ ] & $ – Use the dating convention: YYYY-MM-DD
21
Example YYYYMMDD_site_sampleNum – 20140422_PikeLake_03 – 20140424_EastLake_12 Analysis-sample-concentration – UVVis-stilbene-10mM – IR-benzene-pure
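The first convention above can be sketched in code. This is one possible way to build and check names, not a tool from the talk: the function names `make_name` and `check_name` are made up for illustration, the two quote characters on the previous slide are assumed to mean the plain double and single quote, and the length check applies to the name without its file extension.

```python
from datetime import date

# Characters the guidelines say to avoid, plus the space (use underscores instead).
FORBIDDEN = set('"/\\:*?[]&$') | {"'", " "}
MAX_LENGTH = 25  # guideline: descriptive but short (<25 characters)


def make_name(day, site, sample_num):
    """Build a name following YYYYMMDD_site_sampleNum, e.g. 20140422_PikeLake_03."""
    return f"{day:%Y%m%d}_{site}_{sample_num:02d}"


def check_name(name):
    """Return a list of problems with a file name; an empty list means it passes."""
    problems = []
    bad = sorted(ch for ch in set(name) if ch in FORBIDDEN)
    if bad:
        problems.append(f"contains characters to avoid: {bad}")
    if len(name) >= MAX_LENGTH:
        problems.append(f"{len(name)} characters long; aim for under {MAX_LENGTH}")
    return problems
```

For example, `make_name(date(2014, 4, 22), "PikeLake", 3)` gives `20140422_PikeLake_03`, which `check_name` accepts, while a name such as `my data: final?` is flagged for its spaces, colon, and question mark.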
22
DOCUMENTATION Brady, https://www.flickr.com/photos/freddyfromutah/4424199420 (CC BY)
23
What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
24
Documentation Why? – Data without notes are unusable – Because you won’t remember everything – For others who may need to use your files
25
Documentation When? – Always – Documentation needs will vary between files
26
Documentation How? – Take good notes – Metadata schemas http://www.dcc.ac.uk/resources/metadata-standards
27
Documentation How? – Methods Protocols Code Survey Codebook Data dictionary Anything that lets someone reproduce your results
28
Documentation How? – Templates Like structured metadata but easier Decide on a list of information before you collect data – Make sure you record all necessary details – Takes a few minutes upfront, easy to use later Print and post in prominent place or use as worksheet
29
Example I need to collect: – Date – Experiment – Scan number – Powers – Wavelengths – Concentration (or sample weight) – Calibration factors, like timing and beam size
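A template like this one can also be enforced in software. The sketch below is only an illustration: the field names are a lowercase rendering of the list above, and `missing_fields` and `to_csv` are hypothetical helpers that refuse to save a record until every item on the template has been filled in.

```python
import csv
import io

# Field names adapted from the template above (assumed spellings).
FIELDS = [
    "date",
    "experiment",
    "scan_number",
    "powers",
    "wavelengths",
    "concentration",
    "calibration_factors",
]


def missing_fields(record):
    """Return the template fields that are absent or blank in a record (a dict)."""
    missing = []
    for field in FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def to_csv(records):
    """Return CSV text for the records; raise ValueError if any record is incomplete."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        gaps = missing_fields(record)
        if gaps:
            raise ValueError(f"record is missing: {', '.join(gaps)}")
        writer.writerow(record)
    return buffer.getvalue()
```

Writing the returned text to a `.csv` file next to the data keeps the notes in an open format that is easy to read later.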
30
Documentation How? – README.txt For digital information, address the questions – “What the heck am I looking at?” – “Where do I find X?” Use for project description in main folder Use to document conventions Use wherever you need extra clarity
31
Example Project-wide README.txt – Basic project information Title Contributors Grant info etc. – Contact information for at least one person – All locations where data live, including backups
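As a rough illustration of the project-wide README described above, the sketch below assembles the listed items into plain text. The function name `build_readme`, its parameters, and the line labels are assumptions made for this example; the talk does not prescribe a particular layout.

```python
def build_readme(title, contributors, contact, data_locations, grant=None):
    """Return README.txt text covering the basic project-wide information."""
    lines = [
        f"Title: {title}",
        "Contributors: " + ", ".join(contributors),
    ]
    if grant:
        lines.append(f"Grant: {grant}")
    # Contact information for at least one person
    lines.append(f"Contact: {contact}")
    # Every place the data live, including backups
    lines.append("Data locations (including backups):")
    lines.extend(f"  - {location}" for location in data_locations)
    return "\n".join(lines) + "\n"
```

The returned text can be saved as `README.txt` in the main project folder and extended by hand with notes on folder and naming conventions.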
32
Example “Talk_v1: rough outline of talk Talk_v2: draft of talk Talk_v3: updated 2014-01-15 after feedback” “ ‘Data’ folder contains all raw data files by date ‘Analysis’ has analyzed data and plots ‘Paper’ has drafts of article on this work”
33
grover_net, http://www.flickr.com/photos/9246159@N06/599820538/ (CC BY-ND) STORAGE AND BACKUPS
34
Storage Why? – Need good storage practices to prevent loss – Keep data secure
35
Storage How? – Library motto: Lots of Copies Keeps Stuff Safe! – Rule of 3: 2 onsite, 1 offsite
36
Storage How? – Computer – External hard drive – Shared drives/servers – Tape backup – Cloud storage* – CDs/DVDs – USB flash drive Erica Wheelan, https://www.flickr.com/photos/reinventedwheel/5985479866 (CC BY)
37
*Cloud Storage Read the Terms of Service! E.g., Google Drive – “When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones.”
38
Backups http://toystory.disney.com/
39
Backups How? – Any backup is better than none – Automatic backup is better than manual – Your work is only as safe as your backup plan
40
Backups How? – Check your backups Backups only as good as ability to recover data Test your backups periodically – Preferably a fixed schedule – 1 or 2 times a year may be enough – Bigger/more complex backups should be checked more often Test your backup whenever you change things
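One way to check a backup is to compare checksums of the working files against their backup copies. The sketch below assumes the backup is a plain folder that mirrors the source folder; `verify_backup` is a hypothetical helper, not a tool named in the talk. It only confirms that the copies match, so it complements, rather than replaces, an occasional full test restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return (relative path, problem) pairs for files missing from or differing in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source_file) != sha256_of(backup_file):
            problems.append((relative.as_posix(), "differs"))
    return problems
```

An empty list means every source file has an identical copy in the backup folder. Note that a file edited since the last backup will also be reported as "differs", so run the check right after a backup completes.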
41
Example I keep my data – On my computer – Backed up manually on shared drive I set a weekly reminder to do this – Backed up automatically via SpiderOak cloud storage
42
FUTURE FILE USABILITY Ian, http://www.flickr.com/photos/ian-s/2152798588/ (CC BY-NC-ND)
43
Future File Usability What? – Can you read your files from 10 years ago? – Data need to be Accessible Interpretable Readable
44
lukasbenc, https://www.flickr.com/photos/lukasbenc/3493808772 (CC BY-NC-SA)
45
Future File Usability Why? – You may want to use the data in 5 years – PI sometimes keeps data and notes – Prep for data sharing – Per OMB Circular A-110, must retain data at least 3 years post-project Better to retain for >6 years
46
Future File Usability When? – When you wrap up a project – (As you work on a project)
47
Future File Usability How? – Back up written notes People always forget this one Difficult to interpret data without notes Options – Digitally scan (recommended with digital data) – Photocopies
48
Future File Usability How? – Convert file formats Can you open digital files from 10 years ago? Use open, non-proprietary formats that are in wide use – .docx → .txt – .xlsx → .csv – .jpg → .tif Save a copy in the old format, just in case Preserve software if no open file format
49
Future File Usability How? – Move to new media Hardware dies and becomes obsolete – Floppy disks! Expect average lifetime to be 3-5 years Keep up with technology
50
WHERE TO GO FROM HERE
51
Center for Teaching Vanderbilt University, https://www.flickr.com/photos/vandycft/8244800868 (CC BY-NC)
52
easylocum, https://www.flickr.com/photos/easylocum/2921542814 (CC BY)
53
Chris Hoving, https://www.flickr.com/photos/pcrucifer/2433274595 (CC BY-ND)
54
Resources Data Ab Initio blog – http://dataabinitio.com/ eScience Portal – http://esciencelibrary.umassmed.edu/ DataONE Best Practices – http://www.dataone.org/best-practices
55
Steal My Slides Slides + recording available – http://connect.ala.org/node/220603 Slides available – http://www.slideshare.net/kbriney
56
Thank You! This presentation available under a Creative Commons Attribution (CC-BY) license Some content courtesy of Dorothea Salo – http://www.graduateschool.uwm.edu/research/researcher-central/proposal-development/data-plan/boot-camp/ (CC BY)