Published by Elwin Golden. Modified over 8 years ago.
1
Documenting and organising your data: for an easier life. lib.uts.edu.au utslibrary
2
Over the next 60ish mins: why this stuff matters; metadata; tagging and file hierarchies; file naming and renaming; version control.
3
Documenting your data
4
So what might this be?
5
Why document? Enables you to understand and interpret data; tells the story of where the data came from; ensures informed and correct use, reducing the chance of incorrect use or misinterpretation.
6
What to document? Wider contextual information; data collection methodology and processes; information on dataset structure; variable-level documentation; data confidentiality, access and use conditions.
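One way to make the checklist above routine is to start every dataset with a README that has a heading for each item. The following is a minimal sketch: the function name `readme_skeleton`, the plain-text underline layout, and the sample title are assumptions for illustration, not part of the workshop material.

```python
# Headings taken from the "What to document?" list.
SECTIONS = [
    "Wider contextual information",
    "Data collection methodology and processes",
    "Dataset structure",
    "Variable-level documentation",
    "Data confidentiality, access and use conditions",
]


def readme_skeleton(dataset_title: str) -> str:
    """Return a plain-text README outline with one empty section per heading."""
    lines = [dataset_title, "=" * len(dataset_title), ""]
    for section in SECTIONS:
        # Each section gets an underlined heading and a placeholder to fill in.
        lines += [section, "-" * len(section), "TODO", ""]
    return "\n".join(lines)


# Hypothetical dataset title, used only to show the output shape.
print(readme_skeleton("Focus group transcripts"))
```

Saving the result as README.txt next to the data keeps the documentation with the files it describes.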
7
Bad vs Good http://figshare.com/articles/Excel_database_of_the_PhD_thesis/1360019 http://figshare.com/articles/Main_Dataset_for_Evolution_of_Popular_Music_USA_1960_2010_/1309953
8
Let’s get organised
9
Why? You think you’ll remember things, but over time… A multitude of formats and versions of data and documentation builds up. Investment of time at the beginning can save time in the long run. Good file management practices and naming protocols enable sharing with collaborators.
10
Can you relate? Experimentdata.txt Laurensdata.dat Data:currentversion.dat Todaysimage.tif ReportDraft.doc ReportFinal.doc ReportFinalv2LastOne.doc ReportFinalFinal.doc
11
Some filing principles: There’s no single right way to do it. Establish and document a system that works for you. Strike the balance between doing too much and too little: be realistic. The 5 Cs: be Clear, Concise, Consistent, Correct, and Conformant.
12
Hierarchical or tag-based? Hierarchical: items are organised in folders and sub-folders. Tag-based: each item is assigned one or more tags. The two are often used in combination.
13
Hierarchical filing. The good: familiar and widely used; good at representing the structure of information (constructing the hierarchy can itself be a helpful exercise); similar items are stored together; sub-folders can function as task lists. The not so good: surprisingly hard work to set up and maintain (‘a heavyweight cognitive activity’); can be hard to get the right balance between breadth and depth; items can only go in one place; time consuming to re-organise if the hierarchy becomes out of date.
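Writing the hierarchy down as data before creating any folders is one way to do the "constructing the hierarchy" exercise, and it makes the same layout easy to repeat for the next project. The sketch below is a minimal illustration; the function name `create_tree` and every folder name in the example are hypothetical, not a recommended layout.

```python
from pathlib import Path


def create_tree(root: Path, tree: dict) -> list[Path]:
    """Create nested folders described by a dict of dicts; return the folders made."""
    created = []
    for name, children in tree.items():
        folder = root / name
        # exist_ok=True makes the function safe to re-run on an existing layout.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        created.extend(create_tree(folder, children))
    return created


# Hypothetical layout: an empty dict means "no sub-folders".
EXAMPLE_LAYOUT = {
    "project_x": {
        "data": {"raw": {}, "processed": {}},
        "docs": {},
    }
}
# create_tree(Path("."), EXAMPLE_LAYOUT)  # uncomment to create the folders here
```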
14
Sample folder hierarchy from the UK data archive
15
Tag-based filing. The good: items can go in more than one category, and multiple types of category can be used; many people find tagging quicker and easier than hierarchical filing; can be easier to combine than hierarchical systems when collaborating; you can search for tags in Finder and Windows Explorer. The not so good: not how operating systems store files; if material isn’t tagged properly at first it can be hard to find later; inconsistent tagging is common; similarly named categories can get mixed; less good at representing the structure of information.
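To make the idea concrete, a tag system can be thought of as a lookup from each tag to the files that carry it. The sketch below is an illustration only, not how Finder or Windows Explorer store tags; the function names and the sample file names and tags are hypothetical. Lower-casing the tags addresses just one kind of inconsistent tagging (capitalisation).

```python
from collections import defaultdict


def build_tag_index(file_tags: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert {file: tags} into {tag: files}, normalising tags to lower case."""
    index: dict[str, set[str]] = defaultdict(set)
    for file_name, tags in file_tags.items():
        for tag in tags:
            index[tag.strip().lower()].add(file_name)
    return dict(index)


def find_files(index: dict[str, set[str]], *tags: str) -> set[str]:
    """Return the files that carry every one of the given tags."""
    matches = [index.get(tag.strip().lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


# Hypothetical files and tags; one file can sit in several categories at once.
index = build_tag_index({
    "FG1_CONS_12Feb10.txt": {"Transcript", "focus-group"},
    "Int024_AP_5June08.txt": {"transcript", "interview"},
})
print(find_files(index, "transcript"))
```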
16
Let’s do metadata: open a Word doc and choose File > Information.
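The same document properties can also be read without opening Word, because a .docx file is a zip archive. The sketch below is a minimal illustration using only the Python standard library; it assumes the core properties sit at `docProps/core.xml`, which is where Word normally writes them, and the function name `read_core_properties` is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile


def read_core_properties(docx_path: str) -> dict[str, str]:
    """Return core document properties (title, creator, ...) from a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        # Assumption: the core properties part is stored at this usual path.
        root = ET.fromstring(archive.read("docProps/core.xml"))
    properties = {}
    for element in root:
        # Tags look like "{namespace}title"; keep only the local name.
        local_name = element.tag.split("}", 1)[-1]
        properties[local_name] = (element.text or "").strip()
    return properties
```

Calling `read_core_properties("my_notes.docx")` on a real document (the file name here is a placeholder) returns a dictionary of whatever properties the file contains.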
17
File naming: important for future access and retrieval; provides contextual information; creates a logical structure for skimming through many files and versions.
18
How could these file names be improved?
19
Best practice for file naming: Keep file names short but meaningful. Define the types of data and file formats for the research. Avoid generic file names, e.g. draft, final version. Use underscores to separate words (avoid spaces). Avoid special characters such as & * % $ £ ] { ! @ / as these are often used for specific tasks in a digital environment. Consider scalability. Not all systems and software are case-sensitive, so assume that TANGO, Tango and tango are the same. Don’t rely on file names as your sole source of documentation.
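Several of these rules can be checked automatically. The sketch below is a minimal illustration under stated assumptions: the allowed character set, the 40-character limit, and the watch-list of generic words are arbitrary choices made for the example, and the function names are hypothetical.

```python
import re

# Anything outside letters, digits, dot, underscore, hyphen and space counts as
# a "special character" here. This allow-list is an assumption for illustration.
SPECIAL_CHARACTERS = re.compile(r"[^A-Za-z0-9._ -]")
GENERIC_WORDS = ("draft", "final", "latest")  # hypothetical watch-list
MAX_LENGTH = 40  # arbitrary illustrative limit for "short"


def check_file_name(name: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("too long")
    if " " in name:
        problems.append("contains spaces")
    if SPECIAL_CHARACTERS.search(name):
        problems.append("contains special characters")
    stem = name.rsplit(".", 1)[0].lower()
    # Simple substring match, so it can also flag innocent words.
    if any(word in stem for word in GENERIC_WORDS):
        problems.append("uses a generic label")
    return problems


def find_case_clashes(names: list[str]) -> list[set[str]]:
    """Group names that differ only by capitalisation (TANGO, Tango, tango)."""
    groups: dict[str, set[str]] = {}
    for name in names:
        groups.setdefault(name.lower(), set()).add(name)
    return [group for group in groups.values() if len(group) > 1]


print(check_file_name("ReportFinalv2LastOne.doc"))  # ['uses a generic label']
print(check_file_name("Data:currentversion.dat"))   # ['contains special characters']
```

A check like this catches mechanical problems only; whether a name is meaningful still needs a human eye.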
20
Possible elements: project/grant name and/or number; date of creation (useful for version control, e.g. YYYYMMDD); name of creator/investigator (last name first, followed by initials of first name); description of content/subject descriptor; data collection method (instrument, site, etc.); version number.
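Once the elements are chosen, assembling the name in code keeps the order and separators consistent. The sketch below is a minimal illustration: the function name `build_file_name`, the order of the elements, and the sample values are all assumptions, and any project is free to pick a different order as long as it is documented.

```python
from datetime import date


def build_file_name(project: str, description: str, creator: str,
                    created: date, version: str, extension: str) -> str:
    """Join naming elements with underscores; the date is written as YYYYMMDD."""
    parts = [project, description, creator, created.strftime("%Y%m%d"), version]
    # Replace spaces inside each element so the underscore stays the only separator.
    cleaned = [part.strip().replace(" ", "-") for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Illustrative values only.
name = build_file_name("FG1", "CONS", "ParsonsA", date(2010, 2, 12), "v1.0", "txt")
print(name)  # FG1_CONS_ParsonsA_20100212_v1.0.txt
```

Writing the date as YYYYMMDD means an alphabetical sort of the file names is also a chronological sort.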
21
Example of good file naming: FG1_CONS_12Feb10 is the file that contains the transcript of the first focus group in a study of consumers, which took place on 12 February 2010. Int024_AP_5June08 is an interview with participant 024, interviewed by Anne Parsons on 5 June 2008.
22
Naming and renaming: Check whether the instrument, software, or other equipment that outputs your data files can be set up with a file naming system; this is less work than retrospectively changing file names. Batch renaming tools are available.
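For files that already exist, a batch rename can be scripted. The sketch below is a minimal illustration, not a replacement for a dedicated batch renaming tool; the function names, the numbering scheme, and the sample prefix are assumptions. It prints the plan first and only touches files when `dry_run` is switched off.

```python
from pathlib import Path


def plan_renames(folder: Path, prefix: str) -> list[tuple[Path, Path]]:
    """Pair each file in folder with a new name: prefix_001.ext, prefix_002.ext, ..."""
    files = sorted(p for p in folder.iterdir() if p.is_file())
    return [
        (old, old.with_name(f"{prefix}_{index:03d}{old.suffix}"))
        for index, old in enumerate(files, start=1)
    ]


def apply_renames(plan: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print the plan; rename only when dry_run is False and no target exists."""
    for old, new in plan:
        print(f"{old.name} -> {new.name}")
        if dry_run or new == old:
            continue
        if new.exists():
            raise FileExistsError(f"refusing to overwrite {new}")
        old.rename(new)
```

A typical use would be `apply_renames(plan_renames(Path("scans"), "site1_scan"))` to preview, then the same call with `dry_run=False` once the plan looks right; the folder name and prefix here are placeholders. Work on a copy of the data the first time.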
23
Version control: Create a version control table or file history. Document your convention and be consistent. Record every change. Put old versions in a separate folder. Consider discarding or deleting obsolete versions (while retaining the original 'raw' copy) if appropriate.
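A version control table can be as simple as a CSV file with one row per change. The sketch below is a minimal illustration; the column names, the function name `record_change`, and the log file name in the usage note are assumptions rather than a prescribed format.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Optional

FIELDS = ["version", "date", "author", "change"]  # assumed columns


def record_change(log_path: Path, version: str, author: str, change: str,
                  when: Optional[date] = None) -> None:
    """Append one row to the version table, writing the header on first use."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "version": version,
            # Default to today's date when none is given.
            "date": (when or date.today()).isoformat(),
            "author": author,
            "change": change,
        })
```

For example, `record_change(Path("version_log.csv"), "v1.1", "AP", "Corrected speaker labels")` adds one line to the table, which opens directly in a spreadsheet.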
24
Version control cont. In file and folder names, use whole numbers (1, 2, 3, etc.) for major changes and the decimal for minor changes, e.g. v1, v1.1, v2.6. Beware of imprecise labels: revision, final, final2, definitive_copy. They may not be as definitive as you thought.
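The numbering scheme above is simple enough to automate. The sketch below is a minimal illustration; the function name `bump_version` is hypothetical, and resetting to a bare whole number after a major change (v1.1 becomes v2) is one reading of the convention, not the only possible one.

```python
import re

# Accepts labels such as v1, v1.1 or v2.6.
VERSION_PATTERN = re.compile(r"^v(\d+)(?:\.(\d+))?$")


def bump_version(label: str, major_change: bool) -> str:
    """Return the next label: v1.1 -> v1.2 (minor change) or v1.1 -> v2 (major)."""
    match = VERSION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a version label like v1 or v1.1: {label!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    if major_change:
        return f"v{major + 1}"
    return f"v{major}.{minor + 1}"


print(bump_version("v1", major_change=False))   # v1.1
print(bump_version("v1.1", major_change=True))  # v2
```

Imprecise labels such as final2 are rejected with a ValueError, which is a useful prompt to rename the file properly.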
25
Version Control Doc
26
Version Control Final Final: Some software has built-in version control facilities, e.g. control of rights to file editing through read/write permissions (Windows Explorer), or versioning and tracking features in collaborative documents (wikis, intranets, Google Docs). Consider using version control software. Guidance from MIT Libraries on software options: http://libraries.mit.edu/data-management/files/2014/05/version-control-handout.pdf
27
But how will I remember all this stuff? You can use this form to plot out the structure of your own data. It establishes good practice early by helping you form working habits. Print it out and stick it on the wall above your desk!
28
Questions? David Litting david.litting@uts.edu.au Many thanks to MIT Libraries for making the excellent materials this workshop is based on available for reuse: http://libraries.mit.edu/data-management/files/2014/05/file-organization-july2014.pdf This work is licensed under a Creative Commons Attribution 4.0 International License.