Organising and Documenting Data Stuart Macdonald EDINA & Data Library DIY Research Data Management Training Kit for Librarians.

Slides:



Advertisements
Similar presentations
Putting together a METS profile. Questions to ask when setting down the METS path Should you design your own profile? Should you use someone elses off.
Advertisements

Long-Term Preservation. Technical Approaches to Long-Term Preservation the challenge is to interpret formats a similar development: sound carriers From.
IUFRO International Union of Forest Research Organizations Eero Mikkola The Increasing Importance of Metadata in Forest Information Gathering NEFIS Symposium.
Organizing Shared Drive
A centre of expertise in digital information management A QA Framework To Support Your Library Web Site Review Brian Kelly UKOLN University of Bath Bath.
Configuration Management
Intro to Version Control Have you ever …? Had an application crash and lose ALL of your work Made changes to a file for the worse and wished you could.
Module 5a: Authority Control and Encoding Schemes IMT530: Organization of Information Resources Winter 2007 Michael Crandall.
Documenting the Resource Malcolm Polfreman
Digital Preservation - Its all about the metadata right? “Metadata and Digital Preservation: How Much Do We Really Need?” SAA 2014 Panel Saturday, August.
MODULE 4 File and Folder Management. Creating file and folder A computer file is a resource for storing information, which is available to a computer.
Versioning Requirements and Proposed Solutions CM Jones, JE Brace, PL Cave & DR Puplett OR nd April
Dr Gordon Russell, Napier University Unit Data Dictionary 1 Data Dictionary Unit 5.3.
1 Planning And Electronic Records Issues For Electronically Enhanced Courses Jeremy Rowe Nancy Tribbensee
Improving access to digital resources: a mandate for order mandate: managing digital assets in tertiary education craig green,
Data Preservation Best Practices for preserving your research data for future reuse The goal of data preservation is to ensure that your data is in a sustainable.
NEES Central Goran Josipovic IT Manager
Instructions and forms
August 14, 2015 Research data management – an introduction Slides provided by the DaMaRO Project, University of Oxford Research Services.
Agenda Overview 2.What is SharePoint? 3.NCDOT Websites 4.Roles 5.Search 6.SharePoint Interface.
Today’s Lecture application controls audit methodology.
Chapter 1 Database Systems. Good decisions require good information derived from raw facts Data is managed most efficiently when stored in a database.
ORGANIZING AND STRUCTURING DATA FOR DIGITAL PROJECTS Suzanne Huffman Digital Resources Librarian Simpson Library.
WORKFLOWS AND OTHER CONSIDERATIONS FOR DIGITIZATION  Steve Bingo  Processing Archivist Washington State University Libraries  Alex Merrill  Assistant.
© Paradigm Publishing Inc. 9-1 Chapter 9 Database and Information Management.
Recordkeeping for Good Governance Toolkit Digital Recordkeeping Guidance Funafuti, Tuvalu – June 2013.
Managing the Retention of Electronic Records Ann Marie Przybyla Electronic Records Symposium Region 9, November 2007.
1 XML as a preservation strategy Experiences with the DiVA document format Eva Müller, Uwe Klosa Electronic Publishing Centre Uppsala University Library,
© Paradigm Publishing Inc. 9-1 Chapter 9 Database and Information Management.
1 Adapted from Pearson Prentice Hall Adapted form James A. Senn’s Information Technology, 3 rd Edition Chapter 7 Enterprise Databases and Data Warehouses.
Ms. Irene Onyancha ISTD/Library & Information Management Services United Nations Economic Commission for Africa The Second Session of the Committee on.
Elements of a Data Management Plan: Roles and Responsibilities Ruth Duerr National Snow and Ice Data Center Version 1.0 Review Date.
CBSOR,Indian Statistical Institute 30th March 07, ISI,Kokata 1 Digital Repository support for Consortium Dr. Devika P. Madalli Documentation Research &
IS 325 Notes for Wednesday August 28, Data is the Core of the Enterprise.
June 3, 2016 Research data management – an introduction Slides provided by the DaMaRO Project, University of Oxford Research Services.
PACSCL Consortial Survey Initiative Group Training Session February 12, 2008 at The Historical Society of Pennsylvania.
Introduction to metadata
IT Auditing & Assurance, 2e, Hall & Singleton Chapter 8: IT Auditing & Assurance, 2e, Hall & Singleton CAATTs for Data Extraction and Analysis.
Alternative Architecture for Information in Digital Libraries Onno W. Purbo
Copyright © Software Carpentry 2011 This work is licensed under the Creative Commons Attribution License See
Department of Industrial Engineering Sharif University of Technology Session# 9.
Metadata “Data about data” Describes various aspects of a digital file or group of files Identifies the parts of a digital object and documents their content,
How Not to Lose Track of Your Research Organization and Planning Resources at Brandeis Melanie Radik and Raphael Fennimore Library & Technology Services.
Metadata and Meta tag. What is metadata? What does metadata do? Metadata schemes What is meta tag? Meta tag example Table of Content.
COMMON COMMUNICATION FORMAT (CCF). Dr.S. Surdarshan Rao Professor Dept. of Library & Information Science Osmania University Hyderbad
11 Researcher practice in data management Margaret Henty.
Managing Digital Assets File Naming and Resizing.
Sharing Research Data with: OC Data Portal: ocdp.lib.uci.edu UC Irvine Dash: dash.lib.uci.edu Dan Tsang, Data Librarian Julia Gelfand, Applied Sciences.
NOTE: To change the image on this slide, select the picture and delete it. Then click the Pictures icon in the placeholder to insert your own image. DATABASE.
Santi Thompson - Metadata Coordinator Annie Wu - Head, Metadata and Bibliographic Services 2013 TCDL Conference Austin, TX.
Faculty of Education, Language and Community Services Stavroula Tsembas Marketing and Distribution: Metadata Linkages What is metadata? information about.
Short-term storage and data documentation Mari Wigham COMMIT/
Digital Stewardship Lee Dotson Digital Initiatives Librarian University of Central Florida John C. Hitt Library Presentation available at
Automating the Audit: Updates from the Metadata Upgrade Project at the University of Houston Libraries Andrew Weidner, Metadata Librarian Santi Thompson,
Introduction to Research Data Management Joy Davidson and Sarah Jones Digital Curation Centre
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Data Management and Digital Preservation Carly Dearborn, MSIS Digital Preservation & Electronic Records Archivist
Developing a Dark Archive for OJS Journals Yu-Hung Lin, Metadata Librarian for Continuing Resources, Scholarship and Data Rutgers University 1 10/7/2015.
Documenting and organising your data For an easier life lib.uts.edu.au utslibrary.
Slides Template for Module 3 Contextual details needed to make data meaningful to others CC BY-NC.
Do-more Technical Training
An Introduction to Tessella and The Safety Deposit Box Platform
Institutional role in supporting open access, open science, open data
Data Management: Documentation & Metadata
Digital Project Lifecycle Curating Across the Curriculum
Chapter 9 Database and Information Management.
Storage Basic recommendations:
RDM Definition Research data management (RDM) encompasses the control of data inputs, the use of data, the protection of data, and the creation of data.
Presentation transcript:

Organising and Documenting Data Stuart Macdonald EDINA & Data Library DIY Research Data Management Training Kit for Librarians

Organising your data RDM is one of the essential areas of responsible conduct of research. Research data files and folders need to be organised in a systematic way to be: identifiable and accessible for yourself, identifiable and accessible for colleagues, and for future users. Thus it is important to plan the organisation of your data before a research project begins. Doing so will prevent any confusion while research is underway or when multiple individuals will be editing and / or analysing the data.

This can be achieved through: Directory structure & file naming conventions (File naming conventions for specific disciplines) File renaming File version control For this to be successful a consistent and disciplined approach is required. Easier to accomplish as and when data files are generated rather than retrospectively attempting to implement. When organization methods become too time consuming, consider automated methods.

File Naming conventions Naming datasets according to agreed conventions should make file naming easier for colleagues because they will not have to re-think the process each time. File names should provide context for the contents of the file, making it distinguishable from files with similar subjects or different versions of the same file. Many files are used independently of their file or directory structure, so provide sufficient description in the file name. Suggested strategies: identify the project; avoid special characters; use underscores rather than spaces; include date of creation or modification in a standard format (e.g. YYYY_MM_DD or YYYYMMDD): use project number Be consistent! Avoid being cryptic!

Batch (or bulk) renaming Software tools exist that can organise data files and folders in a consistent and automated way through batch renaming. There are many situations where batch renaming may be useful, such as: – where images from digital cameras are automatically assigned filenames consisting of sequential numbers – where proprietary software or instrumentation generate crude, default or multiple filenames – where files are transferred from a system that supports spaces and/or non-English characters in filenames to one that doesn't (or vice versa). Batch renaming software can be used to substitute such characters with acceptable ones.

Benefits of consistent data file labelling are: Data files are not accidentally overwritten or deleted Data files are distinguishable from each other within their containing folder Data file naming prevents confusion when multiple people are working on shared files Data files are easier to locate and browse Data files can be retrieved both by creator and by other users Data files can be sorted in logical sequence Different versions of data files can be identified If data files are moved to other storage platform their names will retain useful context

Version Control It is important to consistently identify and distinguish versions of data files. This ensures that a clear audit trail exists for tracking the development of a data file and identifying earlier versions especially if data is frequently updated by multiple users. Suggested strategies: Use a sequential numbered system: v1, v2, v3, etc. Don't use confusing labels: revision, final, final2, etc. Record all changes -- no matter how small Discard obsolete versions (but never the raw copy) Use auto-backup instead of self-archiving, if possible The alternative is to use version control software. (Bazaar, TortoiseSVN, SubVersion)

Documenting Data There are many reasons why you need to document your data: To help you remember the details later To help others understand your research Verify your findings Replicate your results Archive your data for access and re-use Some examples of data documentation are: Laboratory notebooks Field notes Questionnaires SOPs Methodologies

Documenting Data Laboratory or field notebooks, for example play an important role in supporting claims relating to intellectual property developed by University researchers, and even defending claims against scientific fraud. Research data need to be documented at various levels: Project level study background, methodologies, instruments, research hypothesis File or database level formats, relationships between files Variable or item level How variable was generated & label descriptions

The difference between documentation and metadata is that the first is meant to be read by humans and the second implies computer-processing (though may also be human-readable) to assist location and access to data through search interfaces. Three broad categories of metadata are: Descriptive - common fields such as title, author, abstract, keywords which help users to discover online sources through searching and browsing e.g. DC, MARC Administrative - preservation, rights management, and technical metadata about formats. Structural - how different components of a set of associated data relate to one another, such as a schema describing relations between tables in a database. Metadata – data about data

Need for metadata Public Research Community Project Researcher Metadata may not be required if you are working alone on your own computer, but become crucial when data are shared online. Metadata help to place your dataset in a broader context, allowing those outside your institution, discipline, or research environment to understand how to interpret your data.

THANK YOU!