Research Data Management: Store and Analyze

Slides:



Advertisements
Similar presentations
Data Management Tools David Wallom. YOUR DATA DOES NOT BELONG TO YOU! IT BELONGS TO YOUR EMPLOYING INSTITUTION!
Advertisements

Organise Workplace Information
P20 Seminar November 12, Statistical Collaboration Part 1: Working with Statisticians from Start to Finish Part 2: Essentials of Data Management.
Thoughts on Technology Issues for Small Business Implementing Technical Safeguards to support Your Policies.
Data Storage and Security Best Practices for storing and securing your data The goal of data storage is to ensure that your research data are in a safe.
Computer File Management One of the most important aspects of our job is information maintenance.
Developing a Records & Information Retention & Disposition Program:
DATA SECURITY Social Security Numbers, Credit Card Numbers, Bank Account Numbers, Personal Health Information, Student and/or Staff Personal Information,
Data Preservation Best Practices for preserving your research data for future reuse The goal of data preservation is to ensure that your data is in a sustainable.
Information Security Decision- Making Tool What kind of data do I have and how do I protect it appropriately? Continue Information Security decision making.
Data quality control, Data formats and preservation, Versioning and authenticity, Data storage Managing research data well workshop London, 30 June 2009.
Records Survey and Retention Schedule Recertification 2011.
15 Maintaining a Web Site Section 15.1 Identify Webmastering tasks Identify Web server maintenance techniques Describe the importance of backups Section.
Section 15.1 Identify Webmastering tasks Identify Web server maintenance techniques Describe the importance of backups Section 15.2 Identify guidelines.
Recordkeeping for Good Governance Toolkit Digital Recordkeeping Guidance Funafuti, Tuvalu – June 2013.
Disaster planning and management Small public offices information briefing December 2004.
Digital Preservation: Store & Protect Laurie Sauer Information Technologies Librarian Knox College
Ecords Management Records Management Paul Smallcombe Records & Information Compliance Manager.
Introduction to SAS. What is SAS? SAS originally stood for “Statistical Analysis System”. SAS is a computer software system that provides all the tools.
MATSEC Past Papers May 2010 Paper 1 Paper 2A. What is the difference between each of the following pairs of items? Syntax Error Caused by forgetting certain.
Curriculum Forms Guidelines and Information Curriculum Proposals for publication in the Calendar, deadline is March 31, 2015.
Managing Your Data: Backing Up Your Data Robert Cook Oak Ridge National Laboratory Section: Local Data Management Version 1.0 October 2012.
Component 8/Unit 9bHealth IT Workforce Curriculum Version 1.0 Fall Installation and Maintenance of Health IT Systems Unit 9b Creating Fault Tolerant.
How Not to Lose Track of Your Research Organization and Planning Resources at Brandeis Melanie Radik and Raphael Fennimore Library & Technology Services.
ISO/IEC 27001:2013 Annex A.8 Asset management
Chapter 1: Overview of SAS System Basic Concepts of SAS System.
Managing Data & Information Procedures & Techniques.
1 PEER Session 02/04/15. 2  Multiple good data management software options exist – quantitative (e.g., SPSS), qualitative (e.g, atlas.ti), mixed (e.g.,
Digital Stewardship Lee Dotson Digital Initiatives Librarian University of Central Florida John C. Hitt Library Presentation available at
UNDERSTANDING INFORMATION MANAGEMENT (IM) WITHIN THE FEDERAL GOVERNMENT.
Writing a successful data management plan Kathleen Fear October 17, 2013.
( ) 1 Chapter # 8 How Data is stored DATABASE.
Research Data Management in the Humanities: an Introduction to the Basics Open Exeter Project Team.
BUS 308 Week 3 Assignment Evaluation of Correlations Check this A+ tutorial guideline at Assignment-Evaluation-of-Correlations.
Science Fair Information Night
Information for Students and Parent(s)/Guardian(s)
Slide Template for Module 4 Data Storage, Backup, and Security
Science Fair Information.
Digital Stewardship Curriculum
Slide Template for Module 2: Types, Formats, and Stages of Data
Mason County Schools August 11, 2016
Fording the Data Stream MLA Midwest Chapter 2013
Change the World, Start with ENERGY STAR®
Research Data Management: an Introduction Jane Fry July 27, 2017
Handling Personal Data
We Have to Pack What?! Things to Consider When Moving Your Collections
Naming and Saving Files
Configuration Management and Prince2
Mrs. DeVaults Note Taking Instruction
Resource Allocation Strategy & Planning August 2017
Section 15.1 Section 15.2 Identify Webmastering tasks
Retain Data Commensurate with Value
Product Retrieval Statistics Canada / Statistique Canada Title page
BUS 519 Education for Service-- snaptutorial.com.
BUS 519 Teaching Effectively-- snaptutorial.com
All data occupies physical space, even if we don't think of it as such.
Curriculum Forms Guidelines and Information
Digital Project Lifecycle Curating Across the Curriculum
Chapter 3 The DATA DIVISION.
Agriculture Labor Law Jeopardy Master Round 2
BASIC IRRS TRAINING Lecture 14
Storage Basic recommendations:
Computer Terms Good to Know.
Yellow Files for ELL Service: NLPS
Backup and restoration of data, redundancy
Mysale Information Classification 101
Research Data Management
Mason County Schools August 11, 2016
Curriculum Forms Guidelines and Information
SDLC Phases Systems Design.
Presentation transcript:

Research Data Management: Store and Analyze October 12, 2017 A. Michelle Edwards, oac Carol perry, Library

RDM 1 - Data Storage and Analysis 2017 Objectives Establish sound variable naming and labelling conventions Identify sound storage options Recognize the importance of data backups RDM 1 - Data Storage and Analysis 2017

Research life cycle RESEARCH PROJECT Create Process Analyse Report Preserve Share/ Reuse This graphic of the research life cycle is pretty well accepted across disciplines and stakeholder groups as an accurate representation of a research project cycle. But every project would benefit from starting out with a data management plan….. A written outline of how you will handle your data throughout the project life cycle. RDM 1 - Data Storage and Analysis 2017

Research life cycle RESEARCH PROJECT PLAN Create Process Analyse Report Preserve Share/ Reuse This graphic of the research life cycle is pretty well accepted across disciplines and stakeholder groups as an accurate representation of a research project cycle. But every project would benefit from starting out with a data management plan….. A written outline of how you will handle your data throughout the project life cycle. RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 RDM Review… A sound strategy and best practices used to……… Organize Document Store Analyze Secure Preserve/Share/Reuse ………Your data RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Create a plan Elements: Collecting/creating data Documenting your work Managing your files Storing, securing & backing up files Preservation Access, sharing & reuse These are the main elements used to construct a data management plan. Use this in the context of the research lifecycle from the previous slide Determine how this all fits together RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Setting the stage RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Discussion What do you need to think about when collecting data? File structure Comparisons across years Changes to variables RDM 1 - Data Storage and Analysis 2017

The French Blast Research Group received an NSERC grant in 2016-2017 to continue an ongoing feeding trial. The research involves three horses from a number of different farms, where each horse is weighed(kg) at the beginning of the trial, placed on one of three feed regimens (hay, pasture, or silage), their total feed intake(kg) is measured for 2 consecutive weeks, and their weights (kg) are taken at the same time. Additional information that is gathered includes: the name of the horse’s owner, the annual income of the owner, and feed costs of the owner for their horses. You have been hired by the French Blast Research Group to manage their data collections. They have 5 years of horse feeding trial data that was collected from a number of horse farms across Ontario since 2011. You have been hired to review, manage, and preserve their research trial data according to best practices and requirements of the NSERC grant. Workshop Scenario RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Exercise 1 Work in groups of 2 or 3 Review your job description and the data that has been collected to date for 2017. Please list any other pieces of data that you would like to collect and explain why. How would you recommend that the data be collected? Paper, laptop, hand-held device, etc…? RDM 1 - Data Storage and Analysis 2017

Review Did you encounter any problems? Each group can come up to the board and outline the directory structure their group devised. What did you consider in developing the structure? RDM 1 - Data Storage and Analysis 2017

Commonly Used Statistical Packages SAS SPSS Stata R Matlab RDM 1 - Data Storage and Analysis 2017

Variable Name Restrictions and Limits Length of the Variable Name 1st Character of the Variable Name SAS: 32 characters long SAS: 1st character MUST be a letter or an underscore Stata: 32 characters long Stata: 1st character MUST be a letter or an underscore Matlab: 32 characters long Matlab: 1st character MUST be a letter SPSS : 64 bytes long SPSS: 1st character MUST be a letter, an underscore, or @, #, $ 64 characters in English 32 characters in Chinese R: 10,000 characters long R: No restrictions found RDM 1 - Data Storage and Analysis 2017

Variable Name Restrictions and Limits Special Characters in Variable Names Case in Variable Names SAS: NONE SAS: Mixed case – Presentation only Stata: NONE Stata: Mixed case – Presentation only Matlab: No restrictions found Matlab: Case sensitive SPSS : NONE except SPSS: Mixed case – Presentation only Period, @ R: NONE except R: Mixed case – Presentation only Period NO BLANKS (SPACES) allowed in any of the Statistical Packages Beware of Function names in all Statistical Packages – these cannot be used as Variable Names RDM 1 - Data Storage and Analysis 2017

Best Practices for Variable Names Set Maximum length to 32 characters ALWAYS start variable names with a letter Numbers can be used anywhere in the variable name AFTER the first character ONLY use underscores “_” in a variable name Do NOT use blanks or spaces Use lowercase RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Exercise 2 Work in groups of 2 or 3 The first 3 observations for 2017 has been collected and provided to you on a slip of paper. Create variable names for ALL the pieces of information that has been collected Create labels/definitions for the variable names – type these out in a new worksheet in Excel Create a data worksheet for your data. Save the file, using the naming conventions you created in the 1st workshop (October 5, 2017) Select someone from your group to write out your variable names on the front board when you are finished. RDM 1 - Data Storage and Analysis 2017

Review Did you encounter any problems? RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Storage and security How will you manage data security? What are your responsibilities? Are you aware of the university’s Guidelines on Categorization and Security of Research Data and Information? https://www.uoguelph.ca/ccs/sites/uoguelph.ca.ccs/files/Categorization%20%26%20Security%2 0of%20Research%20Data_Final.pdf The university’s guidelines assess the level of sensitivity of the data you are collecting against the risk or magnitude of harm which would result from the loss of the data. A chart outlines what types of security measures should be employed to protect the data depending on this cross-referencing. RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 How will your data be backed up? How often will it be backed up? How much space do you need for your data? How long will backup copies be retained? If you are using services such as Dropbox, Figshare etc. your data are located outside Canada. If you have sensitive data, you may not be adequately protecting that data. RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 3-2-1 storage backup rule Keep at least three copies of your data Store the copies on two different media Keep one backup copy offsite Keep a ‘master file’ – original untouched – for emergencies RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Storage & Security Backup files on regular basis Keep in separate location Create ‘master files’ for raw data files & important document files Store in physically separate, secure location (preferably on a networked drive) If working files are lost – make a copy from the ‘master files’ Synchronize files Encrypt sensitive data – see CCS encryption services RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Security RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Conclusion RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Contact information: Michelle Edwards OAC Statistics Consultant oacstats@uoguelph.ca http://www.uoguelph.ca/oac/stats Carol Perry Associate Librarian, Research & Scholarship lib.research@uoguelph.ca RDM 1 - Data Storage and Analysis 2017

RDM 1 - Data Storage and Analysis 2017 Photo credits All phots used under CC BY – SA 2.0 licenses https://www.flickr.com/photos/smerikal/5532818874/in/photostream/ - title image2 https://upload.wikimedia.org/wikipedia/commons/9/93/Cutting_silage_at_Folsetter_1.jpg - title image1 https://www.flickr.com/photos/91035846@N05/9437703460/ RDM 1 - Data Storage and Analysis 2017