Data Management for Geoinformatics A short course on good data management for taught postgraduate students in geoinformatics and related data sciences.

Presentation transcript:

Data Management for Geoinformatics A short course on good data management for taught postgraduate students in geoinformatics and related data sciences. John Murtagh, UEL

Data Management

What is research data management? Looking after data throughout the data lifecycle (from conception to destruction): good documentation and record-keeping; transfer of responsibility after the project ends; keeping data safe and possibly confidential; access, preservation and re-use; destruction. “It’s just good research.”

Preparing your data The following slides are taken from the Research Data MANTRA online course by Data Library and EDINA, University of Edinburgh & is licensed under a Creative Commons Attribution 2.5 UK: Scotland License.

File labelling

Research data files and folders need to be labelled and organised in a systematic way so that they are both identifiable and accessible for current and future users. The benefits of consistent data file labelling: data files are distinguishable from each other within their containing folder; data file naming prevents confusion when multiple people are working on shared files; data files are easier to locate and browse; data files can be retrieved not only by the creator but by other users; data files can be sorted in logical sequence; data files are not accidentally overwritten or deleted; different versions of data files can be identified; if data files are moved to another storage platform, their names will retain useful context.

There are three main criteria to consider regarding the naming and labelling of research data files, namely:
1. Organisation - important for future access and retrieval
2. Context - this could include content specific or descriptive information independent of where the data is stored
3. Consistency - choose a naming convention and ensure that the rules are followed systematically by always including the same information (such as date and time) in the same order (e.g. YYYYMMDD)
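As an illustration of the consistency criterion, the following is a minimal sketch of a helper that always puts the same information in the same order. The function name build_file_name, the choice of fields (project, description, date, version) and the underscore separator are assumptions made for this example, not part of the MANTRA guidance.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, when: date,
                    version: int, extension: str) -> str:
    """Compose a name of the form project_description_YYYYMMDD_vNN.ext.

    The field order and separators are illustrative assumptions; the point
    is that every file name is produced by the same rule.
    """
    def slug(text: str) -> str:
        # Lower-case the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names stay portable.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    suffix = extension.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{when:%Y%m%d}_v{version:02d}.{suffix}"
```

For example, build_file_name("TraD", "River Survey", date(2013, 4, 12), 2, ".CSV") gives "trad_river-survey_20130412_v02.csv". Because the date is written as YYYYMMDD, files that share a project and description sort chronologically by name.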

The following video is from a talk given by Dave Anderson from the National Oceanic and Atmospheric Administration's (NOAA) National Climatic Data Center at the Data Management workshop sponsored by the Earth Science Information Partners (ESIP).  It highlights some of the research data organisation issues such as proprietary formats, cryptic labelling and vague filenames.

If you need to rename data files in bulk there are a number of tools available. Here are some examples for different operating systems:
Windows: Ant Renamer (http://www.antp.be/software/renamer), RenameIT (http://www.bulkrenameutility.co.uk/)
Mac: Renamer4Mac (http://renamer4mac.com/), Name Changer (http://web.mac.com/mickeyroberson/MRR_Software/NameChanger.html)
Linux: GNOME Commander (http://www.nongnu.org/gcmd/), GPRename (http://gprename.sourceforge.net/)
Unix: the use of the grep command to search for regular expressions
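Bulk renaming can also be scripted. The following is a minimal sketch using only the Python standard library; it is not how any of the tools listed above work, and the function name bulk_rename and its dry_run parameter are assumptions made for this example. It previews the changes by default and refuses to overwrite an existing file.

```python
from pathlib import Path


def bulk_rename(folder: Path, old: str, new: str,
                dry_run: bool = True) -> list[tuple[str, str]]:
    """Replace `old` with `new` in the name of every file directly in `folder`.

    Returns the (old name, new name) pairs. With dry_run=True (the default)
    nothing is renamed, so the plan can be checked first.
    """
    planned = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or old not in path.name:
            continue
        target = path.with_name(path.name.replace(old, new))
        if target.exists():
            # Never overwrite data silently.
            raise FileExistsError(f"{target} already exists; refusing to overwrite")
        planned.append((path.name, target.name))
        if not dry_run:
            path.rename(target)
    return planned
```

A cautious way to use it is to call bulk_rename(folder, "final ", "20130412_") first, read the returned pairs, and only then repeat the call with dry_run=False.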

Backing up & storing your data

Data loss will happen to you. Causes include: dropping your laptop; hard drive failures; updates; obsolescence/upgrades; poorly described data (metadata); theft of equipment; people moving on; research trends (follow the money consequences); overwriting data/versioning; file formats; media degradation (CD-Rs, memory sticks, SSDs). Slide from Data Management Planning and Storage for Psychology (DMSPpsych), The University of Sheffield, 18/09/2018

Research data loss: read this article! December 2012. The laptop was left by a graduate student in the backseat of a car parked outside a downtown restaurant. Someone broke into the car and stole the computer. The laptop contained a vast amount of a trophic ecologist's experimental data from tracked fish (cost $50,000 CAD). “Unfortunately none of the data had been backed up yet. If we don’t get this laptop back, that data is lost forever.”

HOWEVER, you can prevent total loss of your data by backing up. It is recommended that you keep at least 3 copies of your data (for example: the original, an external copy held locally, and an external copy held remotely) and have a policy for maintaining regular backups.
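The three-copy recommendation can be sketched in a few lines. This example is an assumption-laden illustration rather than a complete backup tool: the function names sha256_of and copy_and_verify are invented here, and the backup directories stand in for whatever local and remote locations are actually mounted on your machine. Each copy is checked against a SHA-256 checksum of the original so that a corrupted copy is noticed straight away.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(original: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy `original` into each backup directory and verify every copy.

    `backup_dirs` is a hypothetical list, e.g. one external drive and one
    mounted network share, giving three copies in total with the original.
    """
    expected = sha256_of(original)
    copies = []
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps where the platform allows it.
        copy = Path(shutil.copy2(original, directory / original.name))
        if sha256_of(copy) != expected:
            raise IOError(f"Checksum mismatch for {copy}")
        copies.append(copy)
    return copies
```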

A guide to backing up your data

Questions to ask yourself

How will I back up my data? How regularly will backups be made? Will all data, or only changed data, be backed up? (A backup of changed data is known as an "incremental backup", while a backup of all data is known as a "full backup".) How often will full and incremental backups be made? How long will backups be stored?
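The difference between a full and an incremental backup can be shown with a short sketch. It assumes that "changed" means "modified after the time of the previous backup", judged by each file's modification time; real backup software may use other methods. The function name backup and its since parameter are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup(source: Path, destination: Path,
           since: Optional[float] = None) -> list[Path]:
    """Copy files from `source` into `destination`, keeping the folder layout.

    since=None      -> full backup: every file is copied.
    since=timestamp -> incremental backup: only files modified after that
                       Unix timestamp (e.g. the time of the last backup).
    `destination` should not be located inside `source`.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        if since is not None and path.stat().st_mtime <= since:
            continue  # unchanged since the last backup, so skip it
        target = destination / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

A schedule might call backup(source, destination) once a week and backup(source, destination, since=last_backup_time) on the other days.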

How much hard drive space, or how many Digital Video Discs (DVDs), will I require to maintain this backup schedule? If the data are sensitive, how will they be secured and (possibly) destroyed? What backup services are available that meet these needs and, if none, what will be done about it? Who will be responsible for ensuring backups are available?
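The storage question can be answered with simple arithmetic once a schedule is chosen. The sketch below assumes a schedule in which each retained full backup is followed by a fixed number of incremental backups of roughly equal size, with no compression; the function names and every figure in the usage note are illustrative assumptions. The default of 4.7 GB is the nominal capacity of a single-layer DVD.

```python
import math


def backup_storage_gb(full_size_gb: float, change_per_incremental_gb: float,
                      fulls_kept: int, incrementals_per_full: int) -> float:
    """Estimate the space needed to retain a set of backups.

    Each retained cycle holds one full backup plus its incremental backups.
    """
    per_cycle = full_size_gb + incrementals_per_full * change_per_incremental_gb
    return fulls_kept * per_cycle


def dvds_needed(total_gb: float, dvd_capacity_gb: float = 4.7) -> int:
    """Number of discs needed, rounding up to whole discs."""
    return math.ceil(total_gb / dvd_capacity_gb)
```

For instance, with 50 GB of data, about 2 GB changing per day, a weekly full backup with six daily incrementals, and four weeks retained, backup_storage_gb(50, 2, 4, 6) gives 248 GB, which dvds_needed(248) turns into 53 single-layer DVDs.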

In the following video Professor Lynn Jamieson from the University of Edinburgh talks about the importance of keeping regular backups of research data. 

Storing it in the Cloud

“Cloud storage is a model of networked enterprise storage where data is stored not only in the user's computer, but in virtualized pools of storage which are generally hosted by third parties, too.” http://en.wikipedia.org/wiki/Cloud_storage

Cloud services: Fortunately, 26 online backup services have been reviewed.

The University of Hertfordshire has reviewed the most popular cloud storage services. It has also analysed the pros and cons of their data and security policies as well as their costs and access. You can read it here: http://sitem.herts.ac.uk/rdm/files/Cloud_Storage_Review_v.1.2.pdf

Cloud Storage: Advantages and Disadvantages The following slide is taken from the Research Data MANTRA online course by Data Library and EDINA, University of Edinburgh & is licensed under a Creative Commons Attribution 2.5 UK: Scotland License.

Advantages: No user intervention is required (change tapes, label CDs, perform manual tasks). Remote backup maintains data offsite. Most provide versioning and encryption. They are multi-platform.
Disadvantages: Restoration of data may be slow (dependent upon network bandwidth). Stored data may not be entirely private (thus pre-encryption). Service provider may go out of business. Protracted intellectual property rights/copyright/data protection licences.

Access control

Data security is the means of ensuring that research data is kept safe from corruption and that access is suitably controlled. It is important to consider the security of your data to prevent: Accidental or malicious damage/modification to data. Theft of valuable data. Breach of confidentiality agreements & privacy laws. Premature release of data, which can void intellectual property claims. Release before data have been checked for accuracy and authenticity.

Access control. You need to consider the following questions for securing your research data: How will you manage access arrangements and data security? How will you enforce permissions, restrictions and embargoes? Where relevant, also consider other security issues such as sensitive data, off-network storage, storage on mobile devices (laptops, smartphones, flash drives, etc.) and policy on making copies of data.
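One small, concrete way to enforce permissions on a shared Unix-like system is through file modes. The following sketch is an assumption made for illustration, not a method prescribed by the course: it removes all group and other access so that only the owning account can read or change the data. The function name restrict_to_owner is hypothetical, and the approach does not carry over to Windows, where access control lists are used instead.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(path: Path) -> int:
    """Limit access to the owning user and return the resulting mode bits.

    Files become read/write for the owner only (0o600); directories also
    need the execute bit to be entered, so they become 0o700.
    """
    if path.is_dir():
        mode = stat.S_IRWXU
    else:
        mode = stat.S_IRUSR | stat.S_IWUSR
    os.chmod(path, mode)
    return stat.S_IMODE(path.stat().st_mode)
```

File modes only control access on the machine itself; they do not protect a stolen laptop or a copied disk, which is where encryption comes in.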

Encryption: There are a number of ways to encrypt your data where it is stored. Many software programs allow you to do this easily and are also free. See the following Wikipedia page: Comparison of disk encryption software

Encryption - TrueCrypt One of the most popular encryption tools is TrueCrypt. You can see why…

Other sessions as part of Data Management in Geoinformatics: Data Collection, Data Integration, Data Sharing. Data Management for Geoinformatics by John Murtagh as part of the Jisc funded project TraD (University of East London) is licensed under a Creative Commons Attribution Share Alike Licence.