Slide Template for Module 2: Types, Formats, and Stages of Data


Module 2: Data Types, Stages & Formats
Learner Objectives
1. Explain what research data is and the range of data types
2. Identify stages of research data
3. Identify common potential storage formats for data
4. Identify relevant quality control techniques/technical standards
5. Identify methods of recording data that are specific to researchers’ disciplines and research interests

Research Data Associated with Most Disciplines
- Images
- Video
- Mapping/GIS data
- Numerical measurements

Research Data Associated with Social Sciences
- Survey responses
- Focus group and individual interviews
- Economic indicators
- Demographics
- Opinion polling

Research Data Associated with Hard Sciences
- Measurements generated by sensors/laboratory instruments
- Computer modeling and simulations
- Observations and/or field studies
- Specimens

Of course, research data is not limited to these two broad disciplinary areas. For example, digital humanities scholars may generate research data by text mining historical documents and analyzing the prevalence of particular words.

While many types of data are generated in the course of research, existing data can also be used in research projects. Examples include social scientists using census data for their studies, or scientists comparing published gene sequences from various organisms.

Stages of Data Related to Research Data Life Cycle
- Raw Data: What is being measured or observed? The data being generated during the research project.
- Processed Data: Making the raw data useful/manipulable.
- Analyzed Data: Manipulated/interpreted data. What does the data tell us? Is it significant? How so?
- Finalized/Published Data: How do the data support your research question?
- Existing Data across Different Sources: e.g. GIS data.

Stages of Data Related to Research Data Life Cycle
That’s a bit abstract, so let’s try illustrating those stages with a sample research hypothesis: Water temperatures in Lake Superior are now significantly warmer than in previous years. This evidence lends support to global warming.

Using our sample hypothesis
Water temperatures in Lake Superior are now significantly warmer than in previous years. This evidence lends support to global warming.
- Raw Data = daily lake temperatures
- Processed Data = ‘cleaned’ temp. data in spreadsheet
- Analyzed Data = average temps., graphing changes
- Finalized Data = does data support the hypothesis?

Raw data: What is being measured or observed? This is the data that is being generated during the research project. An example of raw data in our hypothetical research question might be daily measurements of temperature in Lake Superior. Raw data coincides with the ‘creating data’ section of the research data lifecycle.

Processed data: How can the raw data be made useful/manipulable? To continue with the example above, lake temperature data may become processed once researchers ‘clean it’ by removing clearly erroneous temperature measurements from the data set and enter the remaining temperatures into a spreadsheet for manipulation and analysis. Processed data coincides with the ‘processing data’ section of the research data lifecycle.

Analyzed data: What does the data tell us? Is it significant? How so? For example, daily lake temperature data could be analyzed by finding average temperatures, looking at seasonal fluctuations, and generating graphs that demonstrate these changes. Analyzed data coincides with the ‘analysing data’ section of the research data lifecycle.

Finalized/published data: How does the data support your research question? For example, a plot of our average lake temperatures for 2013 may show statistically significant differences when compared to the same data from 1913 and 1963. Finalized/published data coincides with a few sections of the research data lifecycle, including preserving, giving access to, and re-using data.

Note that our lake temperature scenario is a very simplified example.
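The raw, processed, and analyzed stages of the lake temperature example can be sketched in a few lines of Python. This is only an illustration: the readings, the -999.0 error code, and the plausible range of -1.0 to 30.0 °C are made-up assumptions, and the function names are hypothetical rather than part of the original module.

```python
from statistics import mean

# Raw data: hypothetical daily readings as (date, temperature in °C).
# The values, including the -999.0 sensor error code, are invented for illustration.
raw_readings = [
    ("2013-07-01", 14.2),
    ("2013-07-02", 14.6),
    ("2013-07-03", -999.0),  # assumed sensor error code
    ("2013-07-04", 15.1),
    ("2013-07-05", 87.3),    # clearly erroneous for a lake
]


def clean(readings, low=-1.0, high=30.0):
    """Processed data: keep only readings inside an assumed plausible range."""
    return [(day, temp) for day, temp in readings if low <= temp <= high]


def average_temperature(readings):
    """Analyzed data: the mean of the remaining temperatures."""
    return mean(temp for _, temp in readings)


processed = clean(raw_readings)
analyzed = round(average_temperature(processed), 2)
print(len(processed), "readings kept; average", analyzed, "°C")
```

Keeping `raw_readings` unchanged and deriving `processed` from it mirrors the distinction between the raw and processed stages: the cleaning step can be repeated or revised without touching the original measurements.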

Preferable Format Types for Long-Term Access to Data
Data formats that offer the best chance for long-term access are both:
- Non-proprietary (also known as open), and
- Unencrypted and uncompressed

Non-proprietary or open formats are readable by more than just the equipment and/or program that generated them. Sometimes proprietary file formats are unavoidable. However, proprietary formats can often be converted to open formats.

Unencrypted/uncompressed formats offer the best prospects for long-term access. If files are encrypted and/or compressed, the method used will need to be both discoverable and usable for file access in the future.

Preferred Formats
Examples of preferred formats for various data types include:
- Moving images: MOV, MPEG
- Audio: WAVE, MP3
- Numbers/statistics: ASCII, SAS
- Images: TIFF, JPEG 2000
- Text: PDF/A, ASCII

For these and more, see:
https://lib.stanford.edu/data-management-services/file-formats
http://www.digitalpreservation.gov/formats/fdd/browse_list.shtml

Converting to Preferable Formats
Information can be lost when converting file formats, e.g., when converting from TIFF to JPEG. To mitigate the risk of lost information:
- Carefully note the steps taken during the conversion/migration
- Whenever possible, keep the original file as well as the converted one
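One way to follow both recommendations is to write the conversion as a small script that leaves the original file in place and appends a log entry for every conversion. The sketch below swaps in a tab-delimited-to-CSV conversion (rather than TIFF to JPEG) so that it needs only the Python standard library; the file names, the log layout, and the function names are all hypothetical.

```python
import csv
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Checksum of a file, so the logged original can be verified later."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def convert_tsv_to_csv(source, target, log_path):
    """Write a CSV copy of a tab-delimited file; the original is left untouched."""
    source, target = Path(source), Path(target)
    with source.open(newline="") as src, target.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter="\t"):
            writer.writerow(row)
    # Note the conversion step in a log file (one JSON object per line).
    entry = {
        "source": source.name,
        "source_sha256": sha256_of(source),
        "target": target.name,
        "step": "tab-delimited text rewritten as comma-separated text",
        "converted_at": datetime.now(timezone.utc).isoformat(),
    }
    with Path(log_path).open("a") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


# Demonstration in a temporary directory with a made-up file.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "lake_temps.tsv"
    original.write_text("date\ttemp_c\n2013-07-01\t14.2\n")
    entry = convert_tsv_to_csv(
        original,
        Path(workdir) / "lake_temps.csv",
        Path(workdir) / "conversion_log.jsonl",
    )
    original_kept = original.exists()
    converted_text = (Path(workdir) / "lake_temps.csv").read_text()
```

Recording a checksum of the original alongside the description of the step is a design choice, not a requirement from the module: it lets a later reader confirm that the retained original is the file the conversion started from.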

Describing Data, Documenting Reliability & Collection Techniques
Data documentation explains the who, what, where, when, and why of data:
- Who collected this data? Who/what were the subjects under study?
- What was collected, and for what purpose? What is the content/structure of the data?
- Where was this data collected? What were the experimental conditions?
- When was this data collected? Is it part of a series, or ongoing experiment?
- Why was this experiment performed?

Describing Data, Documenting Reliability & Collection Techniques
Who: Who collected this data? Who or what were the subjects under study?

Describing Data, Documenting Reliability & Collection Techniques
What: What data was collected, and for what purpose? What is the content and structure of the data?

Describing Data, Documenting Reliability & Collection Techniques
Where: Where was this data collected? What were the experimental conditions that produced it?

Describing Data, Documenting Reliability & Collection Techniques
When: When was the data collected? Is the data part of a series, or ongoing experiment?

Describing Data, Documenting Reliability & Collection Techniques
Why: Why was this experiment performed? How does it relate to your research question?
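The who/what/where/when/why questions can be captured in a small structured record stored next to the data files. The following is a minimal sketch, assuming a JSON file is an acceptable form of documentation for the project; the `DataDocumentation` class, its field names, and every value in the example are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DataDocumentation:
    """One record answering the who, what, where, when, and why of a data set."""
    who_collected: str
    subjects: str
    what: str
    structure: str
    where: str
    conditions: str
    when: str
    why: str


# All values below are placeholders based on the lake temperature example.
record = DataDocumentation(
    who_collected="J. Doe (placeholder name)",
    subjects="Surface water of Lake Superior",
    what="Daily water temperature readings",
    structure="CSV with columns: date, temp_c",
    where="Lake Superior",
    conditions="Single fixed sensor (assumed)",
    when="2013, part of an ongoing daily series",
    why="Test whether the lake is warmer than in previous years",
)

# Serialize the record so it can be saved alongside the data files.
readme_text = json.dumps(asdict(record), indent=2)
print(readme_text)
```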

Cross-Discipline Concerns
No matter what, you need to have:
- File naming conventions
- Version control

Why Do I Need to Worry about That?
Consider this: If you unexpectedly have to leave your research project for a few months, could a colleague easily make sense of your data files?

Why Use File Naming Conventions?
Naming conventions make life easier! They:
- Help you find your data
- Help others find your data
- Help track which version of a file is most current

What File Naming Convention Should I Use?
Has your research group established a convention? If not, general guidelines include:
- Use meaningful file names that aren’t too long
- Avoid certain characters
- Use dates, which can help with sorting and version control

Ultimately, the best file naming convention is one that works for you and your research project. Here are some points to consider:

Generally speaking, a good protocol is to give files meaningful, descriptive names while avoiding certain problematic characters (such as spaces, slashes, colons, periods, ampersands), which behave differently in different operating systems. Overly long file names (>25 characters) should also be avoided, as these may not cooperate well with different operating systems.

Consider, too, strategies you might use for version control: how will you keep track of which version of a file or document is the most current? Version control can be particularly challenging for collaborative projects in which several people may have access to, or even the ability to change, a file.
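As one possible way to apply these guidelines consistently, a short helper can build file names from a description, a date, and a version number. This is a sketch of a single convention, not a recommended standard: the `YYYYMMDD_description_vNN` pattern, the underscore replacement, and the function name are assumptions, and the only period allowed is the one before the file extension.

```python
import re
from datetime import date

MAX_LENGTH = 25  # length limit mentioned in the guidelines above
# Spaces, slashes, colons, periods, and ampersands are replaced.
PROBLEM_CHARS = re.compile(r"[ /\\:.&]")


def make_file_name(description, day, version, extension):
    """Build a name like 20130701_temp_raw_v01.csv (hypothetical pattern)."""
    stem = PROBLEM_CHARS.sub("_", description)
    stem = re.sub(r"_+", "_", stem).strip("_")  # collapse repeated underscores
    name = f"{day:%Y%m%d}_{stem}_v{version:02d}.{extension}"
    if len(name) > MAX_LENGTH:
        raise ValueError(f"{name!r} is longer than {MAX_LENGTH} characters")
    return name


print(make_file_name("temp raw", date(2013, 7, 1), 1, "csv"))
# prints 20130701_temp_raw_v01.csv
```

Putting the date first in year-month-day order means an alphabetical listing is also chronological, and the zero-padded version suffix keeps `v02` sorted before `v10`.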