Data types and representation

Presentation transcript:

Two types of data
Discrete: observations take a finite, or countable, number of values. eg. Number of “sixes” after 3 throws of the dice.
Continuous: observations can take any of the countless values in an interval. eg. The average IQ of ten random Heads of Department.
When conducting an experiment, we collect data, usually in the form of quantitative variables that are measured or observed. We refer to these values as response variables. Data will be either discrete or continuous. Discrete variables are obtained by counting, so only a finite or countable number of values is available. You can’t have 2.36 people present in the room, for example. Continuous variables are usually obtained by measuring: length, mass and time are all examples of continuous variables. Since continuous variables are real numbers, we usually round them; in other words, we limit the number of decimal places that we record.
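The distinction can be sketched in a few lines of Python. This is only an illustration: the dice throws and the ten IQ scores are hypothetical values invented for the example, not data from the presentation.

```python
from statistics import mean

# Discrete: obtained by counting.
# Hypothetical outcome of 3 throws of a die.
throws = [6, 2, 6]
sixes = throws.count(6)        # can only be 0, 1, 2 or 3

# Continuous: obtained by measuring.
# Hypothetical IQ scores of ten Heads of Department.
iq_scores = [102, 118, 95, 110, 124, 99, 107, 131, 113, 104]
average_iq = mean(iq_scores)   # a real number, not restricted to whole values

# Rounding puts a boundary on the number of decimal places we report.
reported_iq = round(average_iq)

print(sixes, average_iq, reported_iq)
```

The count can only ever be a whole number, whereas the average can fall anywhere in an interval and has to be rounded before it is reported.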

Types of measurements
Nominal: objects are named and assigned to classes, eg. male or female.
Ordinal: objects are either greater or smaller than a comparative object, eg. finishing positions in a race.
Ratio level: a basic standard interval exists, plus a meaningful zero, eg. mass.
Interval level: a basic standard interval is introduced, but there is no true zero, eg. temperature in Celsius.
We generally recognize four data scales, or levels of data. It’s important to work out which scale or level you are dealing with, because it will influence the type of analysis you use. The first type of data is referred to as nominal, and refers to categories. An example would be to assign a group of animals to either a male or a female group. Although the category is the actual value here, we are probably going to be more interested in the frequency of each group: how many of the animals are male, and how many are female. The second type of data is referred to as ordinal. These are ranks. For example, we rank the runners completing a race as first over the finishing line, second over the finishing line, and so on. The next two scales have a similarity that allows us to combine them. Ratio scales have a constant interval between successive values, and they also have a true zero: a value of zero means that none of the quantity is present. By constant interval we mean that the difference between successive values is the same everywhere on the scale. For example, the interval between 4 and 6 grams is the same as the interval between 10 and 12 grams. The other type of data on this scale, interval data, also has a constant interval between values, but there is no true zero: the zero point is arbitrary, so negative values can occur. An example would be temperature measured on the Celsius scale.
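One practical consequence of the difference between ratio and interval levels is which arithmetic is meaningful. The sketch below is an illustration only: the specific temperatures are assumed values, and the Kelvin conversion is brought in purely to show what changes when a scale has a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15

# Ratio level (mass): zero grams means no mass, so ratios are meaningful.
mass_ratio = 12 / 6                    # 12 g is twice as heavy as 6 g

# Both levels: equal differences mean the same thing anywhere on the scale.
same_interval = (6 - 4) == (12 - 10)

# Interval level (Celsius): the zero is arbitrary, so ratios mislead.
naive_ratio = 20 / 10                  # 20 °C is NOT "twice as hot" as 10 °C
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(mass_ratio, same_interval, naive_ratio, round(kelvin_ratio, 3))
```

On the Kelvin scale the two temperatures differ by only a few percent, which is why a ratio taken directly on Celsius values should not be interpreted.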

Summarising discrete data
Name        Eye Colour
Janice      brown
Tom         blue
Danielle    green
Ian
Eduardo
Emily
Anja
Cara
Adrian
Eric
Sarah
David
We are now going to look at some of the ways in which the different data types are summarised and represented, starting with discrete data. As we’ve already discussed, discrete data are variables that can be counted in whole numbers. As our example, we have a group of people whose eye colour we have noted and recorded. This allows us to look at the frequency of each eye colour within this particular sample, which we assume represents the larger population of people. We will use the frequency of each eye colour to address the question: which eye colour is most common among people?

Summarising discrete data: Frequency tables
Eye Colour    Frequency
Brown         33
Blue          14
Green         3
Here we have counted the number of people with each of the 3 possibilities (in other words, brown, blue or green eyes) and have recorded the frequency of each eye colour in a table. This allows us to see immediately that most of the people in our sample had brown eyes, and that green is the scarcest eye colour. Frequency distributions are important because they allow us to determine probability, which we will discuss at greater length later on.

Summarising discrete data: Frequency tables
Eye Colour    Frequency    Relative Frequency
Brown         33           66%
Blue          14           28%
Green         3            6%
An additional way to express the frequency of the different eye colours is to give the number of people with a particular eye colour as a percentage of the total number of people sampled (in other words, n). We refer to this as the relative frequency.
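The relative frequencies can be reproduced from the counts with a short calculation. The following is a minimal sketch: the function name relative_frequencies is a hypothetical helper, and only the three counts come from the table.

```python
# Frequencies taken from the eye-colour frequency table.
frequencies = {"Brown": 33, "Blue": 14, "Green": 3}


def relative_frequencies(counts):
    """Express each count as a percentage of the sample size n."""
    n = sum(counts.values())
    return {category: 100 * count / n for category, count in counts.items()}


relative = relative_frequencies(frequencies)
for colour, percent in relative.items():
    print(f"{colour:<6} {frequencies[colour]:>3} {percent:.0f}%")
```

Here n is 33 + 14 + 3 = 50, so brown eyes account for 33 / 50 = 66% of the sample.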

Summarising discrete data: Frequency bar graph
We will often want to represent the frequency distribution of the response variable we recorded in our sample, and we usually do so graphically. The most common display is a bar graph. When depicting discrete data, we make sure the bars do not touch, to indicate that the data are discrete.

Summarising discrete data: Relative frequency bar graph
In the same way, we can use a bar graph to depict the relative frequency distribution within our sample.
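As a rough illustration of a bar graph for discrete data, the sketch below draws a text-only chart from the eye-colour counts. The helper text_bar_chart is hypothetical, and a plotting library would normally be used instead; the blank line between bars stands in for the gap that keeps the bars from touching.

```python
# Frequencies taken from the eye-colour frequency table.
frequencies = {"Brown": 33, "Blue": 14, "Green": 3}


def text_bar_chart(counts, symbol="#"):
    """Return one text bar per category, separated by blank lines."""
    width = max(len(name) for name in counts)
    bars = [
        f"{name:<{width}} | {symbol * count} {count}"
        for name, count in counts.items()
    ]
    # A blank line between bars keeps them visually separate.
    return "\n\n".join(bars)


chart = text_bar_chart(frequencies)
print(chart)
```

Passing the relative frequencies instead of the raw counts would give the relative frequency version of the same chart, provided the percentages are first rounded to whole numbers.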

Summarising continuous data
Name        Hours of Sleep / Night
Janice      6
Tom         7.5
Danielle    10.5
Ian         9
Eduardo     7
Emily
Anja        8
Cara        5
Adrian      8.5
Eric        6.5
Sarah
David       4
Here we have a set of continuous data: the number of hours that each person in our sample slept per night was recorded. We want to know: how many hours of sleep did most people get per night? How was the amount of sleep per night distributed across the members (in other words, the experimental units) of our sample?

Summarising continuous data: Frequency tables
Hours of Sleep: 3 - 4 hrs, 4 - 5 hrs, 5 - 6 hrs, 6 - 7 hrs, 7 - 8 hrs, 8 - 9 hrs, 9 - 10 hrs, 10 - 11 hrs
Frequency: 1, 3, 6, 14, 16, 5, 2
By constructing a frequency table, we can answer these questions. First, we arrange our observations in order from lowest to highest. Then we decide on the class intervals that we want to sort the data into. Here we have used an interval of one hour, starting at 3 hours. We then record the number of people who had between 3 and 4 hours of sleep per night. Next is the number of people who slept between 4 and 5 hours per night, and so on, until we have accounted for all of the people in our sample. We can then see immediately how many hours most people in the sample group slept per night: in this example, between 7 and 8 hours.
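The sorting-into-classes procedure can be sketched as follows. Several things here are assumptions: the function frequency_table is a hypothetical helper, the input is only the ten sleep values that are legible in the table of raw observations (so the counts will not match the full frequency table), and a value on a boundary such as exactly 7 hours is placed in the upper class (7 - 8 hrs), which the presentation does not specify.

```python
import math
from collections import Counter

# The ten hours-of-sleep values that are legible in the raw data.
hours = [6, 7.5, 10.5, 9, 7, 8, 5, 8.5, 6.5, 4]


def frequency_table(values, start=3, stop=11, width=1):
    """Count values in classes [start, start + width), ... up to stop."""
    counts = Counter(
        math.floor((value - start) / width)
        for value in values
        if start <= value < stop
    )
    n_classes = int((stop - start) / width)
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


table = frequency_table(sorted(hours))
for lower, upper, frequency in table:
    print(f"{lower} - {upper} hrs  {frequency}")
```

Dividing each frequency by the number of observations gives the relative frequency for each class, in the same way as for discrete data.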

Summarising continuous data: Frequency bar graph (Histogram)
Again, we are able to depict the frequency with a bar graph, which in this case is specifically referred to as a histogram.

Summarising continuous data: Relative frequency bar chart
We are also able to calculate and graphically represent the relative frequency of the data. The number of people sleeping for a particular length of time is expressed relative to the total number of people in the sample group.

Summarising continuous data: Line graph
We can also express continuous data as data points joined by a line. For example, if we were to monitor the glucose in the leaf of a plant over time, we would sample each hour, and then express the glucose concentration with a continuous line to reflect the nature of the data: the sampling takes place over a period of time, and is sequential.

Summarising continuous data: Scatter graph
Another way to represent continuous data is to use data points that are not linked by a line, which we call a scatter graph. In the example here, we have the sugar concentration in the blood of a sample of 7 pigs. Each pig is a separate experimental unit, so we don’t link the data points (our observations) with a line, since they are not related in any way.