1
Data types and representation
Two types of data:
Discrete: observations take a finite, or countable, number of values. eg. Number of “sixes” after 3 throws of the dice
Continuous: observations can take any of the countless values in an interval. eg. The average IQ of ten random Heads of Department

When conducting an experiment, we collect data, usually in the form of quantitative variables that are measured or observed. We refer to these values as response variables. Data will be either discrete or continuous.

Discrete variables are obtained by counting. There is a finite, or countable, number of possible values with discrete data. You can’t have 2.36 people present in the room, for example.

Continuous variables are usually obtained by measuring. Length, mass and time are all examples of continuous variables. Since continuous variables are real numbers, we usually round them; in other words, we put a limit on the number of decimal places.
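The difference between counting and measuring can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: it lists every possible number of sixes in three throws of a die, and then rounds a measurement to two decimal places. The length used is a made-up value chosen purely to show the rounding.

```python
from itertools import product

# Discrete: count the sixes in every possible outcome of 3 throws of a die.
faces = range(1, 7)
possible_counts = sorted({throws.count(6) for throws in product(faces, repeat=3)})
print(possible_counts)  # [0, 1, 2, 3]

# Continuous: a measurement is a real number, so we report it rounded.
# The value below is hypothetical, chosen only to illustrate rounding.
measured_length_cm = 12.3456789
reported_length_cm = round(measured_length_cm, 2)
print(reported_length_cm)  # 12.35
```

The count of sixes can only ever be one of four whole numbers, whereas the measured length could be reported to as many decimal places as the instrument allows.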
2
Data types and representation
Types of measurements
Nominal: objects are named and assigned to classes. eg. male or female
Ordinal: objects are ranked as greater or smaller than one another. eg. finishing positions in a race
Interval level: a basic standard interval is introduced, but there is no true zero. eg. temperature in celsius
Ratio level: a basic standard interval exists, plus a meaningful zero. eg. mass

We generally recognize four data scales, or levels of data. It’s important that you figure out what type of data scale or level you are dealing with, because it will influence the type of analysis you will use.

The first type of data we will discuss is referred to as nominal, and refers to categories. An example would be to assign a group of animals to either a male or a female group. Although the sex is the actual value here, we are probably going to be more interested in the frequency of each group; in other words, how many of the animals in the group are male, and how many are female.

The second type of data we will discuss we refer to as ordinal. These are ranks. For example, we rank the position of runners completing a race as first over the finishing line, second over the finishing line and so on.

The next two scales have a similarity that allows us to discuss them together. Ratio scales have a constant interval between successive values, and they also have a true zero (in other words, zero means that none of the quantity is present, so there are no negative values). By constant interval we mean that the difference between successive values is the same everywhere on the scale. For example, the interval between 4 and 6 grams is the same as between 10 and 12 grams. The other type of data on this scale, interval data, also have a constant interval between values, but there is no true zero (in other words, the zero point is arbitrary, and negative values are possible). An example would be temperature measured on a celsius scale.
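A short Python sketch may help show why the true zero matters. It is an illustration under stated assumptions, not part of the original slides: the temperatures of 10 and 20 degrees are illustrative values, and the helper `celsius_to_kelvin` is a name introduced here. On a ratio scale such as mass, ratios are meaningful. On an interval scale such as celsius, they are not, which becomes visible once the same temperatures are converted to kelvin, a scale that does have a true zero.

```python
# Ratio scale (mass in grams): equal intervals, and ratios are meaningful
# because 0 g means "no mass".
assert 6 - 4 == 12 - 10          # same interval anywhere on the scale
mass_ratio = 12 / 6              # 12 g really is twice 6 g

# Interval scale (temperature in celsius): equal intervals, but the zero is
# arbitrary, so ratios are not meaningful.
def celsius_to_kelvin(temp_c):
    """Convert celsius to kelvin, a temperature scale with a true zero."""
    return temp_c + 273.15

# 10 and 20 degrees are illustrative values, not data from the slides.
naive_ratio = 20 / 10
true_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)
print(mass_ratio, naive_ratio, round(true_ratio, 3))
```

The naive ratio suggests that 20 degrees is "twice as hot" as 10 degrees, but the ratio on the kelvin scale is only slightly above 1.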
3
Data types and representation
Summarising discrete data:

Name      Eye Colour
Janice    brown
Tom       blue
Danielle  green
Ian
Eduardo
Emily
Anja
Cara
Adrian
Eric
Sarah
David

We are now going to look at some of the different ways in which the different data types are summarised and represented, starting with discrete data. As we’ve already discussed, discrete data are variables whose values are counted in whole numbers. As our example, we have a group of people whose eye colour we have noted and recorded. This will allow us to look at the frequency of each eye colour within this particular sample of people, who we assume represent the larger population of people. We will use the frequency of each eye colour to address the question: which eye colour is most common among people?
4
Data types and representation
Summarising discrete data: Frequency tables

Eye Colour  Frequency
Brown       33
Blue        14
Green       3

Here we have counted the number of people with each of the 3 possibilities (in other words, brown, blue or green eyes) and have recorded the frequency of each eye colour in a table. This allows us to note immediately that most of the people in our sample had brown eyes, and that green is the scarcest eye colour. Frequency distributions are important because they allow us to determine probability, which we will discuss at greater length later on.
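Counting of this kind can be sketched with Python's `collections.Counter`. The slide gives only the totals, so the list of individual observations below is rebuilt from those totals; that reconstruction is an assumption made for the sake of the example.

```python
from collections import Counter

# Rebuild one observation per person from the counts on the slide
# (an assumption: the slide lists only the totals, not all 50 records).
observations = ["brown"] * 33 + ["blue"] * 14 + ["green"] * 3

# Count how often each eye colour occurs.
frequency = Counter(observations)
for colour, count in frequency.most_common():
    print(f"{colour:<6} {count}")

# The most frequent category answers "which eye colour is most common?"
most_common_colour = frequency.most_common(1)[0][0]
```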
5
Data types and representation
Summarising discrete data: Frequency tables

Eye Colour  Frequency  Relative Frequency
Brown       33         66%
Blue        14         28%
Green       3          6%

An additional way in which we can express the frequency of the different eye colours we noted is to express the number of people with a particular eye colour as a percentage of the total number of people sampled (in other words, n). We refer to this as the relative frequency.
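The calculation can be sketched as follows, using the frequencies from the table. The variable names are chosen here for illustration.

```python
# Frequencies from the eye colour table.
frequency = {"Brown": 33, "Blue": 14, "Green": 3}

# n is the total number of people sampled.
n = sum(frequency.values())

# Relative frequency: each count as a percentage of n.
relative_frequency = {colour: 100 * count / n for colour, count in frequency.items()}

for colour, percent in relative_frequency.items():
    print(f"{colour:<6} {percent:.0f}%")
```

With n = 50, this reproduces the 66%, 28% and 6% shown in the table.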
6
Data types and representation
Summarising discrete data: Frequency bar graph

We will often want to represent the frequency distribution of the response variable we recorded in our sample, and we usually do so graphically. The most common method is a bar graph. When depicting discrete data, we make sure the bars do not touch, in order to indicate that the data are discrete.
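As a rough stand-in for the graph on the slide, the sketch below draws a text-only bar chart from the eye colour frequencies. It uses only the standard library; `text_bar_chart` is a helper name introduced here, and the blank line between bars is simply a way of mimicking the gaps between the bars of a discrete bar graph.

```python
# Frequencies from the eye colour table.
frequency = {"Brown": 33, "Blue": 14, "Green": 3}

def text_bar_chart(freq):
    """Return one bar per category, drawn with one '#' per observation."""
    width = max(len(label) for label in freq)
    lines = []
    for label, count in freq.items():
        lines.append(f"{label:<{width}} | {'#' * count} {count}")
    # A blank line between bars: for discrete data the bars do not touch.
    return "\n\n".join(lines)

chart = text_bar_chart(frequency)
print(chart)
```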
7
Data types and representation
Summarising discrete data: Relative frequency bar graph

In the same way, we can use a bar graph to depict the relative frequency distribution that exists within our sample.
8
Data types and representation
Summarising continuous data:

Name      Hours of Sleep / Night
Janice    6
Tom       7.5
Danielle  10.5
Ian       9
Eduardo   7
Emily
Anja      8
Cara      5
Adrian    8.5
Eric      6.5
Sarah
David     4

Here we have a set of continuous data. The number of hours that each person in our sample slept per night was recorded. We want to know: how many hours of sleep did most people get per night? How was the amount of sleep per night distributed across the members (in other words, the experimental units) of our sample?
9
Data types and representation
Summarising continuous data: Frequency tables

Hours of Sleep  Frequency
3 - 4 hrs       1
4 - 5 hrs       3
5 - 6 hrs       6
6 - 7 hrs       14
7 - 8 hrs       16
8 - 9 hrs       5
hrs             2
hrs

By constructing a frequency table, we can answer these questions. First, we arrange our observations in order from lowest to highest. Then we decide on the class intervals that we want to sort the data into. Here we have used an interval of one hour, starting with 3 hours. We then record the number of people who had between 3 and 4 hours of sleep per night. Next is the number of people who slept between 4 and 5 hours per night, and so on, until we have accounted for all of the people within our sample. We can then see immediately how many hours of sleep most people within the sample group got per night; in this example, it is between 7 and 8 hours.
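The sorting-into-classes step can be sketched in Python. This example uses only the ten hours-of-sleep values that are legible in the sample table of names, so its counts will not match the frequency table above, which covers a larger group. It also assumes that each class includes its lower bound and excludes its upper bound; the slides do not say which class a boundary value such as exactly 7 hours belongs to.

```python
import math
from collections import Counter

# Hours of sleep for the ten people whose values are legible in the sample table.
hours = [6, 7.5, 10.5, 9, 7, 8, 5, 8.5, 6.5, 4]

# Step 1: arrange the observations from lowest to highest.
ordered_hours = sorted(hours)

# Step 2: sort each value into a one-hour class, keyed by its lower bound.
# Assumption: a class includes its lower bound and excludes its upper bound,
# so a value of exactly 7 falls in "7 - 8 hrs".
class_counts = Counter(math.floor(h) for h in ordered_hours)

# Step 3: print the frequency table, one class per line.
for lower in sorted(class_counts):
    print(f"{lower} - {lower + 1} hrs  {class_counts[lower]}")
```

A different boundary convention would move values such as 6, 7 and 8 into the class below, so the convention should be stated alongside any frequency table of continuous data.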
10
Data types and representation
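Summarising continuous data: Frequency bar graph (Histogram)

Again, we are able to depict the frequency with a bar graph, which in the case of continuous data is specifically referred to as a histogram. Because the classes follow one another without gaps, the bars of a histogram touch.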
11
Data types and representation
Summarising continuous data: Relative frequency bar chart

We are also able to calculate and graphically represent the relative frequency of the data. The number of people sleeping for a particular length of time is expressed relative to the total number of people in the sample group.
12
Data types and representation
Summarising continuous data: Line graph

We can also express continuous data as data points joined by a line. For example, if we were to monitor the glucose in the leaf of a plant over time, we would sample each hour, and then plot the glucose concentration as a continuous line to reflect the nature of the data; in other words, to show that the sampling takes place over a period of time and is sequential.
13
Data types and representation
Summarising continuous data: Scatter graph

Another way in which we can represent continuous data is to use data points that are not linked by a line, which we call a scatter graph. In the example here, we have the sugar concentration in the blood of a sample of 7 pigs. Each pig is a separate experimental unit, so we don’t link the data points (our observations) with a line, since they are not related in any way.