Published by Paula Curtis. Modified over 9 years ago.
1 Measures of Variability. Chapter 5 of Howell (except 5.3 and 5.4). People are all slightly different (that’s what makes it fun). Not everyone scores the same on the same scale. This is interesting for us, and we must take it into account. The variation tells us about the people we studied.
2 Example of variability. Imagine this variable: 5 7 3 8 2 2 9 1 9 3. The mean is 4.9. We sort of expect 4.9 to be representative of the scores, but the data is at the edges - not at all close to 4.9!
3 A second sample. Look at this one: 4 4 4 5 5 5 5 6 6. The mean is also about 4.9 (44 / 9 ≈ 4.89). But look at the distribution: same mean as before, but the numbers are clustered very close to the mean!
4 How do we explain this difference? Both have the same mean, so the mean obviously doesn’t tell the whole story! What is the actual difference between those data sets? The first sample is more “spread out” than the second.
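As a quick check on the two samples above, the following sketch computes both means and the smallest and largest value in each. The variable names `sample_a` and `sample_b` are labels chosen here for illustration; they do not appear in the slides.

```python
from statistics import mean

# The two samples from slides 2 and 3.
sample_a = [5, 7, 3, 8, 2, 2, 9, 1, 9, 3]
sample_b = [4, 4, 4, 5, 5, 5, 5, 6, 6]

mean_a = mean(sample_a)
mean_b = mean(sample_b)

# Both means round to the same value...
print(round(mean_a, 1), round(mean_b, 1))

# ...but the scores stretch over very different intervals.
print(min(sample_a), max(sample_a))
print(min(sample_b), max(sample_b))
```

The means agree to one decimal place, while the first sample reaches from 1 to 9 and the second only from 4 to 6.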
5 Variability. Measures of variability capture this “spreadness” of the data (not applicable to nominal variables). There are various ways to measure it: How far does the data stretch? How far, on average, is it spread from the mean?
6 Extents of the data - the range. The range is the total width of the data. Consider x, with a sample 7 4 3 4 5 6 3. These values range all the way from 3 (the smallest value) to 7 (the biggest value), so its range is 4. Easy to calculate: range(x) = max(x) - min(x) (the largest value of x minus the smallest value of x). A high range value means the data is very spread out.
7 Example: calculating the range
Calculate the range for x, from the sample: 26 28 32 15 25 12
Step 1 - find the largest value of x in this sample: it is 32
Step 2 - find the smallest value of x in this sample: it is 12
Step 3 - biggest minus smallest: 32 - 12 = 20
The range is 20
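The range calculation can be sketched in a couple of lines of Python. The function name `value_range` is a hypothetical helper introduced here, not something from the slides.

```python
def value_range(xs):
    """Range of a sample: largest value minus smallest value."""
    return max(xs) - min(xs)


# Sample from slide 6: smallest is 3, largest is 7.
print(value_range([7, 4, 3, 4, 5, 6, 3]))      # 4

# Sample from slide 7: smallest is 12, largest is 32.
print(value_range([26, 28, 32, 15, 25, 12]))   # 20
```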
8 Why the range is cool / why it sucks. Gives an idea of how far spread the data is: a higher range number means the data is more spread apart. Can compare various samples’ ranges to see which is spread the most. But: it can’t distinguish between these two samples (both have range = 10).
9 A better idea of variation. The right histogram shows more clustering, but has a few values which “throw off” the range. The range can be fooled by “extreme values” - outliers. There exist better measures which are more “outlier proof”.
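To make the weakness concrete, here is a small sketch with two made-up samples. The data are invented for illustration, since the histograms from the slides are not available. One sample is evenly spread, the other is tightly clustered apart from two extreme values.

```python
# Hypothetical data, not taken from the slides.
evenly_spread = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
clustered_with_outliers = [0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 10]

range_spread = max(evenly_spread) - min(evenly_spread)
range_clustered = max(clustered_with_outliers) - min(clustered_with_outliers)

# The range only looks at the two most extreme scores,
# so it reports the same value for both samples.
print(range_spread, range_clustered)
```

Both ranges come out as 10, even though nearly every score in the second sample is exactly 5.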
10 Outlier proofing - Variance. The variance presents a better measure of data spread: it uses every score, so it is not as easily thrown off by one or two extreme values as the range is (although outliers still inflate it). Variance is based on the average squared distance of the scores from the mean. It is not on the variable’s scale: the variance is not in the same units as the variable. Still useful - bigger values mean more spread.
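11 Calculating variance (brace yourself). Variance is calculated using a formula: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$. Variance is (roughly) the mean of the squared deviations of the observations from their mean; the sum is divided by n - 1 rather than n.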
12 Calculating variance (in English)
Easy if broken down into 5 small steps!
Step 1: Work out the mean of x, and n
Step 2: For each data point, work out the deviation (x minus the mean of x)
Step 3: For each data point, square the deviation you got above
Step 4: Add all the squared deviations together
Step 5: Divide your sum by n minus 1
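The five steps map directly onto a short function. This is a minimal sketch; the name `sample_variance` is chosen here for illustration, and the result is compared against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
from statistics import variance


def sample_variance(xs):
    # Step 1: work out n and the mean of x.
    n = len(xs)
    mean_x = sum(xs) / n
    # Step 2: deviation of each data point from the mean.
    deviations = [x - mean_x for x in xs]
    # Step 3: square each deviation.
    squared = [d ** 2 for d in deviations]
    # Step 4: add the squared deviations together.
    total = sum(squared)
    # Step 5: divide the sum by n minus 1.
    return total / (n - 1)


data = [16, 12, 15, 14, 20]
print(round(sample_variance(data), 2))
print(variance(data))
```

Both calls should agree up to floating-point rounding.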
13 Example: working out $s^2$
Work out the variance for x, based on the sample: 16, 12, 15, 14, 20
By the numbers!
Step 1: work out the mean and n
n is 5
16 + 12 + 15 + 14 + 20 = 77
77 / 5 = 15.4
The mean is 15.4
14 Example: working out $s^2$
For the remaining steps, make yourself a table with three columns:
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$
Each column is a step - fill in one at a time
15 Example: working out $s^2$
Step 2: Work out the deviation (x minus mean of x)
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$
16 | 0.6 |
12 | -3.4 |
15 | -0.4 |
14 | -1.4 |
20 | 4.6 |
16 Example: working out the variance
Step 3: Square the deviations (column 2 times column 2)
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$
16 | 0.6 | 0.36
12 | -3.4 | 11.56
15 | -0.4 | 0.16
14 | -1.4 | 1.96
20 | 4.6 | 21.16
17 Example: working out the variance
Step 4: sum the squared deviations
0.36 + 11.56 + 0.16 + 1.96 + 21.16 = 35.2
Step 5: divide the sum by (n - 1)
n = 5, so n - 1 = 4
35.2 / 4 = 8.8
The variance of this data set is 8.8
Simple, but tedious!
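The worked example can be reproduced by building the same three-column table in code. This is only a sketch for checking the arithmetic; the variable names are chosen here, and values are rounded for display because floating-point subtraction leaves tiny errors (for example 16 - 15.4 is stored as 0.5999...).

```python
sample = [16, 12, 15, 14, 20]
n = len(sample)
mean_x = sum(sample) / n

# One row per data point: x, deviation, squared deviation.
rows = [(x, x - mean_x, (x - mean_x) ** 2) for x in sample]
for x, dev, sq in rows:
    print(f"{x:>3} {dev:>6.1f} {sq:>7.2f}")

# Step 4: sum of the squared deviations; step 5: divide by n - 1.
sum_sq = sum(sq for _, _, sq in rows)
var_x = sum_sq / (n - 1)
print(f"sum of squares = {sum_sq:.1f}, variance = {var_x:.1f}")
```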
18 Variance: The bad news. Variance is a good measure of spread, but it is in odd units. A bigger number means more spread, but the number itself means very little. Because we square in the formula, we cause the numbers to lose their scale: the variance of an IQ scale is not in IQ points. Would be nice to have a measure of variation which is in the correct units!
19 The Standard Deviation. The standard deviation is a measure of variation. It has all the good properties of the variance PLUS it is in the same scale as the variable: the standard deviation of IQ scores is expressed in IQ points. It gives an intuitive understanding of how far apart the scores truly are spread, e.g. “Scores were centered at 100 and spread by 15”.
20 Calculating the standard deviation. Very simple formula: $s = \sqrt{s^2}$. To work it out, calculate the variance and then take its square root.
21 Example: working out s
Work out the standard deviation for x, based on the sample: 16, 12, 15, 14, 20
Step 1: Work out the variance: $s^2 = 8.8$ (from the previous example)
Step 2: Find the square root: $\sqrt{8.8} = 2.966$
The standard deviation is 2.966
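A short sketch to confirm this result, taking the square root of the variance by hand and then comparing with `statistics.stdev` from the Python standard library (which uses the same n - 1 divisor). The variable names are chosen here for illustration.

```python
import math
from statistics import stdev

sample = [16, 12, 15, 14, 20]

# Step 1: the variance from the previous example.
variance_x = 8.8
# Step 2: take the square root.
sd_from_variance = math.sqrt(variance_x)

# Cross-check: compute the standard deviation straight from the data.
sd_direct = stdev(sample)

print(round(sd_from_variance, 3))
print(round(sd_direct, 3))
```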
22 Variance and standard deviation. If you have the variance, it is easy to work out the standard deviation: take the square root of the variance. If you have the standard deviation, it is easy to work out the variance: square it.
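The conversion in both directions, sketched with the IQ standard deviation of 15 quoted on the earlier slide. The variable names are assumptions made for this example.

```python
import math

sd_iq = 15

# Standard deviation -> variance: square it.
variance_iq = sd_iq ** 2

# Variance -> standard deviation: take the square root.
sd_back = math.sqrt(variance_iq)

print(variance_iq, sd_back)
```

Note that the variance (225) is in squared IQ points, while the standard deviation is back on the IQ scale.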
23 Using the standard deviation with the mean
By looking at the mean and std deviation at the same time, we can get a good idea of a variable:
A - Mean: 5.35, Std dev: 1.008
B - Mean: 5.35, Std dev: 2.3
24 Understanding distributions. The mean tells us the “middle” of the distribution. The standard dev tells us the “spreadness” of the data. From this we can derive a lot: a low std dev means that everyone scored almost the same; a high std dev tells you there was a lot of disagreement.
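To see the mean and standard deviation working together, this sketch summarises the two samples from slides 2 and 3. The data behind distributions A and B are not given in the slides, so the earlier samples are used instead; the helper name `summarise` is hypothetical.

```python
from statistics import mean, stdev


def summarise(xs):
    """Return (mean, standard deviation) of a sample."""
    return mean(xs), stdev(xs)


spread_out = [5, 7, 3, 8, 2, 2, 9, 1, 9, 3]   # sample from slide 2
clustered = [4, 4, 4, 5, 5, 5, 5, 6, 6]       # sample from slide 3

mean_s, sd_s = summarise(spread_out)
mean_c, sd_c = summarise(clustered)

print(f"spread out: mean {mean_s:.2f}, std dev {sd_s:.2f}")
print(f"clustered:  mean {mean_c:.2f}, std dev {sd_c:.2f}")
```

The two means are nearly identical, but the standard deviation of the first sample is several times larger, which is exactly the difference the mean alone could not show.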