
Measures of Variability

Why are measures of variability important? Why not just stick with the mean?
• Ratings of attractiveness (out of 10): Mean = 5
• Everyone rated you a 5 (low variability)
    What could we conclude about attractiveness from this?
• People’s ratings fell into a range from 1 to 10 that averaged a 5 (high variability)
    What could we conclude about attractiveness from this?

Measures of Variability
• Range
• Interquartile Range
• Average Deviation
• Variance
• Standard Deviation

Measures of Variability
Range
• The difference between the highest and lowest values in a dataset
• Heavily biased by outliers
    Dataset #1: 5 7 11            Range = 6
    Dataset #2: 5 7 11 million    Range = 10,999,995
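The two datasets above can be checked with a minimal sketch. The function name `value_range` is a hypothetical helper introduced here for illustration, not something from the slides.

```python
def value_range(scores):
    """Range: the difference between the highest and lowest values."""
    return max(scores) - min(scores)

dataset_1 = [5, 7, 11]
dataset_2 = [5, 7, 11_000_000]  # same data, but with one extreme outlier

print(value_range(dataset_1))  # 6
print(value_range(dataset_2))  # 10999995
```

Changing a single score moves the range from 6 to almost 11 million, which is what "heavily biased by outliers" means in practice.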

Measures of Variability
Interquartile Range
• The difference between the highest and lowest values in the middle 50% of a dataset
• Less biased by outliers than the Range
• Based on a sample with the upper and lower 25% of the data “trimmed”
• However, this kind of trimming essentially ignores half of your data; it is better to trim the top and bottom 1 or 5%
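The trimming idea can be sketched as follows, assuming the simplest reading of the slide: sort the scores, drop a fixed proportion from each end, and take the range of what is left. The helper `trimmed_range` and the eight scores are hypothetical and chosen only for illustration. Statistical packages usually compute quartiles by interpolation, so their interquartile range may differ slightly from this one.

```python
def trimmed_range(scores, proportion):
    """Range of the scores left after trimming `proportion` from each end."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)   # number of scores cut from each end
    kept = ordered[k:len(ordered) - k]
    return max(kept) - min(kept)

scores = [2, 4, 5, 7, 8, 9, 11, 100]     # hypothetical data with one outlier

print(trimmed_range(scores, 0.0))    # plain range: 98
print(trimmed_range(scores, 0.25))   # middle 50% only: 4
```

With 25% trimmed from each end the outlier no longer matters, but only four of the eight scores contribute to the result.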

Measures of Variability
Average Deviation
• For each score, calculate its deviation from the mean, then sum all of these deviations
• However, this sum will always equal zero
Dataset: 19, 16, 20, 17, 20, 19, 7, 11, 10, 19, 14, 11, 6, 11, 14, 19, 20, 17, 4, 11
• $\sum X = 285$
• $\bar{X} = 285/20 = 14.25$

$\sum (\text{Diff. from Mean}) = 0$, so $0/N = 0$
www.randomizer.org

Data    Mean     Diff. from Mean
19      14.25      4.75
16      14.25      1.75
20      14.25      5.75
17      14.25      2.75
20      14.25      5.75
19      14.25      4.75
 7      14.25     -7.25
11      14.25     -3.25
10      14.25     -4.25
19      14.25      4.75
14      14.25     -0.25
11      14.25     -3.25
 6      14.25     -8.25
11      14.25     -3.25
14      14.25     -0.25
19      14.25      4.75
20      14.25      5.75
17      14.25      2.75
 4      14.25    -10.25
11      14.25     -3.25
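A short sketch, using the 20 scores from the slide, confirms that the deviations from the mean cancel out. The variable names are illustrative only.

```python
scores = [19, 16, 20, 17, 20, 19, 7, 11, 10, 19,
          14, 11, 6, 11, 14, 19, 20, 17, 4, 11]

mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]  # the "Diff. from Mean" column

print(sum(scores))       # 285
print(mean)              # 14.25
print(sum(deviations))   # 0.0
```

Positive and negative deviations always balance around the mean, so their plain sum cannot serve as a measure of variability.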

Measures of Variability
Variance
• Sample Variance: $s^2 = \sum (X - \bar{X})^2 / (n - 1)$
• Population Variance: $\sigma^2 = \sum (X - \mu)^2 / N$
Note the use of squared units!
• Squaring gets rid of the positive and negative values in our “Diff. from Mean” column that previously added up to 0
• However, because we’re squaring our values, they will not be in the metric of our original scale
• If we calculate the variance for a test out of 100, a variance of 100 actually corresponds to an average variability of 10 pts. ($\sqrt{100} = 10$) about the mean of the test

$\sum (\text{Diff. from Mean})^2 = 493.75$
Variance = 493.75/(20 - 1) = 25.99

Data    Mean     Diff. from Mean    (Diff. from Mean)^2
19      14.25      4.75              22.56
16      14.25      1.75               3.06
20      14.25      5.75              33.06
17      14.25      2.75               7.56
20      14.25      5.75              33.06
19      14.25      4.75              22.56
 7      14.25     -7.25              52.56
11      14.25     -3.25              10.56
10      14.25     -4.25              18.06
19      14.25      4.75              22.56
14      14.25     -0.25               0.06
11      14.25     -3.25              10.56
 6      14.25     -8.25              68.06
11      14.25     -3.25              10.56
14      14.25     -0.25               0.06
19      14.25      4.75              22.56
20      14.25      5.75              33.06
17      14.25      2.75               7.56
 4      14.25    -10.25             105.06
11      14.25     -3.25              10.56
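The table above can be reproduced with a minimal sketch that squares each deviation and divides by n - 1. The variable names are illustrative, and the cross-check against `statistics.variance` from the Python standard library is an addition, not part of the slides.

```python
import statistics

scores = [19, 16, 20, 17, 20, 19, 7, 11, 10, 19,
          14, 11, 6, 11, 14, 19, 20, 17, 4, 11]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]
sum_of_squares = sum(squared_deviations)

sample_variance = sum_of_squares / (len(scores) - 1)   # divide by n - 1
population_variance = sum_of_squares / len(scores)     # divide by N

print(sum_of_squares)                  # 493.75
print(round(sample_variance, 2))       # 25.99
print(round(population_variance, 2))   # 24.69

# Cross-check against the standard library's sample variance
print(abs(sample_variance - statistics.variance(scores)) < 1e-9)  # True
```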

Measures of Variability
Standard Deviation
• Sample Standard Deviation: $s = \sqrt{\sum (X - \bar{X})^2 / (n - 1)}$
• Population Standard Deviation: $\sigma = \sqrt{\sum (X - \mu)^2 / N}$
Note that the formula is identical to the Variance except that, after everything else, you take the square root!
You can interpret the standard deviation directly, without the mental math you needed for the variance
• Variance = 25.99
• Standard Deviation = $\sqrt{25.99} = 5.10$
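As a rough check of the value above, the standard library's `statistics.stdev` (sample, n - 1) and `statistics.pstdev` (population, N) can be applied to the same 20 scores used for the variance. This is a sketch for comparison, not the slides' own procedure.

```python
import statistics

scores = [19, 16, 20, 17, 20, 19, 7, 11, 10, 19,
          14, 11, 6, 11, 14, 19, 20, 17, 4, 11]

sample_sd = statistics.stdev(scores)       # square root of the n - 1 variance
population_sd = statistics.pstdev(scores)  # square root of the N variance

print(round(sample_sd, 2))       # 5.1
print(round(population_sd, 2))   # 4.97
```

Unlike the variance, both values are back in the original units of the scores.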

Measures of Variability
Standard Deviation
• Example: Bush/Cheney: 55%, Kerry/Edwards: 40%
• Margin of Error = 30%
    Bush/Cheney: 25% to 85%
    Kerry/Edwards: 10% to 70%

Computational Formula for Variability
Definitional Formula
• Designed more to illustrate how the formula relates to the concept it underlies
Computational Formula
• Gives the same result as the definitional formula, but is different in form
• Allows you to compute variability with less effort
• Particularly useful with large datasets

Computational Formula for Variability
Definitional Formula for Variance:
• $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
Computational Formula for Variance:
• $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}$
All you need to plug in here is $\sum X^2$ and $\sum X$ (along with $N$)
Standard deviation still = $\sqrt{s^2}$, no matter how it is calculated

Computational Formula for Variability
Definitional Formula for Standard Deviation:
• $s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
Computational Formula for Standard Deviation:
• $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}$

Computational Formula for Variability
Example:
• For the following dataset, compute the variance and standard deviation: 1 2 2 3 3 3 4 5

Data (X)    X^2
1            1
2            4
2            4
3            9
3            9
3            9
4           16
5           25

• $\sum X = 23$
• $\sum X^2 = 77$
• $s^2 = \frac{77 - \frac{(23)^2}{8}}{8 - 1} = \frac{77 - 66.125}{7} = \frac{10.875}{7} = 1.55$
• $s = \sqrt{10.875/7} \approx 1.25$
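The worked example can be sketched in code to show that the computational and definitional forms agree. The computational form here is the one implied by the arithmetic in the example, with the sum of squares minus the squared sum over N, all divided by N - 1. Variable names are illustrative.

```python
import math

data = [1, 2, 2, 3, 3, 3, 4, 5]
n = len(data)

# Computational form: only the sum of X and the sum of X squared are needed
sum_x = sum(data)                      # 23
sum_x2 = sum(x ** 2 for x in data)     # 77
s2_computational = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Definitional form: squared deviations from the mean
mean = sum_x / n
s2_definitional = sum((x - mean) ** 2 for x in data) / (n - 1)

print(round(s2_computational, 2))             # 1.55
print(round(s2_definitional, 2))              # 1.55
print(round(math.sqrt(s2_computational), 2))  # 1.25
```

The computational form needs only one pass over the data and no list of deviations, which is the saving of effort the slides refer to.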

Measures of Variability
• What do you think will happen to the standard deviation if we add a constant (say 4) to all of our scores?
• What if we multiply all the scores by a constant?

Measures of Variability
Characteristics of the Standard Deviation
• Adding a constant to each score will not alter the standard deviation
    e.g., add 3 to all scores in a sample and your s will remain unchanged
    Let’s say our scores originally ranged from 1 to 10
    • Add 5 to all scores, and the new data range from 6 to 15
    • In both cases the range is 9

Measures of Variability  However, multiplying or dividing each score by a constant causes the s to be similarly multiplied or divided by that constant (and s 2 by the square of the constant) i.e. divide each score by 2 and your s will decrease from 10 to 5 in multiplication, higher numbers increase more than lower ones do, increasing the distance between the highest and lowest score, which increases the variability  i.e. 2 x 5 = 10 – difference of 8 pts. 5 x 5 = 25 – difference of 20 pts.

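Measures of Variability
Characteristics of the Standard Deviation
• The size of the dataset affects the range and the standard deviation differently
    Adding scores can only widen the range or leave it unchanged; it can never shrink it
    The sample standard deviation does not shrink systematically as scores are added; it settles toward the population value
    More scores = more clustering in the middle. REMEMBER: more central scores are more likely to occur, so a few extreme scores influence s less in a larger dataset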

Smaller Dataset s = 3.96482 Larger Dataset s = 2.75609

Graphically Depicting Variability
Boxplot/Box-and-Whisker Plot
• Median
• Hinges/1st & 3rd Quartiles
• H-Spread
• Whisker
• Outlier


Graphically Depicting Variability
Percentile: the point at or below which a certain percent of scores fall
• e.g., if you are at the 75th percentile (%ile), then 75% of the scores are at or below your score
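The "at or below" definition can be sketched as a simple count. The helper name `percent_at_or_below` is hypothetical, and the scores are the 20-score dataset from the variance slides. Statistical packages may define percentiles slightly differently, for example by interpolating between scores.

```python
def percent_at_or_below(scores, value):
    """Percent of scores that are at or below `value`."""
    at_or_below = sum(1 for x in scores if x <= value)
    return 100 * at_or_below / len(scores)

scores = [19, 16, 20, 17, 20, 19, 7, 11, 10, 19,
          14, 11, 6, 11, 14, 19, 20, 17, 4, 11]

print(percent_at_or_below(scores, 19))  # 85.0
```

Here 17 of the 20 scores are at or below 19, so a score of 19 sits at the 85th percentile of this dataset.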

Graphically Depicting Variability
Quartile: similar to a percentile, but splits the distribution into fourths
• i.e., 1st quartile = 0-25% of the distribution, 2nd = 26-50%, 3rd = 51-75%, 4th = 76-100%

Graphically Depicting Variability
Interpreting a Boxplot/Box-and-Whisker Plot
• Off-center median = Non-symmetry
• Longer top whisker = Positively-skewed distribution
• Longer bottom whisker = Negatively-skewed distribution

Graphically Depicting Variability

Boxplot/Box-and-Whisker Plot
• Hinge/Quartile Location = (Median Location + 1)/2
• Data: 1 3 3 5 8 8 9 12 13 16 17 17 18 20 21 40
    Median Location = (16 + 1)/2 = 8.5
    Hinge Location = (8.5 + 1)/2 = 4.75 (4, since we drop the fraction)
    Hinges = 5 and 18 (the 4th scores from the bottom and from the top)

Graphically Depicting Variability  H-Spread = Upper Hinge – Lower Hinge H-Spread = 18-5 = 13  Whisker = H-Spread x 1.5 Since the whisker always ends at an actual data point, if we, say calculated the whisker to end at a value of 12, but the data only has a 10 and a 15, we would end the whisker at the 10. Whiskers = 12x1.5 = 19.5 Lower whisker from 5 to 1 Higher whisker from 18 to 21  Outliers Value of 40 extends beyond upper whisker

