Variance and Standard Deviation AP Statistics
Introduction The five-number summary is a useful numerical description of a distribution, but the combination of the mean and standard deviation is the one most commonly used. The standard deviation measures roughly the typical distance of the data values from their mean. It is the square root of another measure of spread called the variance.
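In symbols, the relationship between the two measures can be written as below. The notation (x̄ for the sample mean, s² for the sample variance, s for the sample standard deviation) is the usual textbook convention and is assumed here, since the slide's own formulas are not shown.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s = \sqrt{s^2}
\]
```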
Example
Example Continued - Variance
Finally…Standard Deviation
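The by-hand steps (mean, deviations, squared deviations, variance, square root) can be sketched in a few lines of Python. The five data values are hypothetical, chosen so the arithmetic comes out evenly; they are not the data from the original example.

```python
import math

# Hypothetical sample (not the data from the original slide)
data = [2, 4, 6, 6, 7]

n = len(data)
mean = sum(data) / n                     # step 1: the mean
deviations = [x - mean for x in data]    # step 2: distance of each value from the mean
squared = [d ** 2 for d in deviations]   # step 3: square each deviation
variance = sum(squared) / (n - 1)        # step 4: add them up and divide by n - 1
std_dev = math.sqrt(variance)            # step 5: take the square root

print(mean, variance, std_dev)  # 5.0 4.0 2.0
```

The deviations always sum to zero, which is why they are squared before being averaged.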
Notes on Variance The variance is large if the observations are widely spread about the mean. It is small if the observations are all close to the mean. Variance is measured in the square of the original units (for example, cm² when the data are in cm). Standard deviation is in the same units as the original observations. *Read problems carefully to decide which measure to use!
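Why use n – 1 instead of n? “n – 1” is the number of degrees of freedom and is used when working with a sample. Because deviations are measured from the sample mean rather than the true population mean, dividing by n would tend to underestimate the population variance; dividing by n – 1 corrects for this. “n” is used when doing calculations for an entire population, where sampling error doesn’t exist.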
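As a rough illustration of the two divisors, Python's standard `statistics` module provides both versions: `variance` divides by n – 1 and `pvariance` divides by n. The data values are hypothetical.

```python
import statistics

data = [2, 4, 6, 6, 7]  # hypothetical values, for illustration only

# Treat the data as a sample: sum of squared deviations divided by n - 1
sample_var = statistics.variance(data)

# Treat the data as the whole population: same sum divided by n
population_var = statistics.pvariance(data)

print(sample_var, population_var)
```

The sum of squared deviations here is 16, so the sample version gives 16/4 and the population version gives 16/5. The sample variance is always the larger of the two when there is any spread.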
Limitations Standard deviation should only be used with the mean, as it measures spread about the mean. s = 0 when there is no spread, which occurs when all the observations have the same value. Otherwise, s>0. Standard deviation gets larger as the spread gets larger. Standard deviation, like the mean, is strongly influenced by extreme observations. Thus it is a non-resistant measure.
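A small sketch of two of these properties, using hypothetical numbers: s is 0 when every observation is the same, and a single extreme observation inflates s dramatically.

```python
import statistics

values = [10, 11, 12, 13, 14]   # hypothetical data with no extreme values
with_outlier = values + [60]    # same data plus one hypothetical extreme observation

s_before = statistics.stdev(values)         # sample standard deviation (divides by n - 1)
s_after = statistics.stdev(with_outlier)    # recomputed after adding the extreme value
s_constant = statistics.stdev([5, 5, 5, 5]) # all observations equal, so no spread

print(s_before, s_after, s_constant)
```

With these values, one added observation makes s more than ten times larger, which is what "non-resistant" means in practice.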
More Limitations The five-number summary is better when describing skewed distributions! In a skewed distribution the spread to the left of the mean differs from the spread to the right, and a single number such as s cannot describe both.
When to Use The mean and standard deviation are better for symmetrical distributions. Always plot your data. A picture is a better descriptor for data than a numerical summary.
Homework Do the Variance and Standard Deviation Worksheet. (Calculate #1 by hand; answer the rest using the 1-Var Stats function.) Chapter 1 Review.