Chapter 3 Section 3 Measures of variation
Measures of Variation Example 3-18: Suppose we wish to test two experimental brands of outdoor paint to see how long each will last before fading. We have six cans of each paint to test. Let's find the mean for each brand. Brand A (time in months) Brand B (time in months)
Measures of Variation Brand A: mean = 210/6 = 35 months. Brand B: mean = 210/6 = 35 months.
Measures of Variation So even though the means are the same for both brands, the spread, or variation, is quite different. By comparing the ranges, you can see that Brand B is more consistent.
Range of Brand A: 60 - 10 = 50 months
Range of Brand B: 45 - 25 = 20 months
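The comparison of means and ranges can be sketched in Python. The twelve values below are assumed for illustration only; they were chosen so that each brand sums to 210 and the ranges match the ones quoted above.

```python
# Assumed test results in months (illustrative values, not measured data).
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

mean_a, mean_b = mean(brand_a), mean(brand_b)
range_a, range_b = value_range(brand_a), value_range(brand_b)

print("Brand A: mean", mean_a, "range", range_a)
print("Brand B: mean", mean_b, "range", range_b)
```

With equal means, the smaller range is what marks Brand B as the more consistent paint.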
Measures of Variation
1. Find the mean.
2. Subtract the mean from each data value.
3. Square each result.
4. Find the sum of the squares.
5. Divide the sum by N to get the variance.
6. Take the square root of the variance to get the standard deviation.
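The six steps can be followed one line at a time in Python. This is a minimal sketch; the data values are assumed for illustration, and dividing by N in step 5 treats them as a whole population.

```python
import math

data = [10, 60, 50, 30, 40, 20]  # assumed values, for illustration only

mean = sum(data) / len(data)              # step 1: find the mean
deviations = [x - mean for x in data]     # step 2: subtract the mean from each value
squares = [d ** 2 for d in deviations]    # step 3: square each result
sum_of_squares = sum(squares)             # step 4: sum the squares
variance = sum_of_squares / len(data)     # step 5: divide by N
std_dev = math.sqrt(variance)             # step 6: take the square root

print("variance:", variance)
print("standard deviation:", std_dev)
```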
Variance
The variance is the average of the squared differences between the observations and the mean.
For the population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
For the sample: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Standard Deviation The Standard Deviation of a data set is the square root of the variance. The standard deviation is measured in the same units as the data, making it easy to interpret.
Computing a standard deviation
For the population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
For the sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
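Python's standard `statistics` module provides both versions, which makes the difference between dividing by N (population) and by n - 1 (sample) easy to see. The data values below are assumed for illustration.

```python
import statistics

data = [10, 60, 50, 30, 40, 20]  # assumed values, for illustration only

pop_var = statistics.pvariance(data)   # population variance: divides by N
samp_var = statistics.variance(data)   # sample variance: divides by n - 1
pop_sd = statistics.pstdev(data)       # population standard deviation
samp_sd = statistics.stdev(data)       # sample standard deviation

print("population:", pop_var, pop_sd)
print("sample:    ", samp_var, samp_sd)
```

The sample values are always somewhat larger than the population values for the same data, because the same sum of squares is divided by a smaller number.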
Variance and Standard Deviation for Grouped Data
1. Make a table as shown and find the midpoint of each class.
2. Multiply the frequency by the midpoint.
3. Multiply the frequency by the square of the midpoint.
4. Find the sums of columns B, D, and E.
5. Substitute in the formula. (See next slide)
6. Take the square root to get the standard deviation.
Table columns: A Class | B Frequency (f) | C Midpoint (X_m) | D f · X_m | E f · X_m^2
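Formula
$s^2 = \frac{n \left( \sum f \cdot X_m^2 \right) - \left( \sum f \cdot X_m \right)^2}{n(n - 1)}$, where $n = \sum f$ (the sum of column B), $\sum f \cdot X_m$ is the sum of column D, and $\sum f \cdot X_m^2$ is the sum of column E.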
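The grouped-data procedure can be sketched as follows, assuming the usual shortcut formula s^2 = [n(Σ f·X_m^2) - (Σ f·X_m)^2] / [n(n - 1)]. The frequency table is hypothetical; only midpoints and frequencies are needed.

```python
import math

# Hypothetical frequency table: (class midpoint X_m, frequency f).
table = [(8, 1), (13, 2), (18, 3), (23, 5), (28, 4), (33, 3), (38, 2)]

n = sum(f for _, f in table)                     # sum of column B
sum_fx = sum(f * xm for xm, f in table)          # sum of column D
sum_fx2 = sum(f * xm ** 2 for xm, f in table)    # sum of column E

# Sample variance for grouped data, then its square root.
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
std_dev = math.sqrt(variance)

print("n =", n, "sum f*Xm =", sum_fx, "sum f*Xm^2 =", sum_fx2)
print("variance:", variance, "standard deviation:", std_dev)
```

Because every value in a class is replaced by the class midpoint, the result is an approximation of the variance of the raw data.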
Coefficient of Variation Divide the standard deviation by the mean and multiply by 100.
For the sample: $CVar = \frac{s}{\bar{X}} \cdot 100$
For the population: $CVar = \frac{\sigma}{\mu} \cdot 100$
Measures of Variation The coefficient of variation, denoted CVar, is the standard deviation divided by the mean. The result is expressed as a percentage. The coefficient of variation is used to compare the standard deviations of variables that are measured in different units.
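A short sketch of the comparison, using hypothetical figures: weekly car sales (in units) and sales commissions (in dollars). The two standard deviations cannot be compared directly, but the two percentages can.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

# Hypothetical summary statistics in two different units.
sales_cvar = coefficient_of_variation(std_dev=5, mean=87)            # cars
commission_cvar = coefficient_of_variation(std_dev=773, mean=5225)   # dollars

print("sales CVar:      ", round(sales_cvar, 1), "%")
print("commission CVar: ", round(commission_cvar, 1), "%")
```

The variable with the larger coefficient of variation is the more variable one relative to its own mean.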
Chebyshev’s Theorem
The proportion of values from a data set that fall within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where k is a number greater than 1. Substituting k = 2 gives 1 - 1/4 = 3/4, so at least three-fourths, or 75%, of the data values fall within 2 standard deviations of the mean. Substituting k = 3 gives 1 - 1/9 = 8/9, so at least eight-ninths, or 88.89%, of the data values fall within 3 standard deviations of the mean.
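The bound 1 - 1/k^2 is simple to evaluate for any k greater than 1. The function name below is hypothetical.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, "standard deviations: at least", round(chebyshev_min_fraction(k) * 100, 2), "%")
```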
Chebyshev’s Theorem The theorem can be applied to any distribution regardless of its shape. How to use Chebyshev’s theorem to find out information.
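Chebyshev’s Theorem Example 1. The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000. Find the price range in which at least 75% of the houses will sell. We do this by adding and subtracting 2 times the standard deviation: $50,000 - 2($10,000) = $30,000 and $50,000 + 2($10,000) = $70,000. So at least 75% of the houses will sell for between $30,000 and $70,000.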
Chebyshev’s Theorem
A survey of local companies found that the mean amount of travel allowance for executives was $0.25 per mile. The standard deviation was $0.02. Using Chebyshev’s theorem, find the minimum percentage of the data that will fall between $0.20 and $0.30.
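One way to work this problem is to find how many standard deviations the interval endpoints lie from the mean, then substitute that k into 1 - 1/k^2. A minimal sketch, with variable names chosen for illustration:

```python
mean = 0.25      # dollars per mile
std_dev = 0.02
low, high = 0.20, 0.30

# Both endpoints are the same distance from the mean, so either gives k.
k = (high - mean) / std_dev

# Chebyshev's bound for that k.
min_fraction = 1 - 1 / k ** 2

print("k =", round(k, 2))
print("at least", round(min_fraction * 100, 1), "% of the data")
```

Here k is 2.5, so at least 1 - 1/6.25 = 0.84, or 84%, of the data values fall between $0.20 and $0.30.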
The Empirical (Normal) Rule Chebyshev’s theorem applies to any distribution regardless of shape. However, when a distribution is bell-shaped (or what is called normal), the following statements, which make up the empirical rule, are true.
1. Approximately 68% of the data values fall within 1 standard deviation of the mean.
2. Approximately 95% of the data values fall within 2 standard deviations of the mean.
3. Approximately 99.7% of the data values fall within 3 standard deviations of the mean.
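The three percentages can be checked against an exact normal distribution with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch for a theoretical normal curve; real bell-shaped data will only match approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

fractions = {k: fraction_within(k) for k in (1, 2, 3)}

for k, fraction in fractions.items():
    print(k, "standard deviation(s):", round(fraction * 100, 1), "%")
```

Compare these with Chebyshev's minimums of 75% and 88.89% for k = 2 and k = 3: knowing the shape of the distribution gives much tighter statements.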
Chapter 3 Section 4 Measures of Position
Standard Scores “You can’t compare apples and oranges.” But with statistics it can be done to some extent. Example: a score on a music test and a score on an English exam cannot be compared directly, because the tests differ in the number of questions, the value of each question, and so on.
Z Score or Standard Score The z score uses the mean and the standard deviation. Definition: A z score or standard score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation. The symbol for the standard score is z. The z score represents the number of standard deviations that a value lies above or below the mean.
Z Score or Standard Score
For samples: $z = \frac{X - \bar{X}}{s}$
For populations: $z = \frac{X - \mu}{\sigma}$
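A short sketch of comparing two scores from different tests by converting each to a z score. The scores, means, and standard deviations below are hypothetical.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical results for one student on two different tests.
music_z = z_score(value=65, mean=50, std_dev=10)
english_z = z_score(value=30, mean=25, std_dev=5)

print("music z score:  ", music_z)
print("English z score:", english_z)
```

The higher z score marks the better relative position, so in this example the student did relatively better on the music test even though the two raw scores are on different scales.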
Percentiles Percentiles divide the data set into 100 equal parts. Percentiles are used to compare individuals’ test scores with national test scores. A percentile is not the same as the percent grade you receive on a test.
Percentiles Example: Systolic Blood Pressure. The frequency distribution for the systolic blood pressure readings (in millimeters of mercury, mm Hg) of 200 randomly selected college students is shown here. Construct a percentile graph.
Table columns: A Class boundaries | B Frequency | C Cumulative frequency | D Cumulative percent
Step 1 - Find the cumulative frequencies and place them in column C.
Step 2 - Find the cumulative percentages and place them in column D: cumulative % = (cumulative frequency / n) × 100.
Step 3 - Graph the data, using class boundaries for the x axis and the percentages for the y axis.
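The table columns for a percentile graph can be built in a few lines. The class boundaries and frequencies below are assumed for illustration (they total 200 readings); the cumulative percent for each class is taken as 100 times the cumulative frequency divided by n.

```python
from itertools import accumulate

# Column A (class boundaries) and column B (frequencies): assumed values.
boundaries = [(89.5, 104.5), (104.5, 119.5), (119.5, 134.5),
              (134.5, 149.5), (149.5, 164.5), (164.5, 179.5)]
frequencies = [24, 62, 72, 26, 12, 4]

n = sum(frequencies)
cumulative = list(accumulate(frequencies))                 # column C
cumulative_percent = [100 * cf / n for cf in cumulative]   # column D

# Points for the graph: upper class boundary (x) against cumulative percent (y).
points = [(upper, pct) for (_, upper), pct in zip(boundaries, cumulative_percent)]

for (lower, upper), f, cf, pct in zip(boundaries, frequencies, cumulative, cumulative_percent):
    print(lower, "-", upper, f, cf, pct)
```

Plotting `points`, starting from 0% at the first lower boundary, gives the percentile graph.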
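Percentile Formula
The percentile corresponding to a given value X is computed as $\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100$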
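A minimal sketch of the percentile rank calculation, assuming the common textbook formula (number of values below X + 0.5) / (total number of values) × 100. The scores are hypothetical.

```python
def percentile_rank(scores, x):
    """Percentile rank of x, counting only the values strictly below x."""
    below = sum(1 for score in scores if score < x)
    return 100 * (below + 0.5) / len(scores)

scores = [4, 7, 9, 11, 13, 14, 16, 17, 19, 20]  # hypothetical test scores
rank = percentile_rank(scores, 13)

print("percentile rank of 13:", rank)
```

Four scores lie below 13, so the rank is (4 + 0.5) / 10 × 100 = 45: a score of 13 is at the 45th percentile of this group.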
Percentile Example: Test Scores. A teacher gives a 20-point test to 10 students. The scores are shown here. Find the percentile rank of a score. Scores: ..., 15, 12, 6, 8, 2, 3, 5, 20, 10
Find the value corresponding to a given percentile. How do we do this?
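One common textbook procedure, assumed here rather than taken from the slides, is: arrange the data in order, compute c = n · p / 100, and then either round c up to the next whole number and take the value in that position, or, if c is already a whole number, average the values in positions c and c + 1. The function name and data are hypothetical.

```python
import math

def percentile_value(data, p):
    """Value at the p-th percentile (0 < p < 100), using the c = n*p/100 rule."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100, exclusive")
    ordered = sorted(data)
    c = len(ordered) * p / 100
    if c != int(c):
        position = math.ceil(c)          # round up to the next whole number
        return ordered[position - 1]     # positions count from 1, indexes from 0
    c = int(c)
    return (ordered[c - 1] + ordered[c]) / 2   # average positions c and c + 1

scores = [4, 7, 9, 11, 13, 14, 16, 17, 19, 20]  # hypothetical test scores
p25 = percentile_value(scores, 25)
p60 = percentile_value(scores, 60)

print("25th percentile:", p25)
print("60th percentile:", p60)
```

Other textbooks and software packages use slightly different interpolation rules, so small differences in the answer are normal.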
Quartiles and Deciles
Outliers – An outlier is an extremely high or low value when compared with the rest of the data values.
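The definition above does not say how extreme a value must be. A widely used convention, assumed here, flags any value more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below takes the quartiles as the medians of the lower and upper halves of the ordered data; the data are hypothetical.

```python
import statistics

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ values)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]   # the middle value is left out when n is odd
    return statistics.median(lower), statistics.median(upper)

def find_outliers(data):
    """Values beyond 1.5 * IQR from the quartiles (an assumed convention)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

data = [5, 6, 12, 13, 15, 18, 22, 50]  # hypothetical values
print("quartiles:", quartiles(data))
print("outliers: ", find_outliers(data))
```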
Chapter 3 Section 4 Exploratory Data Analysis
Exploratory data analysis is used to examine data to find out what information can be discovered about the data such as the center and the spread.
Information obtained from a boxplot
1. a) If the median is near the center of the box, the distribution is approximately symmetric.
   b) If the median falls to the left of the center of the box, the distribution is positively skewed.
   c) If the median falls to the right of the center of the box, the distribution is negatively skewed.
2. a) If the lines are about the same length, the distribution is approximately symmetric.
   b) If the right line is longer than the left line, the distribution is positively skewed.
   c) If the left line is longer than the right line, the distribution is negatively skewed.
Exploratory Data Analysis Resistant statistics are statistics that are less affected by outliers. Examples are the median and the interquartile range.
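A small demonstration of resistance: add one extreme value to a data set and see how far the mean and the median move. The data are hypothetical.

```python
import statistics

data = [5, 6, 12, 13, 15, 18, 22]   # hypothetical values
with_outlier = data + [500]          # the same data plus one extreme value

mean_shift = statistics.mean(with_outlier) - statistics.mean(data)
median_shift = statistics.median(with_outlier) - statistics.median(data)

print("mean moves by:  ", mean_shift)
print("median moves by:", median_shift)
```

The mean jumps by more than 60 while the median moves by only 1, which is why the median is the safer measure of center when outliers are present.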