Published by Carmel Jefferson. Modified over 8 years ago.
1
Chapter 3 Section 3 Measures of variation
2
Measures of Variation
Example 3-18: Suppose we wish to test two experimental brands of outdoor paint to see how long each will last before fading. We have six cans of each brand to test. Let's find the mean for each brand.
Brand A (time in months)   Brand B (time in months)
10                         35
60                         45
50                         30
30                         35
40                         40
20                         25
3
Measures of Variation
Brand A: (10 + 60 + 50 + 30 + 40 + 20)/6 = 210/6 = 35 months
Brand B: (35 + 45 + 30 + 35 + 40 + 25)/6 = 210/6 = 35 months
4
Measures of Variation So even though the means are the same for both brands, the spread, or variation, is quite different. By comparing the ranges of each you can see that Brand B is more consistent.
5
Measures of Variation
Range of Brand A: 60 − 10 = 50 months
Range of Brand B: 45 − 25 = 20 months
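The mean and range comparison for the two paint brands can be checked with a short Python sketch. The helper names `mean` and `value_range` are illustrative choices, not part of the slides.

```python
# Fading times in months, copied from the paint example
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(f"{name}: mean = {mean(data)} months, range = {value_range(data)} months")
```

Both brands have the same mean, but the smaller range for Brand B reflects its more consistent fading times.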
6
Measures of Variation
1. Find the mean.
2. Subtract the mean from each data value.
3. Square each result.
4. Find the sum of the squares.
5. Divide the sum by N to get the variance.
6. Take the square root of the variance to get the standard deviation.
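The six steps can be sketched in Python as follows, applied to the Brand A paint data from the earlier example. Dividing by N makes this the population variance; the function names are illustrative.

```python
import math

def population_variance(values):
    n = len(values)
    mean = sum(values) / n                     # step 1: find the mean
    deviations = [x - mean for x in values]    # step 2: subtract the mean
    squares = [d ** 2 for d in deviations]     # step 3: square each result
    total = sum(squares)                       # step 4: sum the squares
    return total / n                           # step 5: divide by N

def population_std_dev(values):
    return math.sqrt(population_variance(values))  # step 6: square root

brand_a = [10, 60, 50, 30, 40, 20]
print(round(population_variance(brand_a), 1))  # 291.7
print(round(population_std_dev(brand_a), 1))   # 17.1
```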
7
Measures of Variation
8
Chapter 3 Section 3 Measures of variance
9
Variance
10
The variance is the average of the squared differences between the observations and the mean value.
For the population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
For the sample: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
11
Standard Deviation The Standard Deviation of a data set is the square root of the variance. The standard deviation is measured in the same units as the data, making it easy to interpret.
12
Computing a standard deviation
For the population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
For the sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
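As a rough illustration of the population versus sample distinction, Python's standard `statistics` module offers both forms: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by n − 1. The Brand A paint data is reused here only as sample input.

```python
import statistics

brand_a = [10, 60, 50, 30, 40, 20]

pop_var = statistics.pvariance(brand_a)    # divides by N
sample_var = statistics.variance(brand_a)  # divides by n - 1
pop_sd = statistics.pstdev(brand_a)
sample_sd = statistics.stdev(brand_a)

print(round(pop_var, 1), round(pop_sd, 1))        # 291.7 17.1
print(round(sample_var, 1), round(sample_sd, 1))  # 350 18.7
```

The sample versions are always somewhat larger, because the same sum of squares is divided by a smaller number.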
14
Variance and Standard Deviation for Grouped Data
1. Make a table as shown and find the midpoint of each class.
2. Multiply the frequency by the midpoint.
3. Multiply the frequency by the square of the midpoint.
4. Find the sums of columns B, D, and E.
5. Substitute in the formula. (See next slide.)
6. Take the square root to get the standard deviation.
Table columns: A Class, B Frequency (f), C Midpoint (Xm), D f · Xm, E f · Xm^2
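The grouped-data procedure might look like this in Python. Two things here are assumptions: the class table is hypothetical (the slides show only the column headings), and the formula used is the common shortcut form of the sample variance, (n·Σf·Xm² − (Σf·Xm)²) / (n(n − 1)), which matches the three sums the steps ask for.

```python
import math

# Hypothetical grouped data, not from the slides:
# (lower boundary, upper boundary, frequency)
classes = [(5.5, 10.5, 1), (10.5, 15.5, 2), (15.5, 20.5, 3), (20.5, 25.5, 5),
           (25.5, 30.5, 4), (30.5, 35.5, 3), (35.5, 40.5, 2)]

n = 0        # sum of column B (frequencies)
sum_fx = 0   # sum of column D (f * Xm)
sum_fx2 = 0  # sum of column E (f * Xm^2)
for lower, upper, f in classes:
    midpoint = (lower + upper) / 2   # column C
    n += f
    sum_fx += f * midpoint
    sum_fx2 += f * midpoint ** 2

# Assumed shortcut formula for the sample variance of grouped data
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
std_dev = math.sqrt(variance)
print(n, sum_fx, sum_fx2)
print(round(variance, 2), round(std_dev, 2))
```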
15
Formula
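The formula itself is not preserved in this transcript. A form consistent with the preceding steps (sums of the frequency, f · Xm, and f · Xm^2 columns) is the usual shortcut formula for the sample variance of grouped data, shown here as an assumption rather than as the slide's exact content:

```latex
s^2 = \frac{n\left(\sum f \cdot X_m^2\right) - \left(\sum f \cdot X_m\right)^2}{n(n - 1)},
\qquad n = \sum f,
\qquad s = \sqrt{s^2}
```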
16
Coefficient of Variation
Divide the standard deviation by the mean and multiply by 100.
Computing the coefficient of variation:
For the sample: $CVar = \frac{s}{\bar{X}} \cdot 100\%$
For the population: $CVar = \frac{\sigma}{\mu} \cdot 100\%$
18
Measures of Variance
The coefficient of variation, denoted CVar, is the standard deviation divided by the mean. The result is expressed as a percentage. The coefficient of variation is used when you want to compare the standard deviations of two variables measured in different units or on different scales.
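A minimal sketch of the computation, assuming the sample standard deviation is wanted. The function name is illustrative, and the paint data from the earlier example is reused only as input.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

cv_a = coefficient_of_variation(brand_a)
cv_b = coefficient_of_variation(brand_b)
print(f"Brand A: {cv_a:.1f}%  Brand B: {cv_b:.1f}%")
```

Because the result is a unit-free percentage, the same function could compare, say, a variable in dollars with one in counts.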
20
Measures of Variance
21
Chebyshev’s Theorem
22
The theorem states that the proportion of values in a data set that fall within k standard deviations of the mean is at least $1 - 1/k^2$, where k is a number greater than 1. Substituting k = 2 in this expression shows that at least three-fourths, or 75%, of the data values will fall within 2 standard deviations of the mean. Furthermore, with k = 3, at least eight-ninths, or 88.89%, of the data values will fall within 3 standard deviations of the mean.
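The bound 1 − 1/k² is easy to evaluate directly. The function name below is an illustrative choice.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_min_fraction(k):.2%}")
```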
23
Chebyshev’s Theorem The theorem can be applied to any distribution regardless of its shape. How to use Chebyshev’s theorem to find out information.
24
Chebyshev’s Theorem
Example 1. The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000. Find the price range for which at least 75% of the houses will sell.
– We do this by adding and subtracting 2 times the standard deviation from the mean.
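The arithmetic for the house price example can be written out as a short sketch. Here k = 2 because 1 − 1/2² = 75%.

```python
mean_price = 50_000
std_dev = 10_000
k = 2  # 1 - 1/k**2 = 0.75, so k = 2 covers at least 75% of the data

low = mean_price - k * std_dev
high = mean_price + k * std_dev
print(f"At least 75% of houses sell between ${low:,} and ${high:,}")
```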
25
Chebyshev’s Theorem
27
Chebyshev’s Theorem A survey of local companies found that the mean amount of travel allowance for executives was $0.25 per mile. The standard deviation was $0.02. Using Chebyshev’s theorem, find the minimum percentage of the data that will fall between $0.20 and $0.30.
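One way to work the travel allowance problem: first find how many standard deviations the limits lie from the mean, then substitute that k into 1 − 1/k². The variable names below are illustrative.

```python
mean = 0.25
std_dev = 0.02
lower, upper = 0.20, 0.30

# Both limits are 0.05 from the mean, so k = 0.05 / 0.02 = 2.5
k = min(mean - lower, upper - mean) / std_dev
min_percent = (1 - 1 / k ** 2) * 100
print(f"k = {k:.1f}, at least {min_percent:.0f}% of the data")
```

So at least 84% of the allowances fall between $0.20 and $0.30.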
28
The Empirical (Normal) Rule
Chebyshev’s theorem applies to any distribution regardless of shape. However, when a distribution is bell-shaped (or what is called normal), the following statements, which make up the empirical rule, are true.
1. Approximately 68% of the data values fall within 1 standard deviation of the mean.
2. Approximately 95% of the data values fall within 2 standard deviations of the mean.
3. Approximately 99.7% of the data values fall within 3 standard deviations of the mean.
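The three percentages can be checked against an exact normal distribution using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is only a numerical check of the rule, not part of the slides.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Compare these with Chebyshev's guarantees of at least 75% and 88.89% for k = 2 and k = 3: knowing the shape of the distribution gives much tighter statements.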
29
Chebyshev’s Theorem
30
Chapter 3 Section 4 Measures of Position
31
Standard Scores
“You can’t compare apples and oranges.” But with statistics it can be done to some extent.
Example: comparing a score on a music test with a score on an English exam. The two tests may differ in:
– Number of questions
– Value of each question
– And so on
32
Z Score or Standard Score
The z score uses the mean and the standard deviation.
Definition – A z score or standard score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation. The symbol for the standard score is z. For samples, $z = \frac{X - \bar{X}}{s}$; for populations, $z = \frac{X - \mu}{\sigma}$.
The z score represents the number of standard deviations that a value lies above or below the mean.
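A small sketch of the z score calculation. The test scores, means, and standard deviations below are hypothetical values chosen for illustration; the slides do not give numbers for the music and English example.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical results for one student on two different tests
music_z = z_score(value=48, mean=42, std_dev=4)
english_z = z_score(value=75, mean=70, std_dev=10)

print(f"music z = {music_z}, English z = {english_z}")
```

With these made-up numbers the student's relative position is higher on the music test, even though the raw English score is larger.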
33
Z Score or Standard Score
34
Examples
36
Percentiles
Percentiles divide the data set into 100 equal parts. Percentiles are used to compare individuals’ test scores with national test scores. A percentile is not the same as the percent grade you receive on a test.
37
Percentiles
38
Percentiles Example: Systolic Blood Pressure
The frequency distribution for the systolic blood pressure readings (in millimeters of mercury, mm Hg) of 200 randomly selected college students is shown here. Construct a percentile graph.
A Class boundaries   B Frequency   C Cumulative frequency   D Cumulative percent
89.5-104.5           24
104.5-119.5          62
119.5-134.5          72
134.5-149.5          26
149.5-164.5          12
164.5-179.5          4
Total                200
39
Percentiles Example
Steps: Step 1 - Find the cumulative frequencies and place them in column C.
A Class boundaries   B Frequency   C Cumulative frequency   D Cumulative percent
89.5-104.5           24            24
104.5-119.5          62            86
119.5-134.5          72            158
134.5-149.5          26            184
149.5-164.5          12            196
164.5-179.5          4             200
40
Percentiles Example
Steps: Step 2 - Find the cumulative percentages and place them in column D. Each cumulative percentage is the cumulative frequency divided by the total number of values (200), multiplied by 100.
41
Percentiles Example
Steps: Step 3 - Graph the data, using class boundaries for the x axis and the percentages for the y axis.
A Class boundaries   B Frequency   C Cumulative frequency   D Cumulative percent
89.5-104.5           24            24                       12
104.5-119.5          62            86                       43
119.5-134.5          72            158                      79
134.5-149.5          26            184                      92
149.5-164.5          12            196                      98
164.5-179.5          4             200                      100
Total                200
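Columns C and D of the blood pressure table can be reproduced with a short Python sketch. It uses frequencies of 62 and 72 for the second and third classes, which is what the cumulative column (86, 158) and the total of 200 imply.

```python
from itertools import accumulate

boundaries = ["89.5-104.5", "104.5-119.5", "119.5-134.5",
              "134.5-149.5", "149.5-164.5", "164.5-179.5"]
frequencies = [24, 62, 72, 26, 12, 4]  # column B

total = sum(frequencies)
cumulative = list(accumulate(frequencies))               # column C
cumulative_pct = [100 * c / total for c in cumulative]   # column D

for row in zip(boundaries, frequencies, cumulative, cumulative_pct):
    print("{:<12} {:>3} {:>4} {:>5.0f}%".format(*row))
```

Plotting `cumulative_pct` against the upper class boundaries gives the percentile graph described in step 3.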
42
Percentiles Example
43
Percentile Formula
44
Percentile Example: Test Scores
A teacher gives a 20-point test to 10 students. The scores are shown here. Find the percentile rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
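The percentile formula is not preserved in this transcript. The sketch below assumes a common textbook definition: percentile rank = (number of values below X + 0.5) / (total number of values) × 100. Treat both the formula and the function name as assumptions.

```python
def percentile_rank(scores, x):
    """Percentile rank of x under the assumed (below + 0.5) / n * 100 formula."""
    below = sum(1 for s in scores if s < x)
    return (below + 0.5) * 100 / len(scores)

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(percentile_rank(scores, 12))  # 65.0
```

Under this definition, a student who scored 12 did better than about 65% of the class.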
45
Percentile Example
46
Find the value corresponding to a given percentile. How do we do this?
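The slide's own procedure is not preserved in this transcript. One common textbook convention, assumed here, is: sort the data, compute c = n · p / 100, and if c is not a whole number round it up and take the c-th value; if c is a whole number, average the c-th and (c + 1)-th values. Other conventions exist and give slightly different answers.

```python
import math

def value_at_percentile(values, p):
    """Value at the p-th percentile (0 < p < 100), using the assumed convention."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    c = len(data) * p / 100
    if c != int(c):
        # Not a whole number: round up and take that position (1-based)
        return data[math.ceil(c) - 1]
    c = int(c)
    # Whole number: average the c-th and (c + 1)-th values (1-based)
    return (data[c - 1] + data[c]) / 2

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(value_at_percentile(scores, 25))  # 5
print(value_at_percentile(scores, 60))  # 11.0
```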
47
Percentile Example
50
Quartiles and Deciles
56
Outliers – An outlier is an extremely high or low value when compared with the rest of the data values.
57
Chapter 3 Section 4 Exploratory Data Analysis
58
Exploratory data analysis is used to examine data to find out what information can be discovered about the data such as the center and the spread.
59
Exploratory Data Analysis
62
Information obtained from a boxplot
1. a) If the median is near the center of the box, the distribution is approximately symmetric.
   b) If the median falls to the left of the center of the box, the distribution is positively skewed.
   c) If the median falls to the right of the center of the box, the distribution is negatively skewed.
2. a) If the lines are about the same length, the distribution is approximately symmetric.
   b) If the right line is longer than the left line, the distribution is positively skewed.
   c) If the left line is longer than the right line, the distribution is negatively skewed.
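These two reading rules can be expressed as a rough Python sketch. The tolerance `tol` that decides what counts as "near the center" or "about the same length" is an arbitrary assumption, as are the function names and the five-number summary used for the demonstration.

```python
def skew_from_median(q1, median, q3, tol=0.1):
    """Rule 1: position of the median inside the box."""
    center = (q1 + q3) / 2
    if abs(median - center) <= tol * (q3 - q1):
        return "approximately symmetric"
    return "positively skewed" if median < center else "negatively skewed"

def skew_from_lines(minimum, q1, q3, maximum, tol=0.1):
    """Rule 2: relative length of the left and right lines (whiskers)."""
    left, right = q1 - minimum, maximum - q3
    if abs(right - left) <= tol * max(left, right):
        return "approximately symmetric"
    return "positively skewed" if right > left else "negatively skewed"

# Hypothetical five-number summary: min, Q1, median, Q3, max
print(skew_from_median(5, 6, 12))     # positively skewed
print(skew_from_lines(2, 5, 12, 30))  # positively skewed
```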
63
Exploratory Data Analysis
Resistant statistics – these statistics are less affected by outliers. Examples are the median and the interquartile range.
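To illustrate resistance, the sketch below compares how the mean and the median react when one extreme value is added. The data values are hypothetical.

```python
import statistics

data = [10, 20, 30, 40, 50]   # hypothetical values
with_outlier = data + [500]   # same data plus one extreme value

mean_shift = statistics.mean(with_outlier) - statistics.mean(data)
median_shift = statistics.median(with_outlier) - statistics.median(data)

print(f"mean moves by {mean_shift:.1f}")      # mean moves by 78.3
print(f"median moves by {median_shift:.1f}")  # median moves by 5.0
```

The single outlier drags the mean far from the bulk of the data, while the median barely changes.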