Published by Tracey McGee. Modified over 9 years ago.
2
Chapter 6 The Standard Deviation as a Ruler and the Normal Model
3
The Standard Deviation as a Ruler A student got a 67/75 on the first exam and a 64/75 on the second exam. She was disappointed that she did not score as well on the second exam. To her surprise, the professor said she actually did better on the second exam, relative to the rest of the class.
4
The Standard Deviation as a Ruler How can this be? Both exams exhibit variation in the scores. However, that variation may be different from one exam to the next. The standard deviation provides a ruler for comparing the two exam scores.
5
Summarizing Exam Scores Exam 1 Mean: Standard Deviation: Exam 2 Mean: Standard Deviation:
6
Standardizing Variables z has no units (just a number) Puts variables on same scale Center (mean) at 0 Spread (standard deviation) of 1 Does not change shape of distribution
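A minimal sketch of these properties, assuming a small made-up list of exam scores (the values and the names `scores` and `z_scores` are illustrative, not from the slides). After subtracting the mean and dividing by the standard deviation, the standardized values should have mean 0 and standard deviation 1, and their order is unchanged.

```python
from statistics import mean, stdev

# Hypothetical exam scores out of 75 (illustrative only, not from the slides).
scores = [52, 58, 61, 64, 67, 70, 73]

center = mean(scores)
spread = stdev(scores)  # sample standard deviation

# z = (value - mean) / standard deviation; the result has no units
z_scores = [(y - center) / spread for y in scores]

print(abs(mean(z_scores)) < 1e-9)       # center is (numerically) 0
print(abs(stdev(z_scores) - 1) < 1e-9)  # spread is (numerically) 1
print(z_scores == sorted(z_scores))     # ordering of the values is preserved
```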
7
Standardized Exam Scores Exam 1 Score: 67 Exam 2 Score: 64
8
Standardized Exam Scores On exam 1, the 67 was 0.87 standard deviations better than the mean. On exam 2, the 64 was 1.17 standard deviations better than the mean.
9
Standardizing Variables z = # of standard deviations away from mean Negative z – number is below mean Positive z – number is above mean
10
Standardizing Variables Height of women Height of men Jill is 69 inches tall Jack is 72 inches tall Who is taller (comparatively)?
11
Standardizing Variables
12
Jill is 1.2 standard deviations above mean height for women Jack is 0.67 standard deviations above mean height for men Jill is taller (comparatively)
13
Normal Distributions Bell Curve Physical Characteristics Examples Heights Weights Lengths of Bird Wings Most important distribution in this class.
14
Normal Distributions Curve is always above the x-axis The tails never reach the x-axis They continue on to infinity and –infinity Area under entire curve = 1 or 100% Mean = Median This means the curve is symmetric
15
Normal Distributions Two parameters (not calculated, i.e. do not come from the data) Mean μ (pronounced “meeoo”) Locates center of curve Splits curve in half Shifts curve along x-axis
16
Normal Distributions Standard deviation σ (pronounced “sigma”) Controls spread of curve Smaller σ makes graph tall and skinny Larger σ makes graph flat and wide Ruler of distribution Write as N(μ, σ)
17
Standard Normal Distribution Puts all normal distributions on same scale z has center (mean) at 0 z has spread (standard deviation) of 1
18
Standard Normal Distribution z = # of standard deviations away from mean μ Negative z, number is below the mean Positive z, number is above the mean Written as N(0,1)
20
Standardizing Y ~ N(70,3). Standardize y = 68. z = (68 – 70)/3 = –0.67, so y = 68 is 0.67 standard deviations below the mean
21
Standardizing Notice the difference between Y and y Y denotes an entire distribution; all possible values of the distribution, the shape, the center, the spread y denotes a single value ONLY Can be generalized to all capital letters Z and z X and x
22
Standardizing N(70,3) → N(0,1)
23
Standardizing Y ~ N(70,3). Standardize y = 74. z = (74 – 70)/3 = 1.33, so y = 74 is 1.33 standard deviations above the mean
24
Standardizing N(70,3) → N(0,1)
25
Standardizing Y ~ N(65,2.5). Standardize y = 63 Y ~ N(65,2.5). Standardize y = 68
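The standardizing step used on these slides can be sketched in Python. The helper name `z_score` is an illustrative choice, not something the slides define; it applies z = (y - μ)/σ to the two N(70,3) examples and to the two N(65,2.5) exercises.

```python
def z_score(y, mu, sigma):
    """Return how many standard deviations y lies from the mean mu."""
    return (y - mu) / sigma

# Y ~ N(70, 3)
print(round(z_score(68, 70, 3), 2))    # -0.67
print(round(z_score(74, 70, 3), 2))    # 1.33

# Y ~ N(65, 2.5)
print(round(z_score(63, 65, 2.5), 2))  # -0.8
print(round(z_score(68, 65, 2.5), 2))  # 1.2
```

A negative result means the value is below the mean, a positive result means it is above.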
26
Standardizing The problem with standardizing is that it only tells me where the value is on the curve. Probabilities are areas under the curve. P(Y = 68) = 0 (ALWAYS). However, P(Y > 68) or P(Y < 68) > 0. How do we find these probabilities?
27
68-95-99.7 Rule 68% of observations are within 1 σ of the mean μ For N(0,1) this is between –1 and 1
28
68-95-99.7 Rule 95% of observations are within 2 σ of the mean μ For N(0,1) this is between –2 and 2
29
68-95-99.7 Rule 99.7% of observations are within 3 σ of the mean μ For N(0,1) this is between –3 and 3
30
68-95-99.7 Rule Given Y (the heights of men) is N(70,3), between what two values are 68% of the data? 68% of the data are between the values 67 and 73 Between what two numbers are 95% of the data? 95% of the data are between the values 64 and 76 Between what two numbers are 99.7% of the data? 99.7% of the data are between the values 61 and 79
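As a rough check of these intervals, the sketch below uses `statistics.NormalDist` from the Python standard library. The names `interval` and `coverage` are illustrative. It also prints the exact share of the model inside each interval, which is close to (but not exactly) 68%, 95%, and 99.7%.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)  # Y ~ N(70, 3), heights of men in inches

def interval(k, dist=heights):
    """Return (low, high) for the range within k standard deviations of the mean."""
    return dist.mean - k * dist.stdev, dist.mean + k * dist.stdev

def coverage(k, dist=heights):
    """Exact share of the model between mean - k*sd and mean + k*sd."""
    low, high = interval(k, dist)
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    print(k, interval(k), round(coverage(k), 4))
```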
31
68-95-99.7 Rule P(Y < 70)? = 0.5 P(Y < 67)? = (1 – 0.68)/2 = 0.16 P(Y < 64)? = (1 – 0.95)/2 = 0.025 P(Y < 61)? = (1 – 0.997)/2 = 0.0015 P(Y > 67)? = 0.5 + (0.68)/2 = 0.84 P(Y > 64)? = 0.5 + (0.95)/2 = 0.975 P(Y > 61)? = 0.5 + (0.997)/2 = 0.9985
32
68-95-99.7 Rule P(Y > 70)? = 0.5 P(Y > 73)? = (1 – 0.68)/2 = 0.16 P(Y > 76)? = (1 – 0.95)/2 = 0.025 P(Y > 79)? = (1 – 0.997)/2 = 0.0015 P(Y < 73)? = 0.5 + (0.68)/2 = 0.84 P(Y < 76)? = 0.5 + (0.95)/2 = 0.975 P(Y < 79)? = 0.5 + (0.997)/2 = 0.9985
33
68-95-99.7 Rule P(67 < Y < 70)? = 0.68/2 = 0.34 P(67 < Y < 76)? = (0.68/2) + (0.95/2) = 0.815 P(67 < Y < 79)? = (0.68)/2 + (0.997)/2 = 0.8385 P(64 < Y < 79)? = (0.95)/2 + (0.997)/2 = 0.9735
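The arithmetic on the last few slides relies only on symmetry: half of each percentage lies on either side of the mean. A small sketch of that bookkeeping follows; the names `WITHIN`, `rule_between`, and `rule_below` are illustrative and the function only handles whole numbers of standard deviations.

```python
# Share of a Normal model within k standard deviations of the mean (68-95-99.7 rule).
WITHIN = {0: 0.0, 1: 0.68, 2: 0.95, 3: 0.997}

def rule_between(k_below, k_above):
    """P(mu - k_below*sigma < Y < mu + k_above*sigma), using symmetry."""
    return WITHIN[k_below] / 2 + WITHIN[k_above] / 2

def rule_below(k):
    """P(Y < mu + k*sigma) for a whole number k from -3 to 3."""
    half = WITHIN[abs(k)] / 2
    return 0.5 + half if k >= 0 else 0.5 - half

# With Y ~ N(70, 3): 67 = mu - 1 sigma, 76 = mu + 2 sigma, 79 = mu + 3 sigma
print(round(rule_between(1, 0), 4))  # P(67 < Y < 70) -> 0.34
print(round(rule_between(1, 2), 4))  # P(67 < Y < 76) -> 0.815
print(round(rule_between(2, 3), 4))  # P(64 < Y < 79) -> 0.9735
print(round(rule_below(-1), 4))      # P(Y < 67)      -> 0.16
```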
34
Areas under curve Another way to find probabilities when values are not exactly 1, 2, or 3 away from µ is by using the Normal Values Table Gives amount of curve below a particular value of z z values range from –3.99 to 3.99 Row – ones and tenths place for z Column – hundredths place for z
35
Finding Values What percent of a standard Normal curve is found in the region Z < -1.50? P(Z < –1.50) Find row –1.5 Find column .00 Value = 0.0668
36
Finding Values P(Z < 1.98) Find row 1.9 Find column .08 Value = 0.9761
37
Finding values What percent of a std. Normal curve is found in the region Z > -1.65? P(Z > -1.65) Find row –1.6 Find column .05 Value from table = 0.0495 P(Z > -1.65) = 1 – 0.0495 = 0.9505
38
Finding values P(Z > 0.73) Find row 0.7 Find column .03 Value from table = 0.7673 P(Z > 0.73) = 1 – 0.7673 = 0.2327
39
Finding values What percent of a std. Normal curve is found in the region 0.5 < Z < 1.4? P(0.5 < Z < 1.4) Table value 1.4 = 0.9192 Table value 0.5 = 0.6915 P(0.5 < Z < 1.4) = 0.9192 – 0.6915 = 0.2277
40
Finding values P(–2.3 < Z < –0.05) Table value –0.05 = 0.4801 Table value –2.3 = 0.0107 P(–2.3 < Z < –0.05) = 0.4801 – 0.0107 = 0.4694
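The table lookups on these slides can be reproduced in software. This is a sketch using `statistics.NormalDist` from the Python standard library in place of the printed table; the helper names `below`, `above`, and `between` are illustrative. Because the table rounds each entry to four decimals before subtracting, the last digit of a difference may differ slightly from the slide value.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model, N(0, 1)

def below(z):
    """Area under the standard Normal curve to the left of z (the table value)."""
    return Z.cdf(z)

def above(z):
    """Area to the right of z: 1 minus the table value."""
    return 1 - Z.cdf(z)

def between(a, b):
    """Area between a and b: larger table value minus smaller table value."""
    return Z.cdf(b) - Z.cdf(a)

print(round(below(-1.50), 4))           # slide value: 0.0668
print(round(below(1.98), 4))            # slide value: 0.9761
print(round(above(-1.65), 4))           # slide value: 0.9505
print(round(above(0.73), 4))            # slide value: 0.2327
print(round(between(0.5, 1.4), 4))      # slide value: 0.2277
print(round(between(-2.3, -0.05), 4))   # slide value: 0.4694
```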
41
Finding values Above what z-value do the top 15% of all z-values lie, i.e. what value of z cuts off the highest 15%? P(Z > ?) = 0.15 P(Z < ?) = 0.85 z = 1.04
42
Finding values Between what two z-values do the middle 80% of the observations lie, i.e. what values cut off the middle 80%? Find P(Z < ?) = 0.10 Find P(Z < ?) = 0.90 Must look inside the table P(Z < -1.28) = 0.10 P(Z < 1.28) = 0.90
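Looking "inside the table" for a probability and reading off z is the inverse of the usual lookup. A hedged sketch with the standard library's `NormalDist.inv_cdf`, which takes the area to the left and returns the z-value (the variable names are illustrative):

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model, N(0, 1)

# Top 15%: the cutoff has 85% of the curve below it.
top_15_cutoff = Z.inv_cdf(0.85)

# Middle 80%: 10% lies below the lower cutoff and 90% below the upper one.
middle_80 = (Z.inv_cdf(0.10), Z.inv_cdf(0.90))

print(round(top_15_cutoff, 2))                  # 1.04
print(tuple(round(z, 2) for z in middle_80))    # (-1.28, 1.28)
```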
43
Solving Problems The height of men is known to be normally distributed with mean 70 and standard deviation 3. Y ~ N(70,3)
44
Solving Problems What percent of men are shorter than 66 inches? P(Y < 66) = P(Z < (66 – 70)/3) = P(Z < -1.33) = 0.0918
45
Solving Problems What percent of men are taller than 74 inches? P(Y > 74) = 1 – P(Y < 74) = 1 – P(Z < (74 – 70)/3) = 1 – P(Z < 1.33) = 1 – 0.9082 = 0.0918
46
Solving Problems What percent of men are between 68 and 71 inches tall? P(68 < Y < 71) = P(Y < 71) – P(Y < 68) = P(Z < (71 – 70)/3) – P(Z < (68 – 70)/3) = P(Z < 0.33) – P(Z < -0.67) = 0.6293 – 0.2514 = 0.3779
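The three height questions follow one pattern: standardize, then look up the area below z. The sketch below mirrors that by rounding z to two decimals before the lookup, as the table forces you to do; `p_below` is an illustrative helper name. Without the rounding step the answers shift slightly in the third or fourth decimal.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model, N(0, 1)

def p_below(y, mu, sigma):
    """P(Y < y) for Y ~ N(mu, sigma), rounding z to two decimals like the table."""
    z = round((y - mu) / sigma, 2)
    return Z.cdf(z)

# Heights of men: Y ~ N(70, 3)
shorter_than_66 = p_below(66, 70, 3)
taller_than_74 = 1 - p_below(74, 70, 3)
between_68_and_71 = p_below(71, 70, 3) - p_below(68, 70, 3)

print(round(shorter_than_66, 4))     # 0.0918
print(round(taller_than_74, 4))      # 0.0918
print(round(between_68_and_71, 4))   # 0.3779
```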
47
Solving Problems Scores on SAT verbal are known to be normally distributed with mean 500 and standard deviation 100. X ~ N(500,100)
48
Solving Problems Your score was 650 on the SAT verbal test. What percentage of people scored better? P(X > 650) = 1 – P(X < 650) = 1 – P(Z < (650 – 500)/100) = 1 – P(Z < 1.5) = 1 – 0.9332 = 0.0668
49
Solving Problems To solve a problem where you are looking for y-values, you need to rearrange the standardizing formula: z = (y – μ)/σ becomes y = μ + zσ
50
Solving Problems What would you have to score to be in the top 5% of people taking the SAT verbal? P(X > ?) = 0.05? P(X < ?) = 0.95?
51
Solving Problems P(Z < ?) = 0.95? z = 1.645 x is 1.645 standard deviations above mean x is 1.645(100) = 164.5 points above mean x = 500 + 164.5 = 664.5 SAT verbal score: at least 670
52
Solving Problems Between what two scores would the middle 50% of people taking the SAT verbal be? P(x₁ < X < x₂) = 0.50, where x₁ = ? and x₂ = ? P(-0.67 < Z < 0.67) = 0.50 x₁ = (-0.67)(100) + 500 = 433 x₂ = (0.67)(100) + 500 = 567
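Both SAT questions work backwards from a probability to a score. A sketch of the same steps, assuming the model X ~ N(500, 100) given on the slides: `value_from_z` is an illustrative name for the rearranged formula x = μ + zσ, and `inv_cdf` stands in for reading the table in reverse.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)  # X ~ N(500, 100)

def value_from_z(z, mu, sigma):
    """Rearranged standardizing formula: x = mu + z * sigma."""
    return mu + z * sigma

# Top 5%: by hand with z = 1.645, and directly from the model.
by_hand = value_from_z(1.645, 500, 100)
top_5_cutoff = sat.inv_cdf(0.95)

# Middle 50%: 25% below the lower score, 75% below the upper score.
x1, x2 = sat.inv_cdf(0.25), sat.inv_cdf(0.75)

print(round(by_hand, 1), round(top_5_cutoff, 1))  # 664.5 664.5
print(round(x1), round(x2))                       # 433 567
```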
53
Solving Problems Cereal boxes are labeled 16 oz. The boxes are filled by a machine. The amount the machine fills is normally distributed with mean 16.3 oz and standard deviation 0.2 oz.
54
Solving Problems What is the probability a box of cereal is underfilled? Underfilling means having less than 16 oz. P(Y < 16) = P(Z < (16 – 16.3)/0.2) = P(Z < -1.5) = 0.0668
55
Solving Problems A consumer group wants the company to change the mean amount of cereal the machine fills so that only 3% of boxes are underfilled. What do we need to change the mean to? P(Y < 16) = 0.03 What is z so that P(Z < ?) = 0.03? z = –1.88
56
Solving Problems 16 must be 1.88 standard deviations below mean. 16 must be 1.88(0.2) = 0.376 below mean Mean = 16 + 0.376 = 16.376
57
Solving Problems Company president feels that is too much cereal to put in each box. She wants to set the mean weight on the machine to 16.2, but only have 3% of the boxes underfilled. How can she do this? Change the standard deviation of the machine.
58
Solving Problems P(Y < 16) = 0.03 What is z so that P(Z < ?) = 0.03? z = –1.88 16 must be 1.88 standard deviations below 16.2, so 0.2 = 1.88σ and σ = 0.2/1.88 = 0.106
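The cereal example solves the same equation, 16 = mean + z·σ, for two different unknowns. A hedged sketch of both options using the standard library (variable names are illustrative; the software value of z is about -1.8808 rather than the table's -1.88, so results can differ in later decimals):

```python
from statistics import NormalDist

Z = NormalDist()   # standard Normal model, N(0, 1)
label = 16.0       # ounces printed on the box

# Current machine: Y ~ N(16.3, 0.2)
p_underfilled = NormalDist(mu=16.3, sigma=0.2).cdf(label)

# z-value with 3% of the curve below it
z_target = Z.inv_cdf(0.03)

# Option 1: keep sigma = 0.2 and solve label = mean + z * sigma for the mean
new_mean = label - z_target * 0.2

# Option 2: fix the mean at 16.2 and solve for sigma
new_sigma = (label - 16.2) / z_target

print(round(p_underfilled, 4), round(z_target, 2))  # 0.0668 -1.88
print(round(new_mean, 3), round(new_sigma, 3))      # 16.376 0.106
```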
59
Are Your Data Normal? The histogram should be mounded in the middle and symmetric. The data plotted on a normal probability (quantile) plot should follow a diagonal line.
60
Are Your Data Normal?
61
Are Your Data Normal?
62
Check Normal Assumption Let W be the price of 1 1/2 and 2 story houses sold in Ames between 9-2004 and 10-2005. We are told that the mean is $204,500 and that the standard deviation is $92,350. We decide to model the price of homes with a normal model without plotting any data. Thus we assume that W ~ N(204,500, 92,350). We want to find the percent of homes that sold for more than $350,000, or P(W > 350,000). We standardize, then use the table to find that P(W > 200,000) = 0.519, or 51.9%.
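The slide mentions two price thresholds, $350,000 and $200,000. As a sketch only, the code below evaluates both under the assumed model, taking $204,500 as the mean and $92,350 as the standard deviation; `p_above` is an illustrative helper name. Whether a Normal model is reasonable for these prices is exactly what the slide asks you to check before trusting such numbers.

```python
from statistics import NormalDist

# Assumed model (not verified against a plot of the data): prices in dollars
prices = NormalDist(mu=204_500, sigma=92_350)

def p_above(price):
    """P(W > price) under the assumed Normal model."""
    return 1 - prices.cdf(price)

print(round(p_above(200_000), 3))  # 0.519, the value quoted on the slide
print(round(p_above(350_000), 2))  # roughly 0.06 under the same model
```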