Published by Philomena Booker. Modified over 9 years ago.
1
Basic Quantitative Methods in the Social Sciences (AKA Intro Stats) 02-250-01 Lecture 4
2
A Quick Review
The entire area under the normal curve can be considered to be a proportion of 1.00.
A proportion of .50 lies to the left of the mean, and a proportion of .50 lies to the right of the mean.
3
Area Under the Normal Distribution and Z-Scores
[Figure: normal distribution with z-score points of reference]
4
Properties of Area Under the Normal Distribution
Since the normal curve is bell-shaped, the proportion of scores between successive whole z-scores is not the same for every interval.
For example, .3413 of the scores lie between the z-scores of 0 (the mean) and 1 (or between 0 and -1), while only .1359 of the scores lie between the z-scores of 1 and 2 (or between -1 and -2).
5
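Properties of Area Under the Normal Distribution
[Figure: normal curve with the horizontal axis marked Z = -3, -2, -1, 0, +1, +2, +3. Areas on each side of the mean: .3413 between z = 0 and 1, .1359 between z = 1 and 2, .0215 between z = 2 and 3, and .0013 beyond z = 3]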
6
Properties of Area Under the Normal Distribution
Z-scores*     Proportion under the curve
-1 to +1      .6826 (.3413 + .3413)
-2 to +2      .9544
-3 to +3      .9974
-4 to +4      1.0000
*Z-scores are expressed in standard deviation units, i.e., a z-score of -1 represents one standard deviation below (to the left of) the mean.
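The proportions in this table can be checked numerically. The sketch below is an illustration only, not part of the lecture: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `proportion_within` is hypothetical. Because the slide adds up table entries that were already rounded, the computed values can differ from the slide's in the fourth decimal place.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k: float) -> float:
    """Proportion of the area lying between z = -k and z = +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3, 4):
    print(f"-{k} to +{k}: {proportion_within(k):.4f}")
```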
7
Normal Distribution Example
A study of 2500 University of Windsor students showed that the average amount of sleep lost in the week prior to writing a statistics exam (in hours) was normally distributed with μ = 7.79 and σ = 1.75 (don’t worry, this isn’t real data!).
This distribution is shown with the abscissa (x-axis) marked in raw score and z-score units:
8
Normal Distribution Example
[Figure: normal curve with the same areas as before (.3413, .1359, .0215, .0013) and the horizontal axis marked in both units]
Z =  -3     -2     -1     0      +1     +2      +3
X =  2.54   4.29   6.04   7.79   9.54   11.29   13.04
9
Example cont.
We can see from this diagram that 34.13% of U of W students lost between 6.04 and 7.79 hours of sleep in the week prior to a stats test (between z = -1 and z = 0).
13.59% of students lost between 9.54 and 11.29 hours of sleep in that week (between z = +1 and z = +2).
49.87% of students lost between 2.54 and 7.79 hours of sleep (between z = -3 and z = 0): .0215 + .1359 + .3413 = .4987 = 49.87%.
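These percentages can also be computed directly from the raw scores. This is a minimal sketch, assuming the example's mean of 7.79 hours and standard deviation of 1.75 hours; `percent_between` is a hypothetical helper name, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

# Sleep-loss distribution from the example (hours).
sleep_lost = NormalDist(mu=7.79, sigma=1.75)


def percent_between(low: float, high: float) -> float:
    """Percentage of students whose sleep loss falls between low and high."""
    return 100 * (sleep_lost.cdf(high) - sleep_lost.cdf(low))


for low, high in [(6.04, 7.79), (9.54, 11.29), (2.54, 7.79)]:
    print(f"{low} to {high} hours: {percent_between(low, high):.2f}%")
```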
10
Properties of Area Under the Normal Distribution
The symbol z_α is used to denote the z-score having area α (alpha) to its right under the normal curve.
The proportion of area under the curve between the mean and a z-score can be found with the help of a table (Table E.10, Howell, p. 452) and a little math…
In this example, we want to know the area between the mean and z = 0.20:
Look under the column “mean to z” at z = 0.20.
The proportion = .0793
Therefore, .0793 (or almost 8%) is the proportion of data scores between the mean and the score that has a z-score of 0.20.
11
Example cont.
This means that the region between the mean and z = 0.20 has an area under the curve of .0793:
[Figure: normal curve marked at Z = 0 and Z = 0.20; the area between them is .0793 and the area beyond z = 0.20 is .4207]
12
Example cont.
Since half of the normal distribution has an area of .5000, we can determine the area beyond z = .20 by subtracting the area from the mean to z = .20 from .5000:
Area beyond z = .20 = .5000 - .0793
Area beyond z = .20 = .4207
(Note: If you look at the “smaller portion” in the table, you will see it’s .4207)
13
Example cont.
Since the normal curve is symmetrical, the area between the mean and z = -.20 is equal to the area between the mean and z = +.20:
[Figure: normal curve marked at Z = -0.20, 0 and +0.20; on each side of the mean, the area from the mean to z is .0793 and the area beyond z is .4207]
14
Normal Distribution Table
Table E.10 has 3 columns:
Mean to z
Larger portion
Smaller portion
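The three columns are related by simple arithmetic, which the following sketch makes explicit. It is not a reproduction of Table E.10: `table_row` is a hypothetical helper that computes the same three quantities with `statistics.NormalDist`, and it uses the absolute value of z because the curve is symmetrical.

```python
from statistics import NormalDist

standard_normal = NormalDist()


def table_row(z: float) -> dict:
    """Return the three table columns for a z-score (sign is ignored)."""
    z = abs(z)  # symmetry: -z and +z share the same proportions
    larger = standard_normal.cdf(z)  # area in the body
    return {
        "mean_to_z": larger - 0.5,   # area between the mean and z
        "larger_portion": larger,
        "smaller_portion": 1 - larger,  # area in the tail
    }


row = table_row(0.20)
print({name: round(value, 4) for name, value in row.items()})
```

Note that mean to z plus the smaller portion always equals .5000, and the larger and smaller portions always sum to 1.0000.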
15
Table: Mean to z
16
Table: Larger Portion
17
Table: Smaller Portion
18
A Couple of Notes
1) Always report proportions (area under the curve) to four decimal places. This means that if you report an area as a percentage, it will have two decimal places (e.g., .7943 = 79.43%).
2) When using Table E.10, be careful not to confuse z = .20 with z = .02 (this is a common mistake).
3) Remember that a negative z value has the same mean-to-z proportion as the corresponding positive z value, because the normal distribution is symmetrical.
4) When working on z-score problems, it is highly recommended that you draw a normal distribution and plot the mean, X, and their corresponding z-scores.
19
Another Example!
We often want to know what the area between two scores is, as in this example:
Assume that the marks in this class are normally distributed with μ = 69.5 and σ = 7.4. What proportion of students have marks between 50 and 80?
20
Example: Area Between 2 Scores
1) Calculate the z-scores for the X values (50 and 80):
z = (50 - 69.5)/7.4 = -19.5/7.4 = -2.64
z = (80 - 69.5)/7.4 = 10.5/7.4 = 1.42
2) Find the proportions between the mean and both z-scores (consult Table E.10):
For z = -2.64, the proportion between the mean and z is .4959.
For z = 1.42, the proportion between the mean and z is .4222.
21
Example: Area Between 2 Scores
3) Add these proportions together to find your answer:
.4959 + .4222 = .9181
This means that 91.81% of students have Stats marks between 50 and 80.
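The three steps can be sketched in code as follows. This is an illustration under stated assumptions: the helper names `z_score` and `mean_to_z` are hypothetical, `statistics.NormalDist` replaces the printed table, and z is rounded to two decimals to mimic a table lookup. Adding the two mean-to-z areas is valid here only because the two scores lie on opposite sides of the mean.

```python
from statistics import NormalDist

MEAN_MARK = 69.5
SD_MARK = 7.4


def z_score(x: float, mean: float, sd: float) -> float:
    """Step 1: convert a raw score to a z-score."""
    return (x - mean) / sd


def mean_to_z(z: float) -> float:
    """Step 2: area between the mean and z (the table's 'mean to z' column)."""
    return abs(NormalDist().cdf(z) - 0.5)


# Round to two decimals, as one would before consulting the table.
z_low = round(z_score(50, MEAN_MARK, SD_MARK), 2)
z_high = round(z_score(80, MEAN_MARK, SD_MARK), 2)

# Step 3: the scores straddle the mean, so the two areas are added.
area_between = mean_to_z(z_low) + mean_to_z(z_high)
print(z_low, z_high, round(area_between, 4))
```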
22
Smaller and Larger Portions
Smaller portion = proportion in the tail
Larger portion = proportion in the body
Using the same data (μ = 69.5 and σ = 7.4), we can calculate areas using the Smaller and Larger Portion columns in the Normal Distribution table:
Find the proportion of students who have stats marks of less than 80.6:
z = (80.6 - 69.5)/7.4 = +1.50
23
Larger Portion
Area below z = +1.50 = .9332
This means that 93.32% of students had a mark of 80.6 or less in this class.
24
Smaller Portion
Find the proportion of students who have marks of 76.93 or better:
z = (76.93 - 69.5)/7.4 = 1.00
Area in smaller portion = .1587
This means that 15.87% of students in this class had a mark of 76.93 or better.
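A short sketch of both lookups, assuming the same mean (69.5) and standard deviation (7.4). The helper `table_z` is hypothetical; it rounds z to two decimals as one would before using the table, and `statistics.NormalDist` supplies the areas.

```python
from statistics import NormalDist

MEAN_MARK = 69.5
SD_MARK = 7.4
standard_normal = NormalDist()


def table_z(x: float) -> float:
    """z-score for a mark, rounded to two decimals for a table lookup."""
    return round((x - MEAN_MARK) / SD_MARK, 2)


# Larger portion (body): proportion of marks below 80.6.
z_body = table_z(80.6)
larger_portion = standard_normal.cdf(z_body)

# Smaller portion (tail): proportion of marks of 76.93 or better.
z_tail = table_z(76.93)
smaller_portion = 1 - standard_normal.cdf(z_tail)

print(z_body, round(larger_portion, 4))
print(z_tail, round(smaller_portion, 4))
```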
25
Converting Back to X
Assume μ = 30 and σ = 5. What raw scores correspond to z = -1.00 and z = +1.50?
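Converting back means rearranging the z-score formula to X = zσ + μ. A minimal sketch of that rearrangement, using the values from this slide (the function name `z_to_raw` is hypothetical):

```python
def z_to_raw(z: float, mean: float, sd: float) -> float:
    """Convert a z-score back to a raw score: X = z * sd + mean."""
    return z * sd + mean


MEAN = 30
SD = 5

x_low = z_to_raw(-1.00, MEAN, SD)   # one standard deviation below the mean
x_high = z_to_raw(1.50, MEAN, SD)   # one and a half standard deviations above

print(x_low, x_high)
```

With these values, z = -1.00 corresponds to X = 25 and z = +1.50 corresponds to X = 37.5.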
26
Proportion
What proportion of scores lie between z = -1.00 and z = +1.50?
Area from mean to z = -1.00 = .3413
Area from mean to z = +1.50 = .4332
Add them together to get the proportion that lies between these two z-scores: .3413 + .4332 = .7745
27
Finding the Number of Observations
In this example, if we know the sample size (e.g., n = 212), we can calculate how many people lie between z = -1.00 and z = +1.50:
Area between z = -1.00 and z = +1.50 = .7745 (see the previous slide)
Multiply the proportion by n:
(.7745)(212) = 164.19
Approximately 164 people
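The same calculation as a brief sketch, assuming `statistics.NormalDist` in place of the table (variable names are illustrative):

```python
from statistics import NormalDist

SAMPLE_SIZE = 212
standard_normal = NormalDist()

# Proportion of scores between z = -1.00 and z = +1.50.
proportion_between = standard_normal.cdf(1.50) - standard_normal.cdf(-1.00)

# Expected number of people in that range.
expected_count = proportion_between * SAMPLE_SIZE

print(round(proportion_between, 4), round(expected_count))
```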
28
And a Little More
Finally, we can find a z-score from the table if we know the proportion of scores (i.e., we can work backwards):
Suppose the birth weight of newborns (in pounds) is normally distributed with μ = 7.73 and σ = 0.83.
What birth weight identifies the top (heaviest) 10% of newborns?
29
Example cont.
Look at Table E.10 and find the z-score that identifies the top proportion of .1000: look in the smaller portion column (the tail).
[Figure: normal curve with an upper tail of area .1000; the z-score at its boundary is the unknown (z = ?)]
30
Example cont.
Looking in the smaller portion column, we find that
z = 1.28 has an area of .1003
z = 1.29 has an area of .0985
Which do we pick? Pick the one that is closest to an area of .1000: this is z = 1.28
31
Example cont.
Now solve for X:
X = zσ + μ = (1.28)(0.83) + 7.73 = 1.06 + 7.73 = 8.79
So any weight equal to or greater than 8.79 pounds is in the top 10% of birth weights.
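Working backwards from a proportion to a z-score is an inverse lookup. The sketch below assumes Python's `statistics.NormalDist.inv_cdf` as a stand-in for scanning the smaller portion column; rounding z to two decimals mirrors the table's precision, and the variable names are illustrative.

```python
from statistics import NormalDist

MEAN_WEIGHT = 7.73   # pounds
SD_WEIGHT = 0.83     # pounds
TOP_PROPORTION = 0.10

# The z-score with .1000 in the upper tail has .9000 of the area below it.
z_cutoff = round(NormalDist().inv_cdf(1 - TOP_PROPORTION), 2)

# Convert back to a raw score: X = z * sd + mean.
weight_cutoff = z_cutoff * SD_WEIGHT + MEAN_WEIGHT

print(z_cutoff, round(weight_cutoff, 2))
```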
32
Probability
Everything that can possibly happen has some likelihood of happening: probability is a measure of that likelihood.
Probability: The quantitative expression of likelihood of occurrence
33
Probability
Probability is a ratio of frequencies.
The numerator (top) is the frequency of the outcome of interest.
The denominator (bottom) is the frequency of all possible outcomes.
34
Coin Toss Example
If a fair* coin is tossed in the air, it can land on either heads or tails.
This means a coin has 2 possible outcomes.
If we want to know the probability of tossing a fair* coin and having it land on heads, we calculate as follows:
*Note: fair means a normal coin, one that is not weighted differently
35
Coin Toss
p(outcome) = (frequency of the outcome of interest) / (frequency of all possible outcomes)
For a coin toss, this is 1/2.
The probability of the coin landing on heads is: p(heads) = 1/2, or p(heads) = .5
36
Another Example
Suppose there are 90 students in a class: 59 of them are women and 31 are men.
If one of the students is chosen at random, the probability of choosing a woman is:
p(woman) = 59/90
37
More Probability
If the entire class were women (i.e., there were no male students), the probability of choosing a woman would be 90/90.
If the entire class were men, the probability of choosing a woman would be 0/90.
38
More Probability
As a numerical value, probabilities can range from 0.00 to 1.00.
The numerator can range from a minimum of 0 to a maximum equal to the denominator.
39
Express Yourself!
Probability can be expressed as a fraction, e.g., p(woman) = 59/90
Or as a decimal fraction: p(woman) = .6556
Probabilities are not usually expressed as percentages (e.g., 65.56%), although they often are in popular media.
40
Probability cont.
Even if we do not know the actual observed frequencies (e.g., the number of women), probabilities can be determined theoretically.
Without throwing a die, we can deduce the probability of landing on a 5.
41
Die Example cont.
We know the die has 6 sides, so there are 6 possible outcomes.
We are only interested in one side (the 5), so the probability of landing on a 5 is:
p(5) = 1/6 = .1667
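The ratio-of-frequencies definition covers the coin, classroom, and die examples alike. A minimal sketch, assuming Python's standard `fractions.Fraction` to keep the ratio exact (the function name `probability` is hypothetical):

```python
from fractions import Fraction


def probability(freq_of_interest: int, freq_all_outcomes: int) -> Fraction:
    """Probability as (frequency of interest) / (frequency of all outcomes)."""
    if freq_all_outcomes <= 0:
        raise ValueError("there must be at least one possible outcome")
    if not 0 <= freq_of_interest <= freq_all_outcomes:
        # The numerator ranges from 0 up to the denominator.
        raise ValueError("frequency of interest must lie between 0 and the total")
    return Fraction(freq_of_interest, freq_all_outcomes)


p_heads = probability(1, 2)    # fair coin
p_woman = probability(59, 90)  # 59 women among 90 students
p_five = probability(1, 6)     # one side of a six-sided die

for label, p in [("heads", p_heads), ("woman", p_woman), ("5", p_five)]:
    print(f"p({label}) = {p} = {float(p):.4f}")
```

The range check reflects the point made earlier in the lecture: the result can never fall below 0 or rise above 1.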
42
Probability and the Normal Distribution
The normal distribution can be thought of as a probability distribution. Here’s how:
We know (from Table E.10) the proportion of scores that fall above or below a given z-score.
If you were to randomly pick a score from a sample of scores, what is the probability that you would pick a score that has a corresponding z-score of .40 or greater?
43
Probability and the Normal Distribution
The proportion of scores above or below a given z-score is the same as the probability of selecting a score above or below the z-score.
E.g., the probability of selecting a score from a normal distribution that has a z-score of .40 or greater is .3446 (the area in the smaller portion for z = .40).
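One way to see that a proportion and a probability are the same thing is to draw many random scores and count. The simulation below is only an illustration: the seed, the number of draws, and the variable names are arbitrary choices, not part of the lecture, and the observed proportion will be close to, but not exactly, the theoretical value.

```python
import random
from statistics import NormalDist

random.seed(250)      # arbitrary seed so the run is repeatable
N_DRAWS = 200_000     # arbitrary number of simulated scores
Z_THRESHOLD = 0.40

# Draw z-scores at random from a standard normal distribution.
draws = [random.gauss(0.0, 1.0) for _ in range(N_DRAWS)]

# Observed proportion of draws at or above the threshold.
observed = sum(z >= Z_THRESHOLD for z in draws) / N_DRAWS

# Theoretical probability: the smaller portion for z = .40.
theoretical = 1 - NormalDist().cdf(Z_THRESHOLD)

print(f"observed {observed:.4f}, theoretical {theoretical:.4f}")
```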
44
Example #1
Suppose people’s scores on a personality test are normally distributed with a mean of 50 and a population standard deviation of 10.
If you were to pick a person completely at random, what is the probability that you would pick someone with a score on this personality test that is higher than 60?
45
Example #1
Step #1: Write down what you know
Step #2: What do you want to find?
Step #3: Draw the normal distribution, write in the mean, standard deviation, and the X, and shade the area you are looking for
46
Example #1, Step #3
[Figure: normal curve with the X-axis marked at 20, 30, 40, 50, 60, 70 and 80]
47
Example #1
Step #4: Calculate z score(s): z = (60 - 50)/10 = 1.00
Step #5: Use Table E.10 to find the probability of selecting a score in your shaded area
Here we want p(X > 60), or p(z > 1.00)
Look up the smaller portion of z = 1.00
48
Example #1
Step #6: Interpret:
The probability of picking someone at random who has a personality test score of 60 or greater is .1587
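The six steps of Example #1 can be traced in a few lines. This is a sketch assuming `statistics.NormalDist` in place of Table E.10; the variable names are illustrative.

```python
from statistics import NormalDist

# Step 1: what we know.
MEAN_SCORE = 50
SD_SCORE = 10

# Step 2: what we want, the probability of a score higher than 60.
CUTOFF = 60

# Step 4: z-score for the cutoff.
z = (CUTOFF - MEAN_SCORE) / SD_SCORE

# Step 5: the smaller portion (upper tail) for that z-score.
p_above = 1 - NormalDist().cdf(z)

# Step 6: interpret.
print(f"z = {z:.2f}; probability of a score above {CUTOFF} is {p_above:.4f}")
```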
49
Example #2
Length of time spent waiting in line to buy tickets at the movies is normally distributed with a mean of 12 minutes and a population standard deviation of 3 minutes.
If you go to see a movie, what is the probability that you will wait in line to buy tickets for between 7.5 and 15 minutes?
50
Example #2
Step #1: Write down what you know
Step #2: What do you want to find?
Step #3: Draw the normal distribution, write in the mean, standard deviation, and both X scores, and shade the area you are looking for
51
Example #2, Step #3
[Figure: normal curve with the X-axis marked at 3, 6, 9, 12, 15, 18 and 21, with X = 7.5 also marked]
52
Example #2
Step #4: Calculate z score(s): z = (7.5 - 12)/3 = -1.50 and z = (15 - 12)/3 = 1.00
Step #5: Use Table E.10 to find the probability of selecting a score in your shaded area
Here we want p(7.5 < X < 15), or p(-1.50 < z < 1.00)
Look up the mean to z of z = 1.00: .3413
Look up the mean to z of z = -1.50: .4332
53
Example #2
Add the two areas together! (Each represents the mean to z, so adding them together gives you the overall shaded area.)
.3413 + .4332 = .7745
54
Example #2
Step #6: Interpret:
The probability of waiting in line to buy tickets at the movie for between 7.5 and 15 minutes is .7745. (Note: This means that you will wait in line for between 7.5 and 15 minutes 77.45% of the time.)
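Example #2 can be checked the same way, this time working directly in minutes rather than z-scores. A minimal sketch assuming `statistics.NormalDist`; the figure of 100 visits is a hypothetical number chosen only to illustrate the "77.45% of the time" interpretation.

```python
from statistics import NormalDist

# Waiting time in minutes: mean 12, population standard deviation 3.
wait_time = NormalDist(mu=12, sigma=3)

# Probability of waiting between 7.5 and 15 minutes: area below 15
# minus area below 7.5.
p_between = wait_time.cdf(15) - wait_time.cdf(7.5)

# Interpretation: expected number of such waits in 100 hypothetical visits.
VISITS = 100
expected_visits = p_between * VISITS

print(f"p = {p_between:.4f}; about {expected_visits:.0f} of {VISITS} visits")
```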