Published by Alfred Maxwell. Modified over 9 years ago.
1
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 9 Estimation Using a Single Sample (Confidence Intervals!)
2
Branches of Statistics
Descriptive statistics – what we’ve done so far.
Inferential statistics – what we start today! Using values obtained from a sample (statistics) to estimate values for a population (parameters).
Two main tools: confidence intervals and hypothesis testing.
3
Point Estimation
A point estimate of a population characteristic is a single number that is based on sample data and represents a plausible value of the characteristic.
4
Examples of Point Estimates
The percentage of orange Reese’s Pieces in a random sample of 25.
The average length of the Jellyblubbers in a random sample of 25.
The median size (diameter) of a random sample of 40 apples.
The standard deviation of the ages of a random sample of 125 college students.
The variance of the Algebra II grades of a random sample of 200 Algebra II students.
5
Examples of Point Estimates - Continued
A sample of 200 students at a large university is selected to estimate the proportion of students who wear contact lenses. In this sample, 47 wore contact lenses. Let π = the true proportion of all students at this university who wear contact lenses. Consider a “success” to be a student who wears contact lenses. What is the point estimate for π?
6
Example
The statistic $p = \frac{\text{number of successes in the sample}}{n}$ is a reasonable choice for a formula to obtain a point estimate for π. Such a point estimate is $p = 47/200 = 0.235$.
7
Example
A sample of weights of 34 male freshman students was obtained.
185 161 174 175 202 178
202 139 177 170 151 176
197 214 283 184 189 168
188 170 207 180 167 177
166 231 176 184 179 155
148 180 194 176
If you wanted to estimate the true mean weight of all male freshman students, you might use the sample mean as a point estimate for the true mean.
8
Example – Same Data!
After looking at a histogram and boxplot of the data, you might notice that the data seem reasonably symmetric with an outlier, so you might use either the sample median or a sample trimmed mean as a point estimate.
[Figure: histogram and boxplot of the 34 weights; horizontal axis from 140 to 260.]
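The three candidate point estimates for the weight data can be compared with a short sketch like the one below. The 10% trimming fraction and the helper name `trimmed_mean` are assumptions made for illustration; the slides do not say how much to trim.

```python
from statistics import mean, median

# Weights of the 34 male freshmen from the example.
weights = [
    185, 161, 174, 175, 202, 178,
    202, 139, 177, 170, 151, 176,
    197, 214, 283, 184, 189, 168,
    188, 170, 207, 180, 167, 177,
    166, 231, 176, 184, 179, 155,
    148, 180, 194, 176,
]

def trimmed_mean(values, fraction):
    """Mean after dropping `fraction` of the observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # number of values trimmed from each end
    return mean(ordered[k:len(ordered) - k])

sample_mean = mean(weights)
sample_median = median(weights)
sample_trimmed = trimmed_mean(weights, 0.10)  # 10% is an assumed choice

print(f"mean = {sample_mean:.2f}")
print(f"median = {sample_median:.1f}")
print(f"10% trimmed mean = {sample_trimmed:.2f}")
```

The outlier (283) pulls the sample mean above both the median and the trimmed mean, which is why the latter two may be preferred here.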
9
Bias
A statistic with mean value equal to the value of the population characteristic being estimated is said to be an unbiased statistic. A statistic that is not unbiased is said to be biased.
[Figure: the original distribution, the sampling distribution of an unbiased statistic, and the sampling distribution of a biased statistic.]
10
Bias
Another way to think of bias is this: an unbiased statistic does not systematically overestimate or underestimate. Its estimates are sometimes too high and sometimes too low, but the errors balance out so that the average estimate equals the true value.
[Figure: the original distribution, the sampling distribution of an unbiased statistic, and the sampling distribution of a biased statistic.]
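A small simulation can make the idea of bias concrete. The sketch below is an illustration only, not part of the original slides: it assumes a standard normal population (true variance 1) and compares two variance estimators, one dividing by n and one dividing by n − 1. The function name, sample size, and repetition count are all hypothetical choices.

```python
import random

def average_variance_estimates(n=5, reps=20000, seed=0):
    """Average two variance estimators over many samples from N(0, 1)."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations
        total_div_n += ss / n                   # biased estimator
        total_div_n_minus_1 += ss / (n - 1)     # unbiased estimator
    return total_div_n / reps, total_div_n_minus_1 / reps

biased_avg, unbiased_avg = average_variance_estimates()
print(f"divide by n:     {biased_avg:.3f}")
print(f"divide by n - 1: {unbiased_avg:.3f}")
```

The average of the n − 1 version should land near the true variance of 1, while the divide-by-n version should settle near (n − 1)/n = 0.8, which is what a biased sampling distribution looks like in practice.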
11
What Makes a “Good” Point Estimate?
Given a choice between several unbiased statistics that could be used for estimating a population characteristic, the best statistic to use is the one with the smallest standard deviation.
[Figure: several unbiased sampling distributions; the one with the smallest standard deviation is the best choice.]
12
Point Estimates - Summary
Unbiased v. biased: unbiased is preferred.
Small standard error is good.
What is standard error? The standard deviation of the sampling distribution of a statistic.
13
Confidence Intervals
A point estimate by itself is of limited value in estimating a parameter. Because of sampling variability, we know a point estimate can vary widely from sample to sample and is seldom exactly equal to the actual parameter.
14
Confidence Intervals
So…instead, we find a range of values that we can say with some degree of certainty contains the parameter. A confidence interval for a population characteristic (parameter) is an interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic will be captured inside the interval.
15
Confidence Intervals: A Better Way!
An interval estimate with an associated measure of precision.
I am 95% confident that the true proportion of U.S. adults who believe that affirmative action programs should continue is between .499 and .561.
I am 93% confident that the true mean number of students per 3rd hour class at MHS is 25 ± 4. The ± 4 is called the bound on the error.
16
More Examples
I am 99% confident that the true mean annual radiation exposure for Diablo Canyon Nuclear Power Plant Unit 2 workers is between .412 and .550 rem.
I am 90% confident that in 1993 the true mean salary for married men who received MBAs in the late 70s and who were the sole source of family income was between $121,406.03 and $127,613.97.
Figure out the bound on the error for each of these.
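For a symmetric interval, the statistic is the midpoint and the bound on the error is half the width. A minimal sketch for the two intervals above follows; the function name `bound_on_error` is a hypothetical helper, not something from the slides.

```python
def bound_on_error(lower, upper):
    """Return (point estimate, bound on the error) for a symmetric interval."""
    center = (lower + upper) / 2  # the statistic sits in the middle
    bound = (upper - lower) / 2   # half the interval width
    return center, bound

radiation = bound_on_error(0.412, 0.550)
salary = bound_on_error(121406.03, 127613.97)

print(f"radiation: {radiation[0]:.3f} ± {radiation[1]:.3f} rem")
print(f"salary: ${salary[0]:,.2f} ± ${salary[1]:,.2f}")
```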
17
Statistic ± Bound on the Error
Getting the statistic is easy. How do we get the “bound on the error”? We’ll call it “error” for short.
Formula: critical value × standard error.
For p-hat that means: $z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where z* is based on the “confidence level” (how certain you want to be).
18
Steps to Creating a Confidence Interval for Estimating π
To create a confidence interval with a 95% level of confidence, we take 95% of the area under the normal curve, right out of the center! Next, calculate the z-scores that define the boundaries of this area. These are the critical values.
[Figure: normal curve centered at the actual value of π, with the central 95% shaded between the critical values z*1 and z*2.]
19
The Concept!
Only one of the possible point estimates is actually correct. But if we add or subtract this much (the distance from π to either critical value) from every possible value of p-hat, then all of the resulting intervals created by the p-hats in the shaded region will contain the actual value of π.
[Figure: normal curve centered at the actual value of π, with the central 95% shaded between the critical values z*1 and z*2.]
20
Continuing the Steps
Look up the z-scores that are the boundaries for the middle 95% of the normal curve. They are just additive inverses of each other (±1.96). Their common magnitude is the critical value for a 95% confidence interval.
Find the standard error for p-hat. The critical value times the standard error gives the actual distance from π to the boundaries.
[Figure: normal curve centered at the actual value of π, with the central 95% shaded between the critical values z*1 and z*2.]
21
But Wait a Minute!!!
We don’t know the value of π, so use p-hat. And how do we know this is normal?
Requirements for creating a z-confidence interval for π:
1. P-hat must come from a random sample.
2. The sample size must be large enough that n(p-hat) ≥ 10 and n(1 – p-hat) ≥ 10. (This allows us to say that p-hat has an approximately normal distribution and allows us to use p-hat to estimate π.)
3. The sample must be less than 5% (or 10%) of the population.
If these requirements are met, then we can proceed.
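The two numerical requirements can be checked mechanically; randomness of the sample cannot, and has to be judged from how the data were collected. The sketch below is one possible way to organize the check. The function name, the dictionary keys, and the population size of 20,000 used in the example are all assumptions for illustration.

```python
def z_interval_conditions(successes, n, population_size=None, max_fraction=0.10):
    """Check the large-sample and sample-fraction requirements.

    Whether the sample is random cannot be verified from the counts alone.
    """
    failures = n - successes
    # n * p-hat equals the number of successes; n * (1 - p-hat) the failures.
    large_enough = successes >= 10 and failures >= 10
    if population_size is None:
        small_fraction = None  # unknown: must be argued, not computed
    else:
        small_fraction = n < max_fraction * population_size
    return {"large_enough": large_enough, "small_fraction": small_fraction}

# Contact lens example: 47 of 200; the population size is a made-up value.
print(z_interval_conditions(47, 200, population_size=20000, max_fraction=0.05))
```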
22
Large-sample Confidence Interval for a Population Proportion
[Figure: normal curve with the central 95% shaded between the critical values z1 and z2.]
23
Confidence Level
The confidence level associated with a confidence interval estimate is the success rate of the method used to construct the interval. Even though it is written as a percentage, the confidence level is not the probability that a particular computed interval contains the parameter! It describes how often the method succeeds over many repeated samples.
24
So Now…what does confidence mean in each of these cases?
I am 99% confident that the true proportion of MHS students who are “middle children” is between .412 and .550.
I am 90% confident that in 1993 the true proportion of married men who received MBAs in the late 70s and who were the sole source of family income was between .12 and .185.
What were the p-hats in each of these cases?
25
Reese’s Pieces
Our class did an M&M lab. In past years we have done similar labs with Reese’s Pieces. Either way results suggest that even though sample values vary depending on which sample you happen to pick, there seems to be a pattern to the variation. We need more samples to investigate this pattern more thoroughly, however. Since it is time-consuming (and possibly fattening) to literally sample candies, we will use the TI-83 calculator to simulate the process.
26
Reese’s Pieces
To perform these simulations we need to suppose that we know the actual value of the parameter. Let us suppose that 45% of the population is orange. Use the TI-83 calculator to draw 500 samples of 25 candies each. (Pretend that this is really 500 students, each taking 25 candies and counting the number of orange ones.)
randBin(25, .45, 500) → L1
L1/25 → L2
(This will take time and battery power.) Then look at a display of the sample proportions of orange obtained, and sketch it.
27
Reese’s Pieces
Record the mean and standard deviation of these sample proportions. Roughly speaking, are there more sample proportions close to the population proportion (which we said to be .45) than there are far from it?
28
Reese’s Pieces
Let us quantify the previous question. Use the TI-83 calculator to count how many of the 500 sample proportions are within ±.10 of .45 (i.e. between .35 and .55). Then repeat for within ±.20 and for within ±.30.
SortA(L2)
Record the results:
                       Number of the 500 sample proportions    Percentage of these sample proportions
within ±.10 of .45
within ±.20 of .45
within ±.30 of .45
Phone Home!* (*E.T. reference)
29
Reese’s Pieces
Suppose that each of the 500 imaginary students was to estimate the population proportion of orange candies by going a distance of .20 on either side of her/his sample proportion. What percentage of the 500 students would capture the actual population proportion (.45) within this interval?
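The calculator simulation can also be sketched in Python. This is an assumed stand-in for the TI-83 steps, not a transcription of them: the function names and the fixed seed are hypothetical, and the counts will differ from any particular calculator run because the random draws differ.

```python
import random

def simulate_sample_proportions(p=0.45, n=25, samples=500, seed=1):
    """Draw `samples` samples of n candies and return each sample proportion."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(samples):
        orange = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(orange / n)
    return proportions

def count_within(proportions, center, distance):
    """Count sample proportions no farther than `distance` from `center`."""
    return sum(1 for x in proportions if abs(x - center) <= distance)

props = simulate_sample_proportions()
for d in (0.10, 0.20, 0.30):
    c = count_within(props, 0.45, d)
    print(f"within ±{d:.2f} of .45: {c} of {len(props)} ({100 * c / len(props):.1f}%)")
```

A student's interval (sample proportion ± .20) captures .45 exactly when that student's sample proportion is within .20 of .45, so the second count also answers the capture question.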
30
Reese’s Pieces
Forgetting that you actually (think you) know the population proportion of orange candies to be .45, suppose that you were one of these 500 imaginary students. Would you have any way of knowing definitively whether your sample proportion was within .20 of the population proportion? Would you be reasonably “confident” that your sample proportion was within .20 of the population proportion? Explain why.
31
The 95% Confidence Interval
32
Example
For a project, a student randomly sampled 182 other students at a large university to determine if the majority of students were in favor of a proposal to build a field house. He found that 75 were in favor of the proposal.
33
π = the true proportion of students that favor the proposal.
Requirements:
1. It is given to be a random sample.
2. np = 182(0.4121) = 75 > 10 and n(1 − p) = 182(0.5879) = 107 > 10.
3. It is reasonable to assume that 182 students is less than 5% of the students attending a large university (182/.05 = 3640, so the university needs more than 3640 students).
4. I will create a 95% z-confidence interval for π:
$0.4121 \pm 1.96\sqrt{\frac{(0.4121)(0.5879)}{182}} = 0.4121 \pm 0.0715$
I am 95% confident that the true proportion of students that favor the proposal is between 0.341 and 0.484.
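The interval in this example can be reproduced with a few lines of Python. This is a sketch of the large-sample z interval under the requirements listed above; the function name `proportion_z_interval` is a hypothetical helper.

```python
from math import sqrt

def proportion_z_interval(successes, n, z_star=1.96):
    """Large-sample z confidence interval for a population proportion."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    bound = z_star * standard_error  # bound on the error
    return p_hat - bound, p_hat + bound

low, high = proportion_z_interval(75, 182)
print(f"({low:.3f}, {high:.3f})")  # (0.341, 0.484)
```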
34
Confidence Interval for π on the TI-84
35
So…The General Confidence Interval Formula for a Population Proportion
The general formula for a confidence interval for a population proportion when
1. p is the sample proportion from a random sample,
2. the sample size n is large (np ≥ 10 and n(1 − p) ≥ 10), and
3. n < .05N (the sample is less than 5% of the population)
is given by
$p \pm (z \text{ critical value})\sqrt{\frac{p(1-p)}{n}}$
36
Finding a z Critical Value
Finding a z critical value for a 98% confidence interval: the central 98% leaves 1% in each tail, so the cumulative area to the left of the upper critical value is 0.9900. Looking up the cumulative area of 0.9900 in the body of the table, we find z = 2.33.
How would we do this on the calculator?
37
Some Common Critical Values
Confidence level    z critical value
80%                 1.28
90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.58
99.8%               3.09
99.9%               3.29
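As an alternative to the table or the calculator, the critical values can be computed from the inverse of the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `z_critical` is a hypothetical helper.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z critical value for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2             # area in each tail
    return NormalDist().inv_cdf(1 - tail)   # cumulative area to the left

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}: {z_critical(level):.3f}")
```

Rounded to the precision shown in the table, these values agree with the common critical values listed above.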
38
Terminology Review
The standard error of a statistic is the estimated standard deviation of the statistic.
39
Review Terminology
The bound on error of estimation, B, associated with a 95% confidence interval is (1.96)·(standard error of the statistic).
The bound on error of estimation, B, associated with a confidence interval is (z critical value)·(standard error of the statistic).
40
Sample Size
The sample size required to estimate a population proportion π to within an amount B with 95% confidence is
$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2$
The value of π may be estimated by prior information. If no prior information is available, use π = 0.5 in the formula to obtain a conservatively large value for n. Generally one rounds the result up to the nearest integer.
41
Sample Size Calculation Example
A TV executive would like to find a 95% confidence interval estimate within 0.03 for the proportion of all households that watch NYPD Blue regularly. How large a sample is needed if a prior estimate for π was 0.15?
We have B = 0.03 and the prior estimate of π = 0.15:
$n = (0.15)(0.85)\left(\frac{1.96}{0.03}\right)^2 = 544.2$
A sample of 545 or more would be needed.
42
Sample Size Calculation Example Revisited
Suppose a TV executive would like to find a 95% confidence interval estimate within 0.03 for the proportion of all households that watch NYPD Blue regularly. How large a sample is needed if we have no reasonable prior estimate for π?
We have B = 0.03 and should use π = 0.5 in the formula:
$n = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 = 1067.1$
The required sample size is now 1068.
Notice that a reasonable ballpark estimate for π can lower the needed sample size.
43
Another Example
A college professor wants to estimate the proportion of students at a large university who favor building a field house with a 99% confidence interval accurate to 0.02. If one of his students performed a preliminary study and estimated π to be 0.412, how large a sample should he take?
We have B = 0.02, a prior estimate π = 0.412, and we should use the z critical value 2.58 (for a 99% confidence interval):
$n = (0.412)(0.588)\left(\frac{2.58}{0.02}\right)^2 = 4031.4$
The required sample size is 4032.
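The three sample size calculations follow the same formula, so they can be checked together. The sketch below assumes the rounded critical values used on the slides (1.96 and 2.58); the function name `sample_size_for_proportion` is a hypothetical helper.

```python
from math import ceil

def sample_size_for_proportion(bound, prior=0.5, z_star=1.96):
    """Smallest n that estimates a proportion to within `bound`.

    `prior` is a prior estimate of the proportion; 0.5 is the
    conservative choice when no prior information is available.
    """
    return ceil(prior * (1 - prior) * (z_star / bound) ** 2)

print(sample_size_for_proportion(0.03, prior=0.15))                # NYPD Blue, prior 0.15
print(sample_size_for_proportion(0.03))                            # NYPD Blue, no prior
print(sample_size_for_proportion(0.02, prior=0.412, z_star=2.58))  # field house, 99%
```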
44
One-Sample z Confidence Interval for μ
If
1. $\bar{x}$ is the sample mean from a random sample,
2. the sample size n is large (generally n ≥ 30), and
3. σ, the population standard deviation, is known,
then the general formula for a confidence interval for a population mean μ is given by
$\bar{x} \pm (z \text{ critical value})\frac{\sigma}{\sqrt{n}}$
45
One-Sample z Confidence Interval for μ
Notice that this formula works when σ is known and either
1. n is large (generally n ≥ 30), or
2. the population distribution is normal (any sample size).
If n is small (generally n < 30) but it is reasonable to believe that the distribution of values in the population is normal, a confidence interval for μ (when σ is known) is
$\bar{x} \pm (z \text{ critical value})\frac{\sigma}{\sqrt{n}}$
46
Example
A certain filling machine has a true population standard deviation σ = 0.228 ounces when used to fill catsup bottles. A random sample of 36 “6 ounce” bottles of catsup was selected from the output from this machine and the sample mean was 6.018 ounces.
Find a 90% confidence interval estimate for the true mean fill of catsup from this machine.
47
Example I (continued)
The z critical value is 1.645.
$6.018 \pm 1.645\frac{0.228}{\sqrt{36}} = 6.018 \pm 0.063$
90% Confidence Interval: (5.955, 6.081)
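The catsup interval can be verified with a short sketch of the one-sample z interval for a mean. It assumes σ is known, as stated in the example; the function name `mean_z_interval` is a hypothetical helper.

```python
from math import sqrt

def mean_z_interval(x_bar, sigma, n, z_star):
    """z confidence interval for a population mean when sigma is known."""
    bound = z_star * sigma / sqrt(n)  # critical value times standard error
    return x_bar - bound, x_bar + bound

low, high = mean_z_interval(x_bar=6.018, sigma=0.228, n=36, z_star=1.645)
print(f"({low:.3f}, {high:.3f})")  # (5.955, 6.081)
```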
48
σ Unknown [All Size Samples]
W. S. Gosset, an English statistician working at the Guinness brewery in Dublin, Ireland, developed the techniques and derived the Student’s t distributions that describe the behavior of $\frac{\bar{x}-\mu}{s/\sqrt{n}}$.
49
t Distributions
If X is a normally distributed random variable, the statistic $t = \frac{\bar{x}-\mu}{s/\sqrt{n}}$ has a “t” distribution with df = n − 1 degrees of freedom, where $\bar{x}$ and s are the mean and standard deviation of a random sample of size n.
50
t Distributions
51
t Distributions
Notice: As df increase, t distributions approach the standard normal distribution. Since each t distribution would require a table similar to the standard normal table, we usually only create a table of critical values for the t distributions.
53
One-Sample t Procedures
Suppose that an SRS of size n is drawn from a population having unknown mean μ. The general confidence limits are
$\bar{x} \pm t^*\frac{s}{\sqrt{n}}$ (with t* taken from the t distribution with df = n − 1)
and the general confidence interval for μ is
$\left(\bar{x} - t^*\frac{s}{\sqrt{n}},\ \bar{x} + t^*\frac{s}{\sqrt{n}}\right)$
54
Confidence Interval Example
Ten randomly selected shut-ins were each asked to list how many hours of television they watched per week. The results are
82 66 90 84 75
88 80 94 110 91
Find a 90% confidence interval estimate for the true mean number of hours of television watched per week by shut-ins.
55
Confidence Interval Example
Calculating the sample mean and standard deviation, we have n = 10, $\bar{x}$ = 86, s = 11.842.
We find the critical t value of 1.833 by looking on the t table in the row corresponding to df = 9, in the column with bottom label 90%.
Computing the confidence interval for μ:
$86 \pm 1.833\frac{11.842}{\sqrt{10}} = 86 \pm 6.864$, that is, (79.136, 92.864).
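The summary statistics and the t interval for the television data can be reproduced as sketched below. The Python standard library has no t quantile function, so the critical value 1.833 is taken from the slide (df = 9, 90% confidence) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Weekly television hours for the ten shut-ins in the example.
hours = [82, 66, 90, 84, 75, 88, 80, 94, 110, 91]

t_star = 1.833  # t critical value for df = 9 and 90% confidence, from the slide

x_bar = mean(hours)
s = stdev(hours)                       # sample standard deviation (divides by n - 1)
bound = t_star * s / sqrt(len(hours))  # critical value times standard error
low, high = x_bar - bound, x_bar + bound

print(f"x-bar = {x_bar}, s = {s:.3f}")
print(f"90% confidence interval: ({low:.2f}, {high:.2f})")
```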
56
Confidence Interval Example
To calculate the confidence interval, we had to assume that weekly viewing times are normally distributed. Consider the normal plot of the 10 data points produced with Minitab that is given on the next slide.
57
Confidence Interval Example
Notice that the normal plot looks reasonably linear, so it is reasonable to assume that the number of hours of television watched per week by shut-ins is normally distributed.
Anderson-Darling Normality Test: A-Squared: 0.226, P-Value: 0.753
Typically, if the p-value is more than 0.05, we do not reject normality and treat the assumption that the distribution is normal as reasonable.