1
Statistics for Business and Economics
STT 315: Section 202 Instructor: Han Wang
2
Confidence interval for population mean µ
3
Statistical Inference
Inference means drawing a conclusion about a population parameter based on a statistic calculated from a sample. Conclusions made using statistical inference are probabilistic in nature: we cannot state them with certainty, but we can state them with a specified level of confidence. There are two types of inference: confidence intervals and hypothesis tests.
4
Setting
Suppose we take a random sample of size n from a population with mean µ and standard deviation σ. The sample mean x̄ serves as a point estimator of the population mean µ.
Goal: to construct a 100(1−α)% confidence interval for the population mean µ. Here α is called the significance level and 1−α is called the confidence level.
The procedure depends on two things: whether the sample size n is large enough, and whether σ is known or unknown.
5
Recall: sampling distribution of x̄
Suppose we draw a random sample of size n from a population with mean µ and standard deviation σ. Then the sample mean x̄ has the following properties:
The expectation of x̄ is equal to µ.
The standard deviation of x̄ is equal to σ/√n.
Furthermore, for a large sample size (n ≥ 30), x̄ is approximately N(µ, σ/√n).
6
Confidence interval for µ
If Z ∼ N(0, 1), then z_{α/2} is the value such that P(Z > z_{α/2}) = α/2. Thus P(−z_{α/2} < Z < z_{α/2}) = 1−α. Since x̄ ∼ N(µ, σ/√n), we have Z = (x̄ − µ)/(σ/√n) ∼ N(0, 1). Substituting this Z into the inequality and solving for µ gives P(x̄ − z_{α/2}×σ/√n < µ < x̄ + z_{α/2}×σ/√n) = 1−α. Therefore, there is roughly a 1−α probability that the interval (x̄ − z_{α/2}×σ/√n, x̄ + z_{α/2}×σ/√n) contains the true population mean µ.
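The coverage claim above can be checked by simulation. The following is a minimal sketch, not part of the original slides: the population values (µ = 50, σ = 10), the sample size, the number of trials, and the function name `coverage` are all illustrative assumptions. It draws many samples from a normal population with known σ and counts how often the z-interval contains µ.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=40, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)                  # fixed seed so runs are repeatable
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    half_width = z * sigma / n ** 0.5          # z_{alpha/2} * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

print(coverage())  # should be close to 1 - alpha = 0.95
```

The printed fraction varies with the seed and the number of trials, but it should stay near the stated confidence level.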
7
Confidence interval for µ
If the sample size is large enough (i.e., n ≥ 30), then the 100(1−α)% confidence interval for µ is:
(x̄ − z_{α/2}×σ/√n, x̄ + z_{α/2}×σ/√n) if σ is known
(x̄ − z_{α/2}×s/√n, x̄ + z_{α/2}×s/√n) if σ is unknown
where x̄ is the sample mean and s is the sample standard deviation.
If the sample is not large enough, we need to assume that the population is normally distributed.
Use the calculator to compute the confidence interval for µ.
8
Example
In a sample of 82 MSU undergraduates, the mean number of Facebook friends was friends with standard deviation of friends. Use this information to make a 95% confidence interval for the average number of Facebook friends MSU undergraduates have.
Press [STAT], select [TESTS], choose ZInterval, and select Stats. Input the following: σ: x̄: n: 82 C-Level: 95. Choose Calculate and press [ENTER].
Conclusion: the 95% C.I. for µ is (520.19, ).
9
Margin of Error
If the sample is from a normally distributed population with known standard deviation σ, then the 100(1−α)% confidence interval for µ is:
(x̄ − z_{α/2}×σ/√n, x̄ + z_{α/2}×σ/√n)
The margin of error is ME = z_{α/2}×σ/√n.
The width of the confidence interval is 2ME = 2×z_{α/2}×σ/√n.
Q: How do we find z_{α/2}? A: invNorm(1−α/2, 0, 1).
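For readers without a TI calculator, the critical value and the margin of error can be sketched in Python. This is an illustrative equivalent, not something from the slides: `NormalDist().inv_cdf` from the standard library plays the role of invNorm, and the function names `z_critical` and `margin_of_error` are made up for this example.

```python
from statistics import NormalDist

def z_critical(conf_level):
    """Return z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - conf_level
    # Same role as invNorm(1 - alpha/2, 0, 1) on the calculator.
    return NormalDist(0, 1).inv_cdf(1 - alpha / 2)

def margin_of_error(conf_level, sigma, n):
    """ME = z_{alpha/2} * sigma / sqrt(n), for known sigma."""
    return z_critical(conf_level) * sigma / n ** 0.5

# Critical values for some common confidence levels.
for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```

The loop prints the familiar critical values, roughly 1.645, 1.960, 2.326 and 2.576.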
10
Choice of sample size n
By the definition of the margin of error, ME = z_{α/2}×σ/√n. It is determined by three factors: the confidence level 1−α, the population standard deviation σ, and the sample size n.
Larger σ, larger ME
Larger α (lower confidence level), smaller ME
Larger n, smaller ME
Given the confidence level and the standard deviation, we can find the optimal sample size for a particular margin of error by solving the formula above for n: n = (z_{α/2}×σ/ME)²
Note: always round up when computing the optimal sample size n.
11
Example
The number of bolts produced each hour by a particular machine is normally distributed with a standard deviation of 7.4. For a random sample of 15 hours, the average number of bolts produced was 587.3. Find a 98% confidence interval for the population mean number of bolts produced per hour.
Press [STAT], select [TESTS], choose ZInterval, and select Stats. Input the following: σ: 7.4, x̄: 587.3, n: 15, C-Level: 98. Choose Calculate and press [ENTER].
Conclusion: the 98% confidence interval for µ is (582.86, 591.74).
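The ZInterval computation in this example can also be reproduced without the calculator. The sketch below is an illustration only: the function name `z_interval` is hypothetical, and it assumes, as the example does, that σ is known.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf_level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # z_{alpha/2}
    me = z * sigma / n ** 0.5                            # margin of error
    return xbar - me, xbar + me

# Bolt example: sigma = 7.4, sample mean 587.3, n = 15, 98% confidence.
low, high = z_interval(xbar=587.3, sigma=7.4, n=15, conf_level=0.98)
print(round(low, 2), round(high, 2))
```

The lower endpoint should agree with the 582.86 reported by the calculator.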
12
Example
The number of bolts produced each hour by a particular machine is normally distributed with a standard deviation of 7.4. For a random sample of 15 hours, the average number of bolts produced was 587.3. We want the margin of error of the 98% confidence interval for the population mean number of bolts produced per hour to be 3.5. What is the optimal sample size?
We found that the 98% confidence interval for µ is (582.86, 591.74). Therefore, width = 591.74 − 582.86 = 8.88, and the current margin of error with n = 15 is ME = width/2 = 4.44, which is larger than the target of 3.5.
We know that n = (z_{α/2}×σ/ME)², with α = 1 − 98% = 0.02. Hence z_{α/2} = z_{0.01} = invNorm(1−0.02/2, 0, 1) = invNorm(0.99, 0, 1) = 2.326.
Therefore n = (2.326×7.4/3.5)² = 24.2, and the optimal sample size is 25.
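The sample-size calculation can be sketched as a short function. This is an illustration under the same assumption as the slide (known σ); the name `sample_size_for_mean` is made up for this example, and `math.ceil` implements the "always round up" rule.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, me, conf_level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= me."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # z_{alpha/2}
    return math.ceil((z * sigma / me) ** 2)             # always round up

# Bolt example: sigma = 7.4, target margin of error 3.5, 98% confidence.
print(sample_size_for_mean(sigma=7.4, me=3.5, conf_level=0.98))  # 25
```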
13
Confidence interval for µ
When the sample size is n < 30 and the population standard deviation σ is unknown, the previous confidence interval formula for µ cannot be applied. First, we must replace σ with the sample standard deviation s. Moreover, unlike the large-sample case, we can no longer use z_{α/2} (i.e., the standard normal distribution). Instead, Student's t-distribution comes to the rescue. The t-distribution is a symmetric, continuous distribution centered at 0. Each t-distribution has an associated number of degrees of freedom (df). For our problem, df = n−1.
14
t-distribution (t_{α/2; df})
If T ∼ t_{df}, then t_{α/2; df} is the value such that P(T > t_{α/2; df}) = α/2. Thus P(−t_{α/2; df} < T < t_{α/2; df}) = 1−α.
15
Confidence interval for µ
If the sample is from a normally distributed population with unknown σ, then the 100(1−α)% confidence interval for µ is:
(x̄ − t_{α/2; n−1}×s/√n, x̄ + t_{α/2; n−1}×s/√n)
where x̄ is the sample mean and s is the sample standard deviation.
The margin of error is ME = t_{α/2; n−1}×s/√n.
The width of the confidence interval is 2ME.
Use TInterval on the TI-83/84 to compute the confidence interval when σ is unknown.
16
Example
The Daytona Beach Tourism Commission is interested in the average amount of money a typical college student spends per day during spring break. They know that daily spending follows a normal distribution, and they randomly select 25 students for a survey. The mean spending is $63.57 and the standard deviation is $17.32. Develop a 97% confidence interval for the population mean daily spending during the break.
Press [STAT], select [TESTS], choose TInterval, and select Stats. Input the following: x̄: 63.57, Sx: 17.32, n: 25, C-Level: 97. Choose Calculate and press [ENTER].
Conclusion: the 97% confidence interval for µ is (55.58, 71.56).
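The TInterval result can be approximated using only the Python standard library. The sketch below is an illustration, not the calculator's algorithm: it finds the critical value t_{α/2; n−1} by integrating the t density numerically (Simpson's rule) and bisecting on the resulting CDF. All function names are made up for this example, and in practice a statistics library would normally supply the t quantile directly.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0: 0.5 plus Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level, df):
    """Find t_{alpha/2; df} by bisection on the CDF."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(xbar, s, n, conf_level):
    """Confidence interval for the mean when sigma is unknown (df = n - 1)."""
    me = t_critical(conf_level, n - 1) * s / math.sqrt(n)
    return xbar - me, xbar + me

# Spring break example: mean 63.57, s = 17.32, n = 25, 97% confidence.
low, high = t_interval(xbar=63.57, s=17.32, n=25, conf_level=0.97)
print(round(low, 2), round(high, 2))
```

The endpoints should be close to the (55.58, 71.56) reported by TInterval.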
17
Confidence interval for population proportion p
18
Example
Suppose I want to estimate the percentage of MSU undergraduate students who smoke tobacco. A random sample of 99 undergraduate students was selected, and 17 of them smoked tobacco last week. We want to make a 95% confidence interval for the proportion of MSU undergraduates who smoke, based on this information.
19
Example
Check the conditions. First, it is a random sample.
Second, although the sample is drawn without replacement, it satisfies the 10% condition: the sample size (99) is smaller than 10% of the MSU undergraduate population. Third, both the number of smokers (17) and the number of non-smokers (82) are larger than 10, so the sample size can be considered large enough.
20
Construct a confidence interval
The sampling distribution results guarantee that the sample proportion (p̂) will be roughly normally distributed around the population proportion (p). So about 95% of sample proportions should fall within two standard deviations of the population proportion. But we do not know the population proportion (that is what we are trying to estimate). Therefore, we need to use the sample proportion to work backwards.
21
Construct a confidence interval
In our sample, 17 out of 99 students smoked tobacco in the last week, so the sample proportion is p̂ = 17/99 = 17.2%. We shall use 17.2% to construct an interval for the value of the parameter (the population proportion in this problem). To make a 95% confidence interval, we create an interval that extends 2 standard deviations above and below the sample proportion.
The standard deviation is √(p̂(1−p̂)/n) = √(17.2%×(1−17.2%)/99) = 0.0379.
Therefore, 2 standard deviations is 2(0.0379) = 0.0758 (the margin of error).
22
Construct a confidence interval
Hence, a 95% confidence interval has endpoints at 0.172 − 0.0758 = 0.0962 and 0.172 + 0.0758 = 0.2478.
Conclusion: We are 95% confident that between 9.62% and 24.78% of MSU undergraduates smoke tobacco.
If we want to make a 68% confidence interval, we only have to extend the interval one standard deviation from the sample proportion in each direction. A 68% confidence interval has endpoints at 0.172 − 0.0379 = 0.1341 and 0.172 + 0.0379 = 0.2099.
Conclusion: We are 68% sure that between 13.4% and 21.0% of MSU undergraduates smoke tobacco.
23
Interpretation of confidence interval
Our 95% CI for smokers was 9.62% to 24.78%. This means that (find the correct one):
95% of random samples of MSU undergraduates will have between 9.62% and 24.78% smokers.
Between 9.62% and 24.78% of MSU undergraduates smoke.
95% of MSU undergraduates smoke between 9.62% and 24.78% of the time.
We are 95% sure that between 9.62% and 24.78% of MSU undergraduates smoke.
24
Standard Error
If subjects are independent and the sample size is large enough, then the sample proportion p̂ is approximately normally distributed with mean p and standard deviation √(p(1−p)/n). Namely, p̂ ∼ N(p, √(p(1−p)/n)).
But in an estimation problem, p is unknown. So we replace the population proportion p with the sample proportion p̂ and get the standard error of the sample proportion: SE(p̂) = √(p̂(1−p̂)/n).
25
Confidence interval for p
The 100(1−α)% confidence interval for p is given by
(p̂ − z_{α/2}×√(p̂(1−p̂)/n), p̂ + z_{α/2}×√(p̂(1−p̂)/n))
where z_{α/2} is determined as before (i.e., invNorm(1−α/2, 0, 1)), p̂ is the sample proportion, and n is the sample size.
Margin of error: ME = z_{α/2}×√(p̂(1−p̂)/n)
We can use the calculator to compute the confidence interval for p.
26
Example revisited
Q: We want to make an 85% confidence interval for the proportion of smokers among MSU undergraduates. In a random sample of 99 MSU undergraduates, 17 smoked tobacco last week.
Press [STAT], select [TESTS], and choose 1-PropZInt. Input the following: x: 17, n: 99, C-Level: 85. Choose Calculate and press [ENTER].
Conclusion: the 85% confidence interval for p is (0.117, 0.226).
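The 1-PropZInt computation can be sketched in Python as follows. This is an illustration rather than the calculator's own code: the function name `one_prop_z_interval` is hypothetical, and it assumes the large-sample conditions checked earlier hold.

```python
from statistics import NormalDist

def one_prop_z_interval(x, n, conf_level):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n                                        # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)   # z_{alpha/2}
    se = (p_hat * (1 - p_hat) / n) ** 0.5                # standard error
    return p_hat - z * se, p_hat + z * se

# Smoking example: 17 smokers out of 99 students, 85% confidence.
low, high = one_prop_z_interval(x=17, n=99, conf_level=0.85)
print(round(low, 3), round(high, 3))
```

The endpoints should agree with the calculator's (0.117, 0.226) to three decimal places.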
27
Confidence interval construction
Since the sample proportion p̂ is approximately normal for large n, we can conclude that:
Within one standard error of the sample proportion, we are about 68% sure that the population proportion is covered.
Within two standard errors of the sample proportion, we are about 95% sure that the population proportion is covered.
28
Difference
Find the exact area between −2 and 2 standard deviations from the mean on a normal curve using the calculator. Hint: normalcdf(-2,2,0,1) = 0.954 = 95.4%. So this is not exactly 95%, but slightly more. On the other hand, the area between −1.96 and 1.96 standard deviations from the mean is normalcdf(-1.96,1.96,0,1) = 0.95 = 95%. Using 1.96, we get the 95% confidence interval for p to be (0.097, 0.246). Note: the calculator uses 1.96.
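The two normalcdf values can be checked with Python's standard library. In this small sketch the function name `central_area` is made up for illustration; `NormalDist.cdf` stands in for the calculator's normalcdf.

```python
from statistics import NormalDist

def central_area(k):
    """Area under the standard normal curve between -k and k."""
    std_normal = NormalDist(0, 1)
    return std_normal.cdf(k) - std_normal.cdf(-k)

print(round(central_area(2), 4))     # about 0.9545, slightly more than 95%
print(round(central_area(1.96), 4))  # about 0.95
```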
29
Example
The sample input shows how to find a 99% confidence interval from a sample of 4040 people, 2048 of whom are smokers. We would interpret the sample output as: “We are 99% confident that between 48.7% and 52.7% of the population smokes.”
30
Width of confidence interval
Since the confidence interval for p is
(p̂ − z_{α/2}×√(p̂(1−p̂)/n), p̂ + z_{α/2}×√(p̂(1−p̂)/n)) = (p̂ − ME, p̂ + ME),
the width of the confidence interval is 2×ME. Therefore, if we know the width of the interval, we can compute the margin of error as ME = width/2.
Example: A 90% confidence interval for p is (0.23, 0.37). Find the sample proportion and the margin of error of this interval.
Solution: The width is 0.37 − 0.23 = 0.14, so the margin of error is 0.14/2 = 0.07. Moreover, p̂ − ME = 0.23, so p̂ = 0.23 + 0.07 = 0.30.
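Working backwards from an interval to its center and margin of error takes only two lines of arithmetic. In this sketch the helper name `center_and_margin` is hypothetical.

```python
def center_and_margin(low, high):
    """Return (sample proportion, margin of error) for an interval (low, high)."""
    me = (high - low) / 2   # ME = width / 2
    p_hat = low + me        # the interval is centered at the sample proportion
    return p_hat, me

# The 90% interval (0.23, 0.37) from the example.
p_hat, me = center_and_margin(0.23, 0.37)
print(round(p_hat, 2), round(me, 2))
```

Note that the confidence level is not needed for this step; it would only matter if we also wanted to recover the sample size.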
31
Example revisited
We found that 17.2% of a sample of 99 MSU undergraduates had smoked in the past week. Find a 95% confidence interval for the proportion of MSU undergraduates who smoke.
The 95% confidence interval is (0.097, 0.246). The width of the interval is 0.246 − 0.097 ≈ 0.15, so the margin of error is 0.15/2 = 0.075.
If we want to reduce the margin of error while keeping the same confidence level, we can increase the sample size, since solving ME = z_{α/2}×√(p(1−p)/n) for n gives n = (z_{α/2})²×p×(1−p)/(ME)².
32
Example revisited
If we want to reduce the margin of error to 4%, at least how many undergraduate students should we include in the survey?
The formula is: n = (z_{α/2})²×p×(1−p)/(ME)²
Q: What is the value of p? There are two cases.
If no information about p is given, we use p = 0.5 as a conservative guess. In our exercise, if nothing about p is known: n = (1.96)²×0.5×(1−0.5)/(4%)² = 600.25. So we need 601 subjects to reduce the margin of error to 4%.
If some information about p is known, use that information. Here we use the information from the sample, p = 0.172: n = (1.96)²×0.172×(1−0.172)/(4%)² = 341.94. Therefore, we need 342 subjects in the survey.
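Both cases of the sample-size formula can be sketched with one function. This is an illustration only: the name `sample_size_for_proportion` is made up, and the default p = 0.5 encodes the conservative guess used when nothing about p is known. The code uses the unrounded critical value rather than 1.96, which does not change the rounded-up answers here.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(me, conf_level, p=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= me."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)   # z_{alpha/2}
    return math.ceil(z ** 2 * p * (1 - p) / me ** 2)     # always round up

print(sample_size_for_proportion(0.04, 0.95))           # 601, conservative p = 0.5
print(sample_size_for_proportion(0.04, 0.95, p=0.172))  # 342, using the sample proportion
```

The conservative choice p = 0.5 maximizes p(1−p), so it never understates the required sample size.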
33
Summary
A larger sample size gives a smaller margin of error.
A larger confidence level gives a larger margin of error.
The level of confidence is the proportion of intervals, over repeated samples, that will contain the value of the population parameter.
As long as the conditions are satisfied, the confidence interval procedure will work.
34
Learning Goal
Construct a confidence interval for a proportion.
Interpret a confidence interval for a proportion.
Check the conditions for inference about a population proportion: independence (or a sample smaller than 10% of the population) and a large enough sample size (successes and failures are both greater than 10).
Explain the relationship between the margin of error, sample size, and confidence level.