Chapter 8 Statistical Inference: Confidence Intervals


1 Chapter 8 Statistical Inference: Confidence Intervals
Section 8.1 Point and Interval Estimates of Population Parameters

2 Point Estimate and Interval Estimate
A point estimate is a single number that is our “best guess” for the parameter. An interval estimate is an interval of numbers within which the parameter value is believed to fall. Figure 8.1 A point estimate predicts a parameter by a single number. An interval estimate is an interval of numbers that are believable values for the parameter. Question: Why is a point estimate alone not sufficiently informative?

3 Comparison of Distributions in Chap6 & Chap7

4 About 68% of all observations are within 1 SD (σ) of the mean (µ).
Review: All Normal curves N(µ, σ) share the 68-95-99.7 rule.
About 68% of all observations are within 1 SD (σ) of the mean (µ). Called: C = 68%, z* ≈ 1
About 95% of all observations are within 2σ of the mean µ. Called: C = 95%, z* ≈ 2
Almost all (99.7%) observations are within 3σ of the mean. Called: C = 99.7%, z* ≈ 3
Going to an example from the book on women’s heights: the mean here is 64.5 inches and the standard deviation is 2.5 inches. When we talk about the mean and standard deviation with respect to the curve instead of the actual sample, we use different notation: mu (µ) for the mean, sigma (σ) for the SD. If you consider the area under the curve to represent all of the individuals, then you can divide it into chunks to represent parts of the whole. For instance, if you divide it down the middle, half of the people are in each half. Here it is divided up into parts not through the middle but by lines that are 1, 2 or 3 standard deviations away from the mean. The center (pink) part is the area within 1 SD on either side of the mean. For normal curves, this area is 68% of the total. So if you know the mean and SD, you also know that 68% of women are between 62 and 67 inches tall. The same reasoning applies to the areas defined by lines drawn 2 or 3 SD from the mean.
We might want to know what percent of women are over 72 inches tall. That is 3 SD above the mean. We can see that 99.7 percent of women are between 57 and 72 inches tall, so 0.3 percent of women are really tall or really short. Since the distribution is symmetric, we can divide by two to find the percent of women that are really tall: 0.15%. You need to be able to work problems like this one; there are a bunch in the book.
But what if you want to know something not defined by the SD? For example, what percentage of women are taller than 68 inches? We know that half are shorter than 64.5 inches, and that half of the middle area, 34%, are between 64.5 and 67 inches, so 50% + 34% = 84% are shorter than 67 inches, or 16% are taller than 67 inches. But you want to know the proportion taller than 68 inches.
You can look this up in a table, but first you have to do something called standardizing. The reason is that although all normal curves share the properties shown above, they differ by their mean and standard deviation, so you would need a different table for every curve. When you standardize a normal distribution, by computing z = (x − µ)/σ, you change it so the mean is 0 and the SD is 1. Any normal distribution can be standardized. The result is the Standard Normal Distribution N(0, 1).
Reminder: µ (mu) is the mean of the idealized curve, while x̄ is the mean of a sample. σ (sigma) is the standard deviation of the idealized curve, while s is the SD of a sample.
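The standardizing step in the notes can be sketched in a few lines of Python. This is only an illustration: it assumes the N(64.5, 2.5) heights model from the example, and it uses the standard library's `NormalDist` in place of a printed table. The function name `proportion_taller_than` is made up for this sketch.

```python
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5  # women's heights in inches, taken from the example

def proportion_taller_than(height, mu=MU, sigma=SIGMA):
    """Return P(X > height) for X ~ N(mu, sigma)."""
    z = (height - mu) / sigma          # standardize: number of SDs above the mean
    return 1.0 - NormalDist().cdf(z)   # area to the right under N(0, 1)

z_68 = (68 - MU) / SIGMA
p_68 = proportion_taller_than(68)
print(f"z = {z_68:.2f}, P(height > 68) = {p_68:.4f}")
```

Calling `proportion_taller_than(67)` gives roughly the 16% worked out by hand from the 68% rule.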

5 Confidence levels z*: Critical Value
Confidence intervals contain the population mean µ in C% of samples. Different areas under the curve give different confidence levels C. The critical value z* is related to the chosen confidence level C: C is the area under the standard normal curve between −z* and z*. The confidence interval is thus: x̄ ± z*·σ/√n. Example: For an 80% confidence level C, 80% of the normal curve’s area is contained in the interval between −z* and z*.

6 Point estimation versus interval
When the population mean (µ) is unknown, it is better to use an interval than a point to estimate it. The theory behind interval estimation looks at the sampling distribution of the statistic. A level C confidence interval (CI) for the population mean µ is: x̄ ± z*·σ/√n. For a particular confidence level C, the appropriate z* value is given in the last row of Table D. Example: For a 98% confidence level, z* = 2.326.

7 Specific Confidence Intervals for population mean
99% CI for the population mean µ is: x̄ ± 2.576·σ/√n, i.e.: C = 99%, z* = 2.576
95% CI for the population mean µ is: x̄ ± 1.960·σ/√n, i.e.: C = 95%, z* = 1.960
90% CI for the population mean µ is: x̄ ± 1.645·σ/√n, i.e.: C = 90%, z* = 1.645
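The critical values quoted on these slides can be recovered without a table. The sketch below assumes only that C is the central area under N(0, 1), so each tail holds (1 − C)/2 and z* is the (1 + C)/2 quantile. The helper name `z_critical` is made up here.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z* such that the area between -z* and z* under N(0, 1) equals `confidence`."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A central area C leaves (1 - C)/2 in each tail, so z* is the (1 + C)/2 quantile.
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)

for c in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"C = {c:.0%}: z* = {z_critical(c):.3f}")
```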

8 Link between confidence level and margin of error
The margin of error (MOE) depends on z*. Higher confidence C implies a larger margin of error m (thus less precision in our estimates). A lower confidence level C produces a smaller margin of error m (thus better precision in our estimates).

9 Example 1
The average lifetime of 36 randomly selected TVs of a certain brand is 20 years. Suppose the SD of all TVs is 2 years. Construct a 95% CI for the average lifetime of all TVs from this brand. Find the corresponding MOE. MOE = 1.96 × 2/√36 ≈ 0.65 years. A 95% CI for the average lifetime of all TVs from this brand is: (19.35, 20.65)
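As a check on Example 1, here is a minimal sketch of the z-interval x̄ ± z*·σ/√n for a mean with known σ. The function name `z_interval` and its argument names are illustrative, not from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a mean when sigma is known."""
    z_star = NormalDist().inv_cdf((1.0 + confidence) / 2.0)  # critical value z*
    moe = z_star * sigma / sqrt(n)                           # margin of error
    return sample_mean - moe, sample_mean + moe, moe

# Example 1: 36 TVs, sample mean 20 years, population SD 2 years, 95% confidence.
lower, upper, moe = z_interval(sample_mean=20, sigma=2, n=36, confidence=0.95)
print(f"MOE = {moe:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```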

10 Example 2
1. The average height of 100 randomly selected UNCW students is 5.9 feet. Suppose the SD of the heights of all students is 1.2 feet. Construct 99%, 95% and 90% CIs for the average height of all students. Find the corresponding MOE.
A 99% CI for the average height of all students is: (5.5904, 6.2096)
A 95% CI for the average height of all students is: (5.6648, 6.1352)
A 90% CI for the average height of all students is: (5.702, 6.098)
(These bounds use z* rounded to 2.58, 1.96 and 1.65.)
Note: As the confidence level C gets smaller, the CI gets narrower.

11 Example 2 (Continued)
1. The average height of 100 randomly selected UNCW students is 5.9 feet. Suppose the SD of the heights of all students is 1.2 feet. Find the MOE and construct a 95% CI for the average height of all students.
Note: As the confidence level C gets smaller, the CI gets narrower.
A 95% CI for the average height of all students is: (5.6648, 6.1352)
2. (Continued) Randomly select another set of 100 UNCW students. The average height of this second set of 100 students is 5.5 feet. Suppose the SD of the heights of all students is 1.2 feet. Find the MOE and construct a 95% CI for the average height of all students.
A 95% CI for the average height of all students is: (5.2648, 5.7352)

12 Outlines for Z*
z* depends on the level of confidence C. What does “confidence” mean? This idea is only true for simple random samples and completely randomized experiments. Margin of error (MOE): z*·σ/√n

13 Understanding of Confidence Intervals
With 95% confidence, we can say that µ should be within roughly 2 standard deviations of x̄ (that is, 2·σ/√n) from our sample mean x̄. For about 95% of all possible samples of this size n, µ will indeed fall in our confidence interval. Only about 5% of samples would give a sample mean farther from µ.
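The "95% of all possible samples" reading can be illustrated with a small simulation. The sketch below is an illustration under assumptions: it draws samples from a normal population with the earlier heights parameters (µ = 64.5, σ = 2.5), builds a 95% z-interval from each, and counts how often the interval contains µ. The sample size, number of trials and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence=0.95, trials=2000, seed=0):
    """Fraction of simulated samples whose z-interval contains the true mean mu."""
    rng = random.Random(seed)                        # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
    moe = z_star * sigma / sqrt(n)                   # same margin of error for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - moe <= mu <= x_bar + moe:         # does this interval capture mu?
            hits += 1
    return hits / trials

rate = coverage(mu=64.5, sigma=2.5, n=36)
print(f"Intervals containing mu: {rate:.1%}")
```

The printed rate should land near 95%, with some run-to-run variation if the seed is changed.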

14 Summary to Confidence Interval
If the confidence level C gets larger and n stays the same, what will happen to z*, MOE, CI, and prediction precision? If z* and σ stay the same, what will happen to MOE and CI when n gets bigger?

15 Point Estimate and Interval Estimate

16 Confidence Interval A confidence interval is an interval containing the most believable values for a parameter. The probability that this method produces an interval that contains the parameter is called the confidence level. This is a number chosen to be close to 1, most commonly 0.95. Standard Error, or SE

17 The Logic behind Constructing a Confidence Interval
Fact: Approximately 95% of a normal distribution falls within 2 standard deviations of the mean. With probability 0.95, the sample proportion falls within about 1.96 standard errors of the population proportion. With probability 0.95, the sample average falls within about 1.96 standard errors of the population average. The distance of 1.96 standard errors is the margin of error in calculating a 95% confidence interval for the population proportion.

18 Margin of Error
The margin of error measures how accurate the point estimate is likely to be in estimating a parameter. It is a multiple of the standard error of the sampling distribution when the sampling distribution is a normal distribution. The distance of 1.96 standard errors is the margin of error for a 95% confidence interval for a parameter when the sampling distribution is normal.

19 Chapter 8 Statistical Inference: Confidence Intervals
Section 8.3 Constructing a Confidence Interval to Estimate a Population Mean

20 Inference for the mean of a population
So far, we have assumed that σ was known. If σ is unknown, we can use the sample standard deviation (s) to estimate σ. But this adds more variability to our test statistic and/or confidence interval (therefore, we will use the t-table). If σ is known, then σ/√n is known as the standard deviation of x̄. If σ is not known, then s/√n is known as the standard error of x̄. When σ is not known, we use the t-table (Table D) instead of the Normal Table A (the Z-table).

21 Confidence Interval for a Population Mean
The confidence interval again has the form: Point estimate ± margin of error

22 The t-distribution
When n is very large, s is a very good estimate of σ and the corresponding t distributions are very close to the normal distribution. The t distributions become wider for smaller sample sizes, reflecting the lack of precision in estimating σ from s. We need the degrees of freedom (df). In the “one sample problem with sample size n”, df = n − 1. As df increases, the t-distribution gets closer to the standard normal. The table for the t-distribution is Table D, or use tcdf(start, end, df).

23 Summary: Properties of the t-Distribution
The t-distribution is bell shaped and symmetric about 0. The probabilities depend on the degrees of freedom, df = n − 1. The t-distribution has thicker tails than the standard normal distribution, i.e., it is more spread out.

24 How to find the p-value for t-distribution with TI83?
Press [2nd] [VARS] and select [6:tcdf]. The format is tcdf(start, end, df), where df = degrees of freedom.
Left-tailed test (H1: μ < some number)
1. Let our test statistic be −2.05 and n = 16, so df = 15.
2. The p-value is the area to the left of −2.05, or P(t < −2.05).
3. The p-value is .0291; type tcdf(-E99, -2.05, 15) to get it.
Right-tailed test (H1: μ > some number)
1. Let our test statistic be 2.05 and n = 16, so df = 15.
2. The p-value is the area to the right of 2.05, or P(t > 2.05).
3. The p-value is .0291; type tcdf(2.05, E99, 15) to get it.
Two-tailed test (H1: μ ≠ some number)
1. Let our test statistic be −2.05 and n = 16, so df = 15.
2. The p-value is double the area to the left of −2.05, or 2*P(t < −2.05).
3. The p-value is .0582; type 2*tcdf(-E99, -2.05, 15) to get it.
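For readers without a TI calculator, the same tail areas can be approximated in plain Python. The sketch below does not reproduce the calculator's algorithm: it numerically integrates the Student t density with Simpson's rule, and the names `t_pdf` and `t_cdf` are illustrative helpers, not library functions.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps_per_unit=400):
    """P(T <= x): Simpson's rule on [0, |x|] combined with symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    n = 2 * int(a * steps_per_unit / 2 + 1)      # even number of subintervals
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0                       # area between 0 and |x|
    return 0.5 + area if x > 0 else 0.5 - area

df = 15                                          # n = 16, so df = n - 1
left_tail = t_cdf(-2.05, df)                     # like tcdf(-E99, -2.05, 15)
right_tail = 1.0 - t_cdf(2.05, df)               # like tcdf(2.05, E99, 15)
two_tail = 2.0 * left_tail                       # like 2*tcdf(-E99, -2.05, 15)
print(f"left = {left_tail:.4f}, right = {right_tail:.4f}, two-tailed = {two_tail:.4f}")
```

Because the t-distribution is symmetric about 0, the left-tail and right-tail areas agree, which is why the slide reports the same p-value for both one-sided tests.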

25 Confidence Interval when σ is unknown
When σ is unknown, the confidence interval is given as x̄ ± t*·s/√n. In order to find t*, we need to use Table D. E.g.: find the t critical value with confidence level 95% and df = 25. Key: t* = 2.060

26 Confidence Interval when σ is unknown
When σ is unknown, the confidence interval is given as x̄ ± t*·s/√n. In order to find t*, we need to use Table D:
1. Find the t critical value with confidence level 90% and df = 10.
2. Find the t critical value with confidence level 95% and df = 15.
3. Find the t critical value with confidence level 99% and df = 20.
Key: t* = 1.812; t* = 2.131; t* = 2.845.

27 Table D When σ is unknown, we use a t distribution with “n−1” degrees of freedom (df). Table D shows the z-values and t-values corresponding to landmark P-values/ confidence levels. When σ is known, we use the normal distribution and the standardized z-value.

28 To find the t-critical value
On a TI-84, press the [2nd] and [VARS] keys to get to the DISTR menu, then select the 4th choice, invT. The format for this command is invT(area of the left tail, df). For a confidence level C, the left-tail area is (1 + C)/2.
Exercise:
1. Find the t critical value with confidence level 95% and df = 25.
2. Find the t critical value with confidence level 90% and df = 15.
3. Find the t critical value with confidence level 99% and df = 20.
4. Find the t critical value with confidence level 99% and df = 50.
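A rough stand-in for invT can be built by inverting a numerically integrated t CDF with bisection. This is a sketch under assumptions, not the calculator's method. The helpers `t_pdf`, `t_cdf` and `inv_t` are hypothetical names defined here. The left-tail area passed in is (1 + C)/2 for confidence level C.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps_per_unit=400):
    """P(T <= x): Simpson's rule on [0, |x|] combined with symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    n = 2 * int(a * steps_per_unit / 2 + 1)      # even number of subintervals
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if x > 0 else 0.5 - area

def inv_t(left_area, df, tol=1e-7):
    """Return t with P(T <= t) = left_area, found by bisection."""
    if not 0.0 < left_area < 1.0:
        raise ValueError("left_area must be strictly between 0 and 1")
    if left_area < 0.5:                          # use symmetry for the lower half
        return -inv_t(1.0 - left_area, df, tol)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < left_area:             # grow the bracket until it contains t
        lo, hi = hi, hi * 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < left_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for conf, df in ((0.95, 25), (0.90, 15), (0.99, 20), (0.99, 50)):
    print(f"C = {conf:.0%}, df = {df}: t* = {inv_t((1 + conf) / 2, df):.3f}")
```

The first line of output can be compared with the Table D value t* = 2.060 quoted earlier for 95% confidence and df = 25.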

29 The Standard Normal Distribution is the t-Distribution with df = ∞
Table 8.5 Part of Table B Displaying t-Scores for Large df Values. The z-score of 1.96 is the t-score with right-tail probability of 0.025 and df = ∞.

30 Using the t Distribution to Construct a Confidence Interval for a Mean
Summary: 95% Confidence Interval for a Population Mean. When the standard deviation of the population is unknown, a 95% confidence interval for the population mean is: x̄ ± t.025·(se), where se = s/√n and df = n − 1. To use this method, you need: data obtained by randomization, and an approximately normal population distribution.

31 Example: Confidence intervals for µ
Ex2: A random sample of 16 school-age girls was selected. Their average time per weekday spent on housework is 14 minutes, with sample SD 8.6 minutes. Construct a 95% CI for the average time spent on housework by school-age girls in the nation.
Ex3: The average lifetime of 9 randomly selected TVs of a certain brand is 20 years, with sample SD 2 years. Construct a 99% CI for the average lifetime of all TVs from this brand.
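One way to work Ex2 and Ex3 is sketched below using x̄ ± t*·s/√n. The slides do not give the answers, so the intervals printed here are this sketch's own arithmetic. The value t* = 2.131 (95%, df = 15) is the Table D value quoted earlier in the slides; t* = 3.355 (99%, df = 8) is a standard t-table value supplied here as an assumption. The function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_star):
    """Return (lower, upper, margin_of_error) for a mean when sigma is unknown."""
    moe = t_star * s / sqrt(n)     # t* times the standard error s / sqrt(n)
    return sample_mean - moe, sample_mean + moe, moe

# Ex2: n = 16 (df = 15), 95% confidence, t* = 2.131 from Table D.
ex2 = t_interval(sample_mean=14, s=8.6, n=16, t_star=2.131)
# Ex3: n = 9 (df = 8), 99% confidence, t* = 3.355 (assumed table value).
ex3 = t_interval(sample_mean=20, s=2, n=9, t_star=3.355)

print(f"Ex2: ({ex2[0]:.2f}, {ex2[1]:.2f}) minutes, MOE = {ex2[2]:.2f}")
print(f"Ex3: ({ex3[0]:.2f}, {ex3[1]:.2f}) years, MOE = {ex3[2]:.2f}")
```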

32 Chapter 8 Statistical Inference: Confidence Intervals
Section 8.2 Constructing a Confidence Interval to Estimate a Population Proportion

33 Sampling Distribution of p̂ -- Review
If data are obtained from an SRS and np > 10 and n(1 − p) > 10, then the sampling distribution of p̂ has the following form: p̂ is approximately normal with mean p and standard deviation √(p(1 − p)/n).

34 Large-sample confidence interval for p
The theory behind interval estimation looks at the sampling distribution of the statistic. The general form of the confidence interval for the population proportion p is: p̂ ± z*·√(p̂(1 − p̂)/n)

35 Large-sample confidence interval for p
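[Figure: standard normal curve with central area C between −z* and z*, and margin of error m on each side of the estimate.] C is the area under the standard normal curve between −z* and z*.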

36 Specific Confidence Intervals for population proportion
99% CI for the population proportion p is: p̂ ± 2.576·√(p̂(1 − p̂)/n)
95% CI for the population proportion p is: p̂ ± 1.960·√(p̂(1 − p̂)/n)
90% CI for the population proportion p is: p̂ ± 1.645·√(p̂(1 − p̂)/n)

37 Finding the 95% Confidence Interval for a Population Proportion
We symbolize a population proportion by p. The point estimate of the population proportion is the sample proportion. We symbolize the sample proportion by p̂, called “p-hat”. A 95% confidence interval uses a margin of error = 1.96(standard error). CI = [point estimate ± margin of error] = p̂ ± 1.96(standard error)

38 Example: Paying Higher Prices to Protect the Environment
In 2010, the GSS asked subjects if they would be willing to pay much higher prices to protect the environment. Of n = 1,361 respondents, 637 were willing to do so. Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey.

39 Example: Paying Higher Prices to Protect the Environment
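“Of n = 1,361 respondents, 637 were willing to do so.” The sample proportion is p̂ = 637/1361 ≈ 0.468. The standard error of the sample proportion = √(p̂(1 − p̂)/n) = √(0.468 × 0.532/1361) ≈ 0.0135. Using this se, a 95% confidence interval for the population proportion is p̂ ± 1.96(se) = 0.468 ± 1.96(0.0135), or approximately (0.441, 0.494). We are 95% confident that the population proportion is in (0.441, 0.494).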
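The GSS calculation can be reproduced with a short sketch of the large-sample interval p̂ ± z*·√(p̂(1 − p̂)/n). The function name `proportion_interval` is made up for illustration, and the check for at least 15 successes and 15 failures reflects the validity guideline stated in these slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (p_hat, se, lower, upper) for a large-sample CI for a proportion."""
    failures = n - successes
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)                   # standard error of p_hat
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    moe = z_star * se
    return p_hat, se, p_hat - moe, p_hat + moe

# GSS example: 637 of 1,361 respondents willing to pay much higher prices.
p_hat, se, lower, upper = proportion_interval(637, 1361, confidence=0.95)
print(f"p_hat = {p_hat:.3f}, se = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The endpoints agree with the slide's interval up to rounding of the intermediate values.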

40 Sample Size Needed for Validity of Confidence Interval for a Proportion
For the 95% confidence interval for a proportion p to be valid, you should have at least 15 successes and 15 failures: n·p̂ ≥ 15 and n(1 − p̂) ≥ 15.

41 How Can We Use Confidence Levels Other than 95%?
“95% confidence” means that there’s a 95% chance that a sample proportion value occurs such that the confidence interval contains the unknown value of the population proportion, p. With probability 0.05, however, the method produces a confidence interval that misses p, and the inference is incorrect. Margin of error or MOE

42 Example 2
Among 80 mice inoculated with a serum, 60 are protected from a certain disease. Q: Construct a 95% CI for the percentage of inoculated mice that are protected from the disease. A 95% CI is: (0.655, 0.845)

43 Why 95% use z=1.96 and what z-scores shall we use for other confidence levels?
It is because 95% of the probability lies within 1.96 se of the population average, so z = 1.96 is the 97.5th percentile of N(0,1). For a 90% CI, we have 90% of the probability in the center and 10% split between the two tails, and thus we are looking for the (90 + 10/2) = 95th percentile of N(0,1). What about the z-score for a 99% CI?

44 What is the Error Probability for the Confidence Interval Method?
The general formula for the confidence interval for a population proportion is: Sample Proportion ± (z-score)(std. error), which in symbols is p̂ ± z(se), with se = √(p̂(1 − p̂)/n). Table 8.2 z-Scores for the Most Common Confidence Levels. The large-sample confidence interval for the population proportion is p̂ ± z(se).

45 Example 3
Among 200 cars randomly selected in the city, 60 fail the inspection. Q: Construct 99%, 95% and 90% CIs for the percentage of cars that fail the inspection.
A 99% CI is: (0.216, 0.384)
A 95% CI is: (0.236, 0.364)
A 90% CI is: (0.247, 0.353)

46 How Can We Use Confidence Levels Other than 95%?
In practice, the confidence level 0.95 is the most common choice. But, some applications require greater (or less) confidence. To increase the chance of a correct inference, we can use a larger confidence level, such as 0.99. Figure 8.4 A 99% Confidence Interval Is Wider Than a 95% Confidence Interval. Question: If you want greater confidence, why would you expect a wider interval?

47 Effects of Confidence Level and Sample Size on Margin of Error
The margin of error for a confidence interval:
Increases as the confidence level increases
Decreases as the sample size increases
For instance, a 99% confidence interval is wider than a 95% confidence interval, and a confidence interval with 200 observations is narrower than one with 100 observations at the same confidence level. These properties apply to all confidence intervals, not just the one for the population proportion.

48 Chapter 8 Statistical Inference: Confidence Intervals
Section 8.4 Choosing the Sample Size for a Study

49 Sample Size for Estimating a Population Proportion
To determine the sample size, first, we must decide on the desired margin of error. second, we must choose the confidence level for achieving that margin of error. In practice, 95% confidence intervals are most common.

50 Summary: Sample Size for Estimating a Population Proportion
The random sample size n for which a confidence interval for a population proportion p has margin of error MOE (such as MOE = 0.04) can be solved by using n = p̂(1 − p̂)·z²/MOE². The z-score is based on the confidence level, such as z = 1.96 for 95% confidence. You may guess the population proportion p by using the sample proportion p̂.

51 Example1: Sample Size For Exit Poll
An election is expected to be close (50/50). How large should the sample size be for the margin of error of a 95% confidence interval to equal 0.02? MOE = 0.02, 95% indicates z = invnorm(.975) = 1.96, and therefore: n = (0.5)(0.5)(1.96)²/(0.02)² = 2401.

52 Example2: sample size calculation
A past survey of shoppers indicates that 63% of them use credit cards. In the next survey, how many shoppers need to be randomly selected in order to estimate the proportion to within 4% with 92% confidence? Solution: MOE = 0.04, z = invnorm(0.96), therefore…
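Both sample-size examples follow from n = p(1 − p)·z²/MOE², rounded up to the next whole respondent. The sketch below assumes that formula and uses the standard library's inverse normal in place of invnorm. The function name `sample_size_for_proportion` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, moe, confidence=0.95):
    """Smallest n whose large-sample CI for a proportion has margin of error <= moe."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # e.g. invnorm(.975) for 95%
    return ceil(p_guess * (1 - p_guess) * z ** 2 / moe ** 2)

exit_poll = sample_size_for_proportion(p_guess=0.50, moe=0.02, confidence=0.95)
shoppers = sample_size_for_proportion(p_guess=0.63, moe=0.04, confidence=0.92)
print(f"Exit poll: n = {exit_poll}; shoppers survey: n = {shoppers}")
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.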

53 Choosing the Sample Size for Estimating a Population Mean
The random sample size n for which a confidence interval for a population mean has margin of error MOE is based on MOE = t*·s/√n, that is, n = (t*·s/MOE)², where the t critical value is based on the confidence level and the degrees of freedom.

54 Example 1
Weights of women in one age group are normally distributed with a standard deviation of 20 lb. A researcher wishes to estimate the mean weight of all women in this age group. Q: Find how large a sample must be drawn in order to be 90% confident that the sample mean will not differ from the population mean by more than 3.5 lb.

55 Example 2
Weights of women in one age group are normally distributed with a standard deviation of 20 lb. A researcher wishes to estimate the mean weight of all women in this age group. Q: Find how large a sample must be drawn in order to be 88% confident that the sample mean will not differ from the population mean by more than 5 lb.
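The two weight examples can be sketched as follows. One assumption to flag: because the population standard deviation (20 lb) is given, this sketch uses the normal critical value z* and n = (z*·σ/MOE)² rather than a t critical value, which would need iteration because df depends on n. The function name `sample_size_for_mean` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, moe, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= moe, assuming sigma is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z * sigma / moe) ** 2)

n_example_1 = sample_size_for_mean(sigma=20, moe=3.5, confidence=0.90)
n_example_2 = sample_size_for_mean(sigma=20, moe=5, confidence=0.88)
print(f"Example 1: n = {n_example_1}; Example 2: n = {n_example_2}")
```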

