Confidence Intervals Topics: Essentials; Inferential Statistics; Terminology; Margin of Error; \(z_{\alpha/2}\); Decision Grid; Examples: Large Sample, Small Sample (Student’s t Distribution), Proportion.
Essentials: Confidence Intervals (How sure we are.) Inferential statistics, precision, and the margin of error. Obtaining a confidence interval. \(z_{\alpha/2}\). Guinness, Gosset, and the Student’s t distribution. Confidence intervals for large and small samples, and for proportions.
Inferential Statistics (07/16/96): Inferential statistics uses sample data to make estimates, decisions, predictions, or other generalizations about the population. The aim of inferential statistics is to make an inference about a population based on a sample (as opposed to a census), AND to provide a measure of precision for the method used to make the inference. An inferential statement uses data from a sample and applies it to a population. In more understandable terms: we want to be able to make a statement about a group as a whole by examining just a portion of the group, and we want to be able to say just how “good” or accurate our statement is. Let’s look at some terminology that we will be using throughout the course:
Some Terminology. Estimation – is the process of estimating the value of a parameter from information obtained from a sample. Estimators – sample measures (statistics) that are used to estimate population measures (parameters). Recall: Parameter – a numerical characteristic of a population. We will first be looking at confidence intervals for mu. Estimation Examples: One out of four Americans is currently dieting. 72% of Americans have flown on commercial airlines. Two out of 100 college students say Statistics is their favorite subject.
Terminology (cont’d.) Point Estimate – is a specific numerical value estimate of a parameter. Interval Estimate – of a parameter is an interval or range of values used to estimate the parameter. It may or may not contain the actual value of the parameter being estimated. The best point estimate for mu, the population mean, is x-bar, the sample mean.
Terminology (cont’d.) Confidence Level – of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and the estimation process is repeated on each one. Confidence Interval – is a specific interval estimate of a parameter determined by using data obtained from a sample and by using a specific confidence level.
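One way to see what a confidence level means is to simulate it. The sketch below is only an illustration: the population values (mu = 50, sigma = 10), the sample size, the number of trials, and the function name `coverage` are all assumptions made for the demo, not values from the slides. It builds many 95% z-intervals from random samples and counts how often they contain mu.

```python
import random
from statistics import NormalDist, fmean

def coverage(confidence=0.95, mu=50.0, sigma=10.0, n=30, trials=2000, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5  # same margin for every sample, since sigma is known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(f"{rate:.3f} of the simulated intervals contained mu")
```

The printed fraction should land close to 0.95. Any single interval either contains mu or it does not; the 95% describes the method over many repeated samples.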
Margin of Error, E. The term \(z_{\alpha/2}\,\sigma/\sqrt{n}\) is called the maximum error of estimate or margin of error. It is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter. It is represented by a capital E.
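\(z_{\alpha/2}\): Areas in the Tails. Obtaining \(\alpha\): convert the confidence level to a decimal, e.g. 95% = .95. Then \(\alpha = 1 - .95 = .05\) and \(\alpha/2 = .025\). [Figure: standard normal curve with central area 95% and area .025 in each tail, bounded by -z (here -1.96) and z (here 1.96).] Show 90%, 98%, and 99%.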
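The critical values for the other confidence levels can be checked with a short sketch that uses Python's standard-library `statistics.NormalDist`. The helper name `z_critical` is an assumption made for illustration; in class the values would be read from a z-table.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value that leaves alpha/2 in each tail of the standard normal curve."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

To two decimals these agree with the usual table values of 1.96 for 95%, 2.33 for 98%, and 2.58 for 99%. For 90% the computed value is about 1.645, which tables report as either 1.645 or 1.65.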
Decision Grid. [Diagram: Confidence Interval for a Mean (depends on sample size and sigma); Confidence Interval for a Proportion.]
t or z? Is sigma known? If yes: use z-interval formula values no matter what the sample size is.* If no: is n greater than or equal to 30? If yes, use z-interval formula values and replace sigma in the formula with s (sample std. dev.). If no, use t-values and s in the formula.** *Variable must be normally distributed when n < 30. **Variable must be approximately normally distributed.
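The decision grid can be written out as a small function. This is a sketch only; the function name `choose_interval` and the returned labels are made up for illustration, and the normality conditions in the footnotes still have to be checked by hand.

```python
def choose_interval(sigma_known: bool, n: int) -> str:
    """Follow the t-or-z decision grid for a confidence interval for a mean."""
    if sigma_known:
        # z with sigma at any sample size (variable must be normal when n < 30)
        return "z with sigma"
    if n >= 30:
        # large sample: z values, with s standing in for sigma
        return "z with s"
    # small sample, sigma unknown (variable must be approximately normal)
    return "t with s"

print(choose_interval(sigma_known=False, n=30))
print(choose_interval(sigma_known=False, n=12))
```

The two calls correspond to the All Star attendance example (n = 30) and the airline pilot example (n = 12) used in these slides.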
Situation #1: Large Samples or Normally Distributed Small Samples. A population mean is unknown to us, and we wish to estimate it. The sample size is greater than or equal to 30, and the population standard deviation is known or unknown; OR the sample size is < 30, the population standard deviation is known, and the population is normally distributed. The sample is a simple random sample.
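Confidence Interval for mu (Situation #1). A \((1-\alpha)\cdot 100\%\) confidence interval for mu is given by \(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\). When sigma is unknown and n is greater than or equal to 30, s replaces sigma. EXPLAIN: \(1-\alpha\), \(z_{\alpha/2}\).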
Consider The mean paid attendance for a sample of 30 Major League All Star games was $46,970.87, with a standard deviation of $14,358.21. Find a 95% confidence interval for the mean paid attendance at all Major League All Star games.
95% Confidence Interval for the Mean Paid Attendance at the Major League All Star Games: \(46{,}970.87 \pm 1.96 \cdot 14{,}358.21/\sqrt{30} = 46{,}970.87 \pm 5{,}138.02\). “We can be 95% confident that mu, the mean paid attendance for all Major League All Star games, is between $41,832.85 and $52,108.89.”
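The arithmetic behind this interval can be reproduced with a minimal sketch. The function name `z_interval` is an assumption for illustration, and z is rounded to two decimals on purpose so the result mirrors a z-table lookup (1.96) rather than the unrounded value.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, s, n, confidence):
    """Large-sample interval: x_bar +/- z * s / sqrt(n)."""
    # Round z to two decimals to imitate reading it from a z-table.
    z = round(NormalDist().inv_cdf(1 - (1 - confidence) / 2), 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = z_interval(46970.87, 14358.21, 30, 0.95)
print(f"{low:,.2f} to {high:,.2f}")
```

With the unrounded z (about 1.959964) the endpoints shift by roughly ten cents, which is why hand calculations and software output can differ slightly.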
Minimum Sample Size Needed. For an interval estimate of the population mean, the minimum sample size is given by \(n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2\), where E is the margin of error (maximum error of estimate); round n up to the next whole number. Suppose we wish to estimate, with 95% confidence, the mean paid attendance at Major League All Star games, and we want to be within $5,000 of the actual amount.
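A sketch of this calculation follows. It assumes the sample standard deviation from the earlier All Star example (14,358.21) is used as the estimate of sigma, which the slide does not state explicitly; the function name `min_sample_size_mean` is also made up for illustration.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# Assumption: s = 14,358.21 from the earlier sample stands in for sigma.
n_needed = min_sample_size_mean(14358.21, 5000, 0.95)
print(n_needed)
```

Under that assumption the formula gives about 31.7, so at least 32 games would be needed. The result is always rounded up, never to the nearest integer.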
Situation #2: Small Samples. A population mean is unknown to us, and we wish to estimate it. The sample size is < 30, and the population standard deviation is unknown. The variable is normally or approximately normally distributed. The sample is a simple random sample. HOWEVER, the standard normal distribution (z) is no longer appropriate to use.
Student t Distribution Is bell-shaped. Is symmetric about the mean. The mean, median, and mode are equal to 0 and are located at the center of the distribution. Curve never touches the x-axis. Variance is greater than 1. As sample size increases, the t distribution approaches the standard normal distribution. Has n-1 degrees of freedom. n-1 is sample size minus 1.
Student t Distributions for n = 3 and n = 12. [Figure: Student t curves with n = 3 and with n = 12, plotted against the standard normal curve.] Student t distributions have the same general shape and symmetry as the standard normal distribution, but reflect a greater variability that is expected with small samples.
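The comparison in the figure can be approximated numerically. The sketch below evaluates the standard Student t density formula for n = 3 (2 degrees of freedom) and n = 12 (11 degrees of freedom) next to the standard normal density; the helper name `t_pdf` and the two evaluation points (0 and 3) are choices made for this illustration.

```python
import math
from statistics import NormalDist

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

normal = NormalDist()
for x in (0.0, 3.0):
    print(f"x = {x}: t(n=3) = {t_pdf(x, 2):.4f}, "
          f"t(n=12) = {t_pdf(x, 11):.4f}, normal = {normal.pdf(x):.4f}")
```

At the center the t curves sit lower than the normal curve, and out in the tail they sit higher. That extra tail area is the greater variability the slide describes, and it shrinks as n grows.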
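Confidence Interval for mu (Situation #2). A confidence interval for mu is given by \(\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}\), with n - 1 degrees of freedom. Demonstrate use of t-table.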
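For readers without a t-table at hand, table entries can be approximated using only the Python standard library. This is a rough numerical sketch, not how tables or statistical packages compute the values: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names, the step count, and the search range of 0 to 200 are all assumptions chosen for the demo.

```python
import math

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student t, via Simpson's rule on [0, |t|] plus symmetry."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: coef * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(confidence, df):
    """t value leaving alpha/2 in each tail, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 200.0  # assumed search range; wide enough for df >= 1 here
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"90%, df = 11: t = {t_critical(0.90, 11):.3f}")
print(f"95%, df = 11: t = {t_critical(0.95, 11):.3f}")
```

The first value should match the t-table entry for 90% confidence with 11 degrees of freedom, the row used when n = 12.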
Consider The mean salary of a sample of n=12 commercial airline pilots is $97,334, with a standard deviation of $17,747. Find a 90% confidence interval for the mean salary of all commercial airline pilots.
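90% Confidence Interval for the Mean Salary of Commercial Airline Pilots: with n - 1 = 11 degrees of freedom, \(97{,}334 \pm 1.796 \cdot 17{,}747/\sqrt{12} = 97{,}334 \pm 9{,}201.12\). “We can be 90% confident that mu, the mean salary of all commercial airline pilots, is between $88,132.88 and $106,535.12.”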
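The pilot salary interval can be checked with a few lines of Python. In this sketch the t value is passed in by hand, on the assumption that it was read from a t-table (1.796 for 90% confidence and 11 degrees of freedom); the names `t_interval` and `T_TABLE_90_DF11` are illustrative only.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_value):
    """Small-sample interval: x_bar +/- t * s / sqrt(n), with t from a table."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Assumption: table lookup for 90% confidence with n - 1 = 11 degrees of freedom.
T_TABLE_90_DF11 = 1.796

low, high = t_interval(97334, 17747, 12, T_TABLE_90_DF11)
print(f"{low:,.2f} to {high:,.2f}")
```

Had z = 1.645 been used instead, the interval would come out narrower and would overstate how precisely a sample of 12 pins down the mean.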
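Situation #3: Confidence Interval for a Proportion. A confidence interval for a population proportion p is given by \(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\), where \(\hat{p}\) is the sample proportion, \(\hat{q} = 1 - \hat{p}\), and n = sample size. np and nq must both be greater than or equal to 5.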
Consider: In a recent survey of 150 households, 54 had central air conditioning. Find the 90% confidence interval for the true proportion of households that have central air conditioning. Here \(\hat{p} = 54/150 = .36\) and \(\hat{q} = 1 - .36 = .64\). (NOTE: both np and nq > 5.)
90% C.I. = .36 ± .065, or 90% C.I. = (.295, .425), or .295 < p < .425. We can be 90% confident that the true proportion, p, of all homes having central air conditioning is between 29.5% and 42.5%.
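A short sketch of the proportion interval for the survey data is given here. The function name `proportion_interval` is an assumption for illustration, and the code uses the unrounded z value (about 1.645), so its endpoints differ from the slide's rounded figures by less than .001.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Interval p_hat +/- z * sqrt(p_hat * q_hat / n) for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("np and nq must both be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(54, 150, 0.90)
print(f"{low:.4f} < p < {high:.4f}")
```

The check on np and nq mirrors the condition stated with the formula. When it fails, the normal approximation behind this interval is not considered reliable.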
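Minimum Sample Size Needed. For an interval estimate of a population proportion, the minimum sample size is given by \(n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2\), where E is the maximum error of estimate (margin of error); round n up to the next whole number. If no approximation of p-hat is known, you should use p-hat = 0.5. Because the product of p-hat and q-hat is largest at 0.5, this gives the largest sample size the formula can require, so the sample will be large enough whatever the true proportion is.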
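The effect of the p-hat = 0.5 fallback can be seen in a small sketch. The margin of error used here (E = .05) is a hypothetical value chosen for illustration, since the slide gives no worked example; the function name `min_sample_size_proportion` is likewise an assumption.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_proportion(margin, confidence, p_hat=0.5):
    """Smallest n for the requested margin; p_hat defaults to the 0.5 worst case."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)

# Hypothetical target: 90% confidence, within .05 of the true proportion.
n_with_prior = min_sample_size_proportion(0.05, 0.90, p_hat=0.36)
n_no_prior = min_sample_size_proportion(0.05, 0.90)
print(n_with_prior, n_no_prior)
```

Using the earlier survey's p-hat of .36 as a prior estimate calls for 250 households, while the 0.5 fallback calls for 271. The fallback never asks for fewer observations than a prior estimate would.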
End of slides