Confidence Intervals: Confidence Interval for a Mean, Sample Size, Sigma, Confidence Interval for a Proportion. File Information: 25 Slides.
Inferential Statistics (07/16/96) Inferential statistics uses sample data to make estimates, decisions, predictions, or other generalizations about the population. The aim of inferential statistics is to make an inference about a population, based on a sample (as opposed to a census), AND to provide a measure of precision for the method used to make the inference. An inferential statement uses data from a sample and applies it to a population. In what may be more understandable terms: We want to be able to make a statement about a group as a whole, by examining just a portion of the group, and we want to be able to say just how “good” or accurate our statement is. Let’s look at some terminology that we will be using throughout the course:
Some Terminology Estimation – the process of estimating the value of a parameter from information obtained from a sample. Estimators – sample measures (statistics) that are used to estimate population measures (parameters). A good estimator should be unbiased, consistent, and relatively efficient. Recall: Parameter – a numerical characteristic of a population. We will first be looking at confidence intervals for mu. Estimation Examples: One out of four Americans is currently dieting. 72% of Americans have flown on commercial airlines. Two out of 100 college students say Statistics is their favorite subject.
Terminology (cont’d.) Point Estimate – is a specific numerical value estimate of a parameter. Interval Estimate – of a parameter is an interval or range of values used to estimate the parameter. It may or may not contain the actual value of the parameter being estimated. The best point estimate for mu, the population mean, is x-bar, the sample mean.
Terminology (cont’d.) Confidence Level – of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and the estimation process is repeated for each one. Confidence Interval – is a specific interval estimate of a parameter determined by using data obtained from a sample and by using a specific confidence level.
Situation #1: Large Samples or Normally Distributed Small Samples A population mean is unknown to us, and we wish to estimate it. Sample size is ≥ 30, and the population standard deviation is known or unknown. OR sample size is < 30, the population standard deviation is known, and the population is normally distributed. The sample is a simple random sample.
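Confidence Interval for $\mu$ (Situation #1) A confidence interval for $\mu$ is given by $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$. Here $1 - \alpha$ is the confidence level, and $z_{\alpha/2}$ is the z-value that leaves an area of $\alpha/2$ in the right tail of the standard normal distribution.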
Obtaining $\alpha$: Convert the Confidence Level to a decimal, e.g. 95% C.L. = .95. Then: $\alpha = 1 - .95 = .05$, and $\alpha/2 = .025$. $z_{\alpha/2}$: Areas in the Tails [Figure: standard normal curve with 95% in the center and .025 in each tail, cut off at -z (here -1.96) and z (here 1.96). Also show 90%, 98%, and 99%.]
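The conversion from a confidence level to a critical value can be sketched in a few lines of Python. This is only an illustration using the standard library's `statistics.NormalDist` in place of a printed z-table; the function name `z_critical` is made up for this example.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a confidence level given as a decimal (e.g. 0.95)."""
    alpha = 1 - confidence          # total area in the two tails
    # inv_cdf gives the z-value with area (1 - alpha/2) to its left,
    # i.e. area alpha/2 in the right tail
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

A z-table rounded to two decimals gives 1.96 for 95%; the computed value agrees to that precision.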
Maximum Error of the Estimate The term $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the maximum error of estimate or margin of error. It is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter.
Consider The mean paid attendance for a sample of 30 Major League All Star games was $46,970.87, with a standard deviation of $14,358.21. Find a 95% confidence interval for the mean paid attendance at all Major League All Star games.
95% Confidence Interval for the Mean Paid Attendance at the Major League All Star Games “We can be 95% confident that mu, the mean paid attendance for all Major League All Star games, is between $41,832.85 and $52,108.89.”
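As a check on the arithmetic, here is a minimal Python sketch of the Situation #1 interval applied to the All Star attendance figures. The function name `z_interval` is an assumption for illustration, and the sample standard deviation is used in place of sigma because n = 30.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence):
    """Confidence interval for a mean using z-values (Situation #1)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    margin = z * sd / sqrt(n)                 # maximum error of estimate
    return xbar - margin, xbar + margin

low, high = z_interval(46970.87, 14358.21, 30, 0.95)
print(f"{low:.2f} to {high:.2f}")
```

The result matches the interval on the slide to within about ten cents; the small gap comes from using the unrounded critical value rather than the table value 1.96.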
Minimum Sample Size Needed For an interval estimate of the population mean is given by $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$, where E is the maximum error of estimate (margin of error). If the result is not a whole number, round up to the next whole number. Suppose we wish to estimate, with 95% confidence, the mean paid attendance at Major League All Star games, and we want to be within $5,000 of the actual amount.
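The sample size question above can be worked through with a short Python sketch. It assumes that the sample standard deviation from the earlier attendance example, $14,358.21, is an acceptable stand-in for sigma; the slide does not say which value to use. The function name `sample_size_for_mean` is made up for this example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Minimum n so that the margin of error is at most `margin`."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # n = (z * sigma / E)^2, rounded up to the next whole number
    return ceil((z * sigma / margin) ** 2)

# Assumption: sigma is approximated by the earlier sample value 14358.21
n_needed = sample_size_for_mean(14358.21, 5000, 0.95)
print(n_needed)
```

Under that assumption the formula gives about 31.7, so at least 32 games would be needed.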
Situation #2: Small Samples A population mean is unknown to us, and we wish to estimate it. Sample size is < 30, and the population standard deviation is unknown. The variable is normally or approximately normally distributed. The sample is a simple random sample. HOWEVER, the standard normal distribution (z) is no longer appropriate to use.
Student t Distribution Is bell-shaped. Is symmetric about the mean. The mean, median, and mode are equal to 0 and are located at the center of the distribution. Curve never touches the x-axis. Variance is greater than 1. As sample size increases, the t distribution approaches the standard normal distribution. Has n-1 degrees of freedom. n-1 is sample size minus 1.
Student t Distributions for n = 3 and n = 12 [Figure: Student t distribution with n = 3, Student t distribution with n = 12, and the standard normal curve.] Student t distributions have the same general shape and symmetry as the standard normal distribution, but reflect the greater variability that is expected with small samples.
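Confidence Interval for $\mu$ (Situation #2) A confidence interval for $\mu$ is given by $\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ has n-1 degrees of freedom. Demonstrate use of t-table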
Consider The mean salary of a sample of n=12 commercial airline pilots is $97,334, with a standard deviation of $17,747. Find a 90% confidence interval for the mean salary of all commercial airline pilots.
90% Confidence Interval for the Mean Salary of Commercial Airline Pilots “We can be 90% confident that mu, the mean salary of all commercial airline pilots, is between $88,132.88 and $106,535.12.”
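The pilot salary interval can be reproduced with a small Python sketch. Python's standard library has no t-distribution quantile function, so this example takes the critical value as an input; 1.796 is the t-table entry for 11 degrees of freedom at 90% confidence. The function name `t_interval` is made up for illustration.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for a mean using a t critical value (Situation #2).

    t_crit must be looked up for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)   # maximum error of estimate
    return xbar - margin, xbar + margin

# n = 12 pilots, so 11 degrees of freedom; the t-table gives 1.796 for 90%
low, high = t_interval(97334, 17747, 12, 1.796)
print(f"{low:.2f} to {high:.2f}")
```

Passing the table value in explicitly keeps the sketch close to the hand calculation on the slide.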
t or z? Is $\sigma$ known?
Yes: Use z-values no matter what the sample size is.*
No: Is n greater than or equal to 30?
Yes: Use z-values and s in place of $\sigma$ in the formula.
No: Use t-values and s in the formula.**
*Variable must be normally distributed when n<30. **Variable must be approximately normally distributed.
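The decision chart can also be written as a small Python function. This is only a sketch of the rule as stated on the slide; the function name and the returned labels are made up for illustration, and the normality conditions in the footnotes still have to be checked separately.

```python
def choose_critical_value(sigma_known, n):
    """Follow the 't or z?' chart for a confidence interval for a mean."""
    if sigma_known:
        # z-values regardless of sample size (normality needed when n < 30)
        return "z with sigma"
    if n >= 30:
        # large sample: z-values, with s in place of sigma
        return "z with s"
    # small sample, sigma unknown: t-values with n - 1 degrees of freedom
    return "t with s"

print(choose_critical_value(False, 30))   # attendance example
print(choose_critical_value(False, 12))   # pilot salary example
```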
Situation #3: Confidence Interval for a Proportion Consider the following: A USA Today Snapshots feature stated that 12% of the pleasure boats in the United States were named Serenity. The parameter 12% is called a proportion. It means that of all pleasure boats in the United States, 12 out of every 100 are named Serenity.
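Confidence Interval for a Proportion p A confidence interval for a population proportion p is given by $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{p}$ is the sample proportion, $\hat{q} = 1 - \hat{p}$, and n = sample size. $n\hat{p}$ and $n\hat{q}$ must both be greater than or equal to 5.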
Consider In a recent survey of 150 households, 54 had central air conditioning. Find the 90% confidence interval for the true proportion of households that have central air conditioning. Here $\hat{p} = 54/150 = 0.36$ and $\hat{q} = 1 - 0.36 = 0.64$.
“We can be 90% confident that the true proportion, p, of all homes having central air conditioning is between 29.6% and 42.5%.”
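A minimal Python sketch of the proportion interval, applied to the central air conditioning survey, is shown here as a check on the arithmetic. The function name `proportion_interval` is an assumption for illustration, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Confidence interval for a population proportion (Situation #3)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # the normal approximation requires n*p_hat and n*q_hat to be at least 5
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat must both be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(54, 150, 0.90)
print(f"{low:.4f} to {high:.4f}")
```

The bounds come out near 0.2955 and 0.4245, which agrees with the percentages on the slide to within rounding.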
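Minimum Sample Size Needed For an interval estimate of a population proportion is given by $n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$, where E is the maximum error of estimate (margin of error). If the result is not a whole number, round up to the next whole number. If no approximation of p-hat is known, you should use p-hat = 0.5. This value makes $\hat{p}\hat{q}$ as large as possible, so the resulting sample size is large enough whatever the true proportion is.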
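The effect of the p-hat = 0.5 default can be seen in a short Python sketch. The numbers here are hypothetical and not from the slides: a 90% confidence level, a margin of error of 0.05, and a prior estimate of 0.36 borrowed from the air conditioning survey. The function name `sample_size_for_proportion` is also made up for this example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_hat=0.5):
    """Minimum n for estimating a proportion to within `margin`.

    p_hat defaults to 0.5, the conservative choice when no estimate exists.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = p_hat * q_hat * (z / E)^2, rounded up
    return ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)

# Hypothetical inputs: 90% confidence, within 5 percentage points
n_with_estimate = sample_size_for_proportion(0.05, 0.90, p_hat=0.36)
n_conservative = sample_size_for_proportion(0.05, 0.90)
print(n_with_estimate, n_conservative)
```

With a prior estimate the required sample is smaller; without one, the default of 0.5 gives the larger, safer figure.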
Confidence and Precision The length of a confidence interval is the difference between the upper bound and lower bound of the interval. The maximum error of estimate (margin of error) is equal to one half the length of the confidence interval. A shorter interval is a more precise interval.