Chapter 11 Problems of Estimation 11.1 Estimation of means 11.2 Estimation of means (unknown variance) 11.3 Skip 11.4 Estimation of proportions
11.1 The Estimation of Means How do we estimate the population mean μ and standard deviation σ from sample data x1, x2, …, xn? We usually use the sample mean $\bar{x}$ to estimate μ and the sample standard deviation s to estimate σ. $\bar{x}$ and s are called point estimates.
Point estimate of the mean For a given sample, the sample mean $\bar{x}$, which is the point estimate of the population mean, is a single number. Since sample means fluctuate from sample to sample, we must expect an error $\bar{x} - \mu$. A point estimate alone does not tell us about the possible size of the error.
Interval Estimate: Confidence Intervals An interval estimate consists of an interval which will contain the quantity it is supposed to estimate with a specified probability (or degree of confidence). Recall that for large random samples from infinite populations, the sampling distribution of the mean is approximately a normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. So we will use some properties of the normal distribution to explain a confidence interval.
For a standard normal curve Define $z_{\alpha/2}$ to be such that $P(Z > z_{\alpha/2}) = \alpha/2$. Hence the area under the standard normal curve between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is equal to $1-\alpha$.
[Figure: standard normal curve with $-z_{\alpha/2}$ and $z_{\alpha/2}$ marked]
1-α        0.80    0.90    0.95    0.98    0.99
α/2        0.10    0.05    0.025   0.010   0.005
z_{α/2}    1.282   1.645   1.96    2.326   2.576
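The critical values in the table can be reproduced numerically. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption made for illustration, not something from the slides.

```python
from statistics import NormalDist

def z_critical(alpha: float) -> float:
    """Return z_{alpha/2}, the point with upper-tail area alpha/2 under N(0, 1)."""
    # The area to the left of z_{alpha/2} is 1 - alpha/2, so invert the CDF there.
    return NormalDist().inv_cdf(1 - alpha / 2)

# Print one row per confidence level 1 - alpha used in the table.
for conf in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"1-alpha = {conf:.2f}   z = {z_critical(1 - conf):.3f}")
```

Rounded to three decimals, the printed values should agree with the tabulated ones.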
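For X normal with mean μ and standard deviation σ, the distribution of $\bar{x}$ is normal with mean μ and standard deviation $\sigma/\sqrt{n}$. With probability 1-α, $\bar{x}$ deviates from μ by no more than $E = z_{\alpha/2}\,\sigma/\sqrt{n}$. This is called the maximum error of estimate with probability 1-α.
For X normal with mean μ and standard deviation σ, the probability is 0.95 that $\bar{x}$ will differ from μ by at most $1.96\,\sigma/\sqrt{n}$; that is, $\bar{x}$ will be "off" either way by at most 1.96 standard errors of the mean. [Figure: distribution of $\bar{x}$ centered at μ, central area 0.95, total tail area 0.05]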
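Maximum error E with probability 1-α With probability 0.95, $\bar{x}$ deviates from μ by no more than $1.96\,\sigma/\sqrt{n}$ (approximately 2 standard errors away from the true value).
Probability    Maximum error E
0.80           $1.282\,\sigma/\sqrt{n}$
0.90           $1.645\,\sigma/\sqrt{n}$
0.95           $1.96\,\sigma/\sqrt{n}$
0.99           $2.576\,\sigma/\sqrt{n}$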
Maximum error E with probability 1-α: $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ The maximum error depends on both the confidence level and the sample size! You can therefore determine the sample size from the confidence level and the maximum error.
Sample size for estimating μ How large must our sample be to keep our error no more than E with probability 1-α? Solving $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ for n gives $n = \left(z_{\alpha/2}\,\sigma/E\right)^2$. As σ² increases, n increases. As E decreases, n increases. As our error probability α decreases, n increases.
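The sample-size rule $n = (z_{\alpha/2}\,\sigma/E)^2$, rounded up, can be sketched in a few lines. The function name `sample_size_for_mean` and the numeric inputs below are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma: float, max_error: float, confidence: float = 0.95) -> int:
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Always round up: rounding down would let the error exceed max_error.
    return math.ceil((z * sigma / max_error) ** 2)

# Illustrative inputs: each change below should increase the required n.
base = sample_size_for_mean(sigma=0.10, max_error=0.01)
print(base)
print(sample_size_for_mean(sigma=0.20, max_error=0.01))                   # larger sigma
print(sample_size_for_mean(sigma=0.10, max_error=0.005))                  # smaller E
print(sample_size_for_mean(sigma=0.10, max_error=0.01, confidence=0.99))  # smaller alpha
```

The three comparison calls mirror the three monotonicity statements on the slide.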
Confidence Interval for Means After computing the sample mean $\bar{x}$, find a range of values such that 95% of the time the resulting range includes the true value μ: $\bar{x} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{x} + 1.96\,\sigma/\sqrt{n}$.
Degree of Confidence The degree of confidence states the probability that the interval will give a correct answer. If you use 95% confidence intervals often, in the long run 95% of your intervals will contain the true parameter value. When the method is applied once, you do not know whether your interval contains the true value (which happens 95% of the time) or not (5% of the time).
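Example 11.1 Suppose we measure the specific gravity of a metal, and σ=0.025. Send each of you into the lab to take n=25 measurements: the standard error of the mean is $\sigma/\sqrt{n} = 0.025/\sqrt{25} = 0.005$.
Example 11.1 95% CI for the mean: $\bar{x} \pm 1.96\,(0.025/\sqrt{25}) = \bar{x} \pm 0.0098$. If the true value is 2, then about 95% of students will find this is true: $\bar{x} - 0.0098 < 2 < \bar{x} + 0.0098$.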
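The long-run reading of "about 95% of students" can be checked by simulation. This is a rough sketch, assuming the measurements are normally distributed around a true value of 2; the function name `coverage`, the number of simulated students, and the random seed are all illustrative choices.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu: float, sigma: float, n: int, trials: int,
             confidence: float = 0.95, seed: int = 0) -> float:
    """Fraction of simulated samples whose z-interval contains mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5  # maximum error E
    hits = 0
    for _ in range(trials):
        # One "student": n measurements, then the sample mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval xbar +/- E contains mu exactly when |xbar - mu| <= E.
        if abs(xbar - mu) <= half_width:
            hits += 1
    return hits / trials

print(coverage(mu=2.0, sigma=0.025, n=25, trials=2000))
```

The printed fraction should land near 0.95, though any single run will differ slightly from that value.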
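Confidence Intervals 100(1-α)% CI: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
80%    $\bar{x} \pm 1.282\,\sigma/\sqrt{n}$
90%    $\bar{x} \pm 1.645\,\sigma/\sqrt{n}$
95%    $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$
99%    $\bar{x} \pm 2.576\,\sigma/\sqrt{n}$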
Example 11.2 X = breaking strength of a fish line. σ=0.10. In a random sample of size n=10, the sample mean is $\bar{x} = 10.3$. Find a 95% confidence interval for μ, the true average breaking strength.
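Solution: Standard error of the mean: $\sigma/\sqrt{n} = 0.10/\sqrt{10} \approx 0.0316$. Critical value = 1.96; maximum error is $E = 1.96 \times 0.0316 \approx 0.062$. CI: $\bar{x} \pm E$, from 10.24 to 10.36.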
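The same calculation can be written as a small helper. This is a sketch only: the name `z_interval` is hypothetical, and the sample mean 10.3 is an assumption inferred as the midpoint of the stated interval (10.24, 10.36).

```python
import math
from statistics import NormalDist

def z_interval(xbar: float, sigma: float, n: int, confidence: float = 0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    e = z * sigma / math.sqrt(n)                        # maximum error E
    return xbar - e, xbar + e

# xbar = 10.3 is assumed here (midpoint of the interval given in the solution).
low, high = z_interval(xbar=10.3, sigma=0.10, n=10)
print(f"{low:.2f} to {high:.2f}")
```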
Example 11.2 (continued) How large a sample size is needed in order to get a maximum error of no more than 0.01 with 95% probability if the sample mean is used to estimate the true mean? Solution: $n = (1.96 \times 0.10/0.01)^2 = 19.6^2 = 384.16$, so n=385. Always round up!
11.2 Estimation of Means (unknown variance) A sample of size n: x1, x2, …, xn from a normal population with mean μ and standard deviation σ. If σ is known, with probability 1-α, $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.
If σ is unknown Estimate σ by the sample standard deviation s. The estimated standard error of the mean will be $s/\sqrt{n}$. Using the estimated standard error we have a confidence interval of $\bar{x} \pm (\text{multiplier}) \times s/\sqrt{n}$. The multiplier needs to be bigger than $z_{\alpha/2}$ (e.g., 1.96). The confidence interval needs to be wider to take into account the added uncertainty in using s to estimate σ. The correct multipliers were figured out by a Guinness Brewery worker.
What is the correct multiplier? "t" The 100(1-α)% confidence interval when σ is unknown is $\bar{x} \pm t_{\alpha/2}\,s/\sqrt{n}$. The 95% CI = 100(1-0.05)% confidence interval when σ is unknown is $\bar{x} \pm t_{0.025}\,s/\sqrt{n}$.
Properties of t distribution The value of $t_{\alpha/2}$ depends on how much information we have about σ. The amount of information we have about σ depends on the sample size. This amount of information is measured by the "degrees of freedom", and for a sample from one normal population this will be: df=n-1.
t curve and z curve Both the standard normal curve N(0,1) (the z distribution) and all t(k) distributions are density curves, symmetric about a mean of 0, but t distributions have more probability in the tails. You can verify this for yourself by comparing values from Table B with those on the n=infinity line of Table C. As the sample size increases, the extra tail probability decreases and the t distribution more closely approximates the z distribution. By n = 1000 they are virtually indistinguishable from one another.
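Critical values of t distribution The t table is given in the book (p. 497). The critical value depends on the degrees of freedom as well as on the tail area alpha.
df    alpha    t
5     0.10     1.476
10    0.05     1.812
20    0.01     2.528
25    0.025    2.060
Areas under the curve The area under the t curve between $-t_{\alpha/2}$ and $t_{\alpha/2}$ is $1-\alpha$.
Confidence interval for the mean when σ is unknown With probability 1-α, $\bar{x} - t_{\alpha/2}\,s/\sqrt{n} < \mu < \bar{x} + t_{\alpha/2}\,s/\sqrt{n}$. Maximum error: $E = t_{\alpha/2}\,s/\sqrt{n}$.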
Example (ex. 11.16, p 273) Noise level, n=12 74.0 78.6 76.8 75.5 73.8 75.6 77.3 75.8 73.9 70.2 81.0 73.9 Point estimate for the average noise level of vacuum cleaners; 95% Confidence interval
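Solution n=12, $\bar{x} \approx 75.53$, $s \approx 2.75$. Critical value with df=11: $t_{0.025} = 2.201$. 95% CI: $75.53 \pm 2.201 \times 2.75/\sqrt{12} \approx 75.53 \pm 1.75$, from about 73.79 to 77.28.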
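The noise-level interval can be recomputed directly from the twelve readings. In this sketch the critical value 2.201 (t with 11 degrees of freedom, upper-tail area 0.025) is taken from a standard t table rather than computed, since the Python standard library has no t distribution; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

noise = [74.0, 78.6, 76.8, 75.5, 73.8, 75.6, 77.3, 75.8, 73.9, 70.2, 81.0, 73.9]

# Assumption: t_{0.025} for df = 11, looked up in a t table (not computed here).
T_CRIT = 2.201

n = len(noise)
xbar = mean(noise)             # point estimate of the mean noise level
s = stdev(noise)               # sample standard deviation (divides by n - 1)
e = T_CRIT * s / math.sqrt(n)  # maximum error with 95% confidence

print(f"xbar = {xbar:.2f}, s = {s:.2f}, E = {e:.2f}")
print(f"95% CI: {xbar - e:.2f} to {xbar + e:.2f}")
```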
11.4 The Estimation of Proportions Notation: μ, σ: mean and standard deviation; p: proportion = probability of a success. Consider count data: n = number of trials, p = probability of a success.
Estimate of p Xi = 0 or 1, with probability 1-p or p respectively. Mean of Xi = p: the population mean. X = sum of the Xi = number of successes. The sample proportion (sample mean) $\hat{p} = X/n$ estimates p.
Example 11.4 Toss a coin 100 times and you get 45 heads. Estimate p = probability of getting a head. Solution: $\hat{p} = 45/100 = 0.45$. Is the coin balanced?
Estimate of p If np≥5 and n(1-p)≥5, then $\hat{p}$ is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$.
Maximum error We have (1-α)100% confidence that the error in our estimate is at most $E = z_{\alpha/2}\sqrt{p(1-p)/n}$ (the worst case is p=1/2).
CI An approximate 100(1-α)% confidence interval for p is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$.
Sample Size The sample size required to have probability 1-α that our error is no more than E is $n = p(1-p)\left(z_{\alpha/2}/E\right)^2$. Since p is unknown, you have to estimate it in the formula.
Maximize p(1-p) to get the sample size If you don't have any prior information about p, then use the maximum p(1-p)=1/4, attained at p=1/2: $n = \frac{1}{4}\left(z_{\alpha/2}/E\right)^2$.
If you know p is somewhere … For example, if p ≤ 0.3, then maximum p(1-p)=0.3(1-0.3)=0.21; if p ≤ 0.4, then maximum p(1-p)=0.4(1-0.4)=0.24.
How to estimate the maximum Estimate p(1-p) by substituting for p the value in the known range that is closest to 0.5: for p in (0, 0.1), use p=0.1; for p in (0.3, 0.4), use p=0.4; for p in (0.6, 1.0), use p=0.6.
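The "closest to 0.5" rule amounts to clamping 0.5 into the known range. A minimal sketch, with the hypothetical helper name `worst_case_p`:

```python
def worst_case_p(low: float, high: float) -> float:
    """Value in [low, high] closest to 0.5, where p(1-p) is largest on that range."""
    # Clamp 0.5 into the interval: 0.5 itself if it lies inside,
    # otherwise the endpoint nearer to 0.5.
    return min(max(0.5, low), high)

for low, high in [(0.0, 0.1), (0.3, 0.4), (0.6, 1.0)]:
    p = worst_case_p(low, high)
    print(f"p in ({low}, {high}): use p = {p}, p(1-p) = {p * (1 - p):.2f}")
```

This works because p(1-p) increases up to p = 0.5 and decreases afterwards, so on any range its largest value sits at the point nearest 0.5.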
Example 11.4 (continued) 95% CI for p: $0.45 \pm 1.96\sqrt{0.45(0.55)/100} = 0.45 \pm 0.0975$, so 0.3525<p<0.5475 with 95% confidence.
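The coin-toss interval can be reproduced with a short function. This is a sketch of the large-sample (normal approximation) interval described above; the name `proportion_interval` is an assumption for illustration.

```python
import math
from statistics import NormalDist

def proportion_interval(successes: int, n: int, confidence: float = 0.95):
    """Approximate large-sample confidence interval for a proportion p."""
    p_hat = successes / n                           # sample proportion X/n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    e = z * math.sqrt(p_hat * (1 - p_hat) / n)      # maximum error, with p estimated by p_hat
    return p_hat - e, p_hat + e

low, high = proportion_interval(successes=45, n=100)
print(f"{low:.4f} < p < {high:.4f}")
```

Since 0.5 lies inside the resulting interval, these data do not rule out a balanced coin at the 95% level.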
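Example 11.5 (example 11.13 in text) A state highway department wants to estimate what proportion of all trucks operating between two cities carry too heavy a load. It wants to be able to assert with 95% probability that the error is no more than 0.04. What sample size is needed if p is known to be between 0.10 and 0.25? What if there is no idea what p is?
Solution With p between 0.10 and 0.25: E=0.04, p=0.25, so $n = 0.25(0.75)(1.96/0.04)^2 = 450.19$; round up to get n=451. With no idea what p is: E=0.04, p(1-p)=1/4, so $n = \frac{1}{4}(1.96/0.04)^2 = 600.25$; round up to get n=601.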
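Both sample sizes follow from the same formula with different planning values of p. A minimal sketch, assuming the hypothetical function name `sample_size_for_proportion` and a default of p = 0.5 for the no-information case:

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(max_error: float, p: float = 0.5,
                               confidence: float = 0.95) -> int:
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up so the error bound is still met.
    return math.ceil(p * (1 - p) * (z / max_error) ** 2)

# p known to lie between 0.10 and 0.25: plan with 0.25, the value closest to 0.5.
print(sample_size_for_proportion(0.04, p=0.25))
# No information about p: plan with the worst case p = 0.5.
print(sample_size_for_proportion(0.04))
```

Knowing even a rough range for p reduces the required sample noticeably compared with the worst case.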