1
Uncertainty and confidence
Although the sample mean, x̄, is a unique number for any particular sample, a different sample will probably give a different sample mean. In fact, you could get many different values for the sample mean, and virtually none of them would exactly equal the true population mean, µ.
2
But the sampling distribution of x̄ is narrower than the population distribution, by a factor of √n.
Thus, the estimates gained from our samples tend to be relatively close to the population parameter µ.
[Figure: the distribution of sample means (n subjects per sample) drawn over the wider population distribution of individual subjects, with µ marked on the axis.]
If the population is normally distributed, N(µ, σ), then the sampling distribution of x̄ is N(µ, σ/√n). And recall the Central Limit Theorem: even when the population is not normally distributed, x̄ is approximately normal for large n.
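The σ/√n claim can be checked by simulation. The sketch below is in Python rather than the R used in class, and the population values (µ = 50, σ = 10, n = 25) are made up for illustration. It draws many samples, computes each sample mean, and compares the spread of those means with σ/√n.

```python
import random
import statistics

def sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma); return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Hypothetical population values, chosen only for illustration.
mu, sigma, n = 50.0, 10.0, 25
means = sample_means(mu, sigma, n, reps=4000)

observed_sd = statistics.stdev(means)   # spread of the simulated sample means
theoretical_sd = sigma / n ** 0.5       # sigma / sqrt(n) = 2.0 here

print(f"sd of sample means: {observed_sd:.2f} (theory: {theoretical_sd:.2f})")
```

The two numbers should agree closely, and the individual sample means differ from one another even though every sample comes from the same population.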
3
So we'll use this information to make inferences, i.e., draw conclusions about populations from the data in our samples.
We'll consider two types:
- Confidence interval estimation
- Tests of significance
In both cases, we'll treat our data either as a random sample from a population or as data from a randomized experiment.
Start with estimation. There are two situations we'll consider:
- estimating the mean µ of a population of measurements
- estimating the proportion p of successes (S) in a population of successes and failures (S's and F's)
4
In either case, we'll construct a confidence interval of the form
estimate ± M.O.E.,
where M.O.E. = margin of error of the estimator. The MOE tells us how good the estimate is in two ways: through the variation in the estimator (its standard error) and through the level of confidence in the interval (a tabulated value). The standard error of an estimator is its estimated standard deviation (treating the estimator as a statistic with a sampling distribution).
- The best estimator of µ is x̄, and we know from the previous notes that x̄ is approximately N(µ, σ/√n).
- The best estimator of p is p̂, and we know that p̂ is approximately N(p, √(p(1−p)/n)).
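As a small illustration of the standard errors behind the MOE, here is a Python sketch (not the R used in class). The function names are made up for this example. It uses the usual formulas: σ/√n for a sample mean when σ is known, and √(p̂(1−p̂)/n) for a sample proportion.

```python
import math

def se_mean(sigma, n):
    """Standard error of the sample mean when the population s.d. sigma is known."""
    return sigma / math.sqrt(n)

def se_proportion(p_hat, n):
    """Estimated standard error of a sample proportion p_hat from n trials."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Illustrative inputs only.
print(f"SE of mean:       {se_mean(10.0, 25):.3f}")
print(f"SE of proportion: {se_proportion(0.5, 100):.3f}")
```

Both standard errors shrink like 1/√n, so quadrupling the sample size halves them.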
5
95% of all sample means will be within roughly 2 standard deviations (2σ/√n) of the population parameter µ. Equivalently, in 95% of all samples the population parameter µ is within roughly 2 standard deviations (2σ/√n) of the sample average x̄.
[Figure: red dots mark the mean of each individual sample.]
This reasoning is the essence of statistical estimation.
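One way to see this reasoning at work is to simulate it. The Python sketch below (the class uses R; the population values are made up) builds the interval x̄ ± 1.96σ/√n for many samples and counts how often it contains µ.

```python
import random
import statistics

def coverage(mu, sigma, n, reps, z=1.96, seed=1):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)       # fixed seed so the run is repeatable
    moe = z * sigma / n ** 0.5      # same margin of error for every sample
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - moe <= mu <= xbar + moe:
            hits += 1
    return hits / reps

# Hypothetical population: mu = 50, sigma = 10, samples of size 25.
rate = coverage(mu=50.0, sigma=10.0, n=25, reps=5000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The share should land near 0.95. The interval moves from sample to sample while µ stays fixed, which is why the 95% describes the procedure rather than any single interval.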
6
So using this fact, we construct a 95% confidence interval for µ as
x̄ ± 1.96(σ/√n)
where 1.96(σ/√n) is the M.O.E.
NOTE: MOE = (number from a table) × (std. error)
The figure on the previous page is really important! Be sure you can give a full explanation of what it shows about confidence intervals.
In general, we can construct a confidence interval for µ with any level of confidence C as
x̄ ± z*(σ/√n)
where z* is the standard normal quantile that leaves area C between −z* and z*.
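The "number from a table" can also be computed. Since area C lies between −z* and z*, the area to the left of z* is (1 + C)/2, and z* is the normal quantile at that probability. A minimal Python sketch follows; `NormalDist().inv_cdf` from the standard library is the normal quantile function, the counterpart of R's `qnorm`, and `z_star` is a name made up for this example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Standard normal quantile with area `confidence` between -z* and z*."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}: z* = {z_star(c):.3f}")
# C = 0.90: z* = 1.645
# C = 0.95: z* = 1.960
# C = 0.99: z* = 2.576
```

Higher confidence gives a larger z* and therefore a wider interval for the same data.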
7
x̄ ± 1.96(σ/√n) is the 95% CI for µ. If the MOE is too large for our purposes, we may want to increase n. In fact, we can set the MOE equal to any number we'd like and solve for n: the algebra gives n = (1.96σ/MOE)², rounded up to the next whole number. Substitute z* for 1.96 to get n for other levels of confidence.
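The sample-size algebra can be sketched in a few lines of Python (the class uses R). The function name and the target MOE of 2 are assumptions for illustration; σ = 9 is the lean body mass s.d. assumed in the exercise. The result is rounded up because n must be a whole number and rounding down would leave the MOE slightly too large.

```python
import math
from statistics import NormalDist

def required_n(sigma, moe, confidence=0.95):
    """Smallest n for which z* * sigma / sqrt(n) is at most `moe`."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # z* for the chosen confidence
    return math.ceil((z * sigma / moe) ** 2)

# sigma = 9 with a hypothetical target margin of error of 2.
print(required_n(sigma=9.0, moe=2.0))   # 78
```

Halving the target MOE roughly quadruples the required n, because n grows with the square of σ/MOE.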
8
Some cautions:
- The data should be a simple random sample from the population.
- Outliers can affect x̄, so always look at your data graphically before proceeding.
- If the sample size is small, be careful. The interval is probably OK as long as the data come from a roughly normal distribution.
- The interval we've constructed assumes that the population s.d. σ is known, which is quite unrealistic. We'll learn later how to deal with this problem.
9
Can you write some R code that will compute confidence intervals?
Hint: to get z*, use the qnorm function. Try it with the body mass data (assume the s.d. of the population of lean body masses is 9 and the s.d. of the population of metabolic rates is 250). Be prepared to show your results next time.
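For comparison with your R solution, here is one possible shape for the calculation in Python. This is only a sketch: `z_interval` is a made-up name, `NormalDist().inv_cdf` stands in for R's `qnorm`, and the four data values are invented placeholders, not the course's body mass data.

```python
import math
from statistics import NormalDist, fmean

def z_interval(data, sigma, confidence=0.95):
    """Confidence interval for the mean, assuming the population s.d. sigma is known."""
    n = len(data)
    xbar = fmean(data)
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # z*, like qnorm((1 + C) / 2) in R
    moe = z * sigma / math.sqrt(n)                  # margin of error
    return xbar - moe, xbar + moe

# Placeholder values only, NOT the real lean body mass data.
masses = [40.0, 45.0, 50.0, 55.0]
low, high = z_interval(masses, sigma=9.0)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

With only four made-up observations the interval is wide. The same function applies to the metabolic rates by passing `sigma=250`.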