Statistics Lecture 8
- Completed so far (any material discussed in these sections is fair game): [...] (READ 5.7) [...]; 6.6
- Today: finish 7.3
- READ 7.4!
- Assignment #3: 6.2, 6.6, 6.34, 6.78 (interpret the plot in terms of Normality), 7.20, 7.28, 8.14, 8.22, 8.36
- Due: Tuesday, Oct 16
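Central Limit Theorem
- In a random sample (iid sample) of size $n$ from any population with mean $\mu$ and standard deviation $\sigma$, when $n$ is large, the distribution of the sample mean $\bar{X}$ is approximately normal.
- That is, $\bar{X}$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
- Thus, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately $N(0, 1)$.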
Implications
- For random samples, if we have enough data, the sample mean is approximately normally distributed, even if the data are not normally distributed.
- With enough data, we can use the normal distribution to make probability statements about $\bar{X}$.
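A small simulation can make this implication concrete. The sketch below is only an illustration: the Exponential(1) population (mean 1, standard deviation 1), the sample size of 50, the 2000 repetitions, and the function name `simulate_sample_means` are all assumptions chosen for the demo, not part of the lecture.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an Exponential(1) population
    (a skewed, non-normal population with mean 1 and standard deviation 1)
    and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=50, reps=2000)

# The CLT suggests these should be close to mu = 1 and sigma/sqrt(n) = 1/sqrt(50).
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
```

A histogram of `means` should look roughly bell-shaped even though the individual observations are strongly skewed.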
Example
- A busy intersection has an average of 2.2 accidents per week, with a standard deviation of 1.4 accidents.
- Suppose you monitor this intersection over a given year, recording the number of accidents per week.
- The data take integer values (0, 1, 2, ...), so the distribution of the number of accidents is not normal.
- What is the distribution of the mean number of accidents per week, based on a sample of 52 weeks of data?
Example
- What is the approximate probability that $\bar{X}$ is less than 2?
- What is the approximate probability that there are fewer than 100 accidents in a given year?
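One way to work these two questions numerically is sketched below, using the CLT approximation with the values from the example (mean 2.2, standard deviation 1.4, 52 weeks). It assumes the weekly counts can be treated as a random sample, and it applies no continuity correction; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52          # values from the example
se = sigma / n ** 0.5                # standard deviation of the sample mean
xbar_dist = NormalDist(mu, se)       # CLT approximation for the weekly mean

# P(sample mean < 2)
p_mean_below_2 = xbar_dist.cdf(2.0)

# Fewer than 100 accidents in 52 weeks is the same event as a weekly mean
# below 100/52 (no continuity correction is applied in this sketch).
p_total_below_100 = xbar_dist.cdf(100 / n)

print(round(p_mean_below_2, 3), round(p_total_below_100, 3))
```

Both answers come from the same normal curve for $\bar{X}$; the second question is simply restated in terms of the weekly mean.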
Statistical Inference (Chapter 8)
- We would like to make inferences about a population based on samples.
- The fatality rate for a disease is 50%. In a controlled study, 100 patients with the disease are given a new drug. Would you conclude that the drug is successful if:
  - 100% of the patients survived?
  - 75% of the patients survived?
  - 55% of the patients survived?
  - 52% of the patients survived?
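One way to think about these four scenarios is to ask how likely each outcome would be if the drug had no effect at all. The sketch below assumes that, without an effective drug, each of the 100 patients survives independently with probability 0.5, so the number of survivors is Binomial(100, 0.5). The function name `prob_at_least` is illustrative, and this is only one possible way to frame the question.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    survivors among n patients if each survives with probability p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Chance of each outcome (or better) if the drug does nothing.
for survivors in (100, 75, 55, 52):
    print(survivors, prob_at_least(survivors))
```

Outcomes that would be very unlikely under "no effect" are more convincing evidence for the drug than outcomes that chance alone produces fairly often.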
- Statistical inference deals with drawing conclusions about population parameters from the analysis of sample data
- Estimation of parameters
  - Estimate a single value for a parameter (point estimation)
  - Estimate a plausible range of values for a parameter (interval estimation)
- Testing of hypothesis
  - Procedure for testing whether data supports a hypothesis or theory
Point Estimation
- Objective: to estimate a population parameter based on sample data
- Point estimator is a statistic that estimates a population parameter
- Standard deviation of the statistic is called the standard error (most of the time)
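Example
- Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ estimates the population mean $\mu$, with standard error $\sigma/\sqrt{n}$.
- How do you estimate the standard error?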
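A common answer is to replace the unknown population standard deviation with the sample standard deviation $s$, giving the estimate $s/\sqrt{n}$. The sketch below illustrates that calculation; the function name and the data values are made up for illustration.

```python
import statistics

def estimated_standard_error(data):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(data)
    s = statistics.stdev(data)   # sample standard deviation (n - 1 divisor)
    return s / n ** 0.5

weekly_counts = [2, 0, 3, 1, 4, 2, 2, 5, 1, 3]   # made-up illustrative data
print("sample mean:", statistics.fmean(weekly_counts))
print("estimated standard error:", estimated_standard_error(weekly_counts))
```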
- If we have a random sample of size $n$ from a normal population, what is the distribution of the sample mean?
- If the sampling procedure is done repeatedly, what proportion of sample means lie in a given interval around $\mu$?
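- When estimating $\mu$ with $\bar{X}$, the $100(1-\alpha)\%$ margin of error, $d$, is the value such that $100(1-\alpha)\%$ of the sample means will fall in the interval $(\mu - d,\ \mu + d)$.
- For large samples, $d = z_{\alpha/2}\,\sigma/\sqrt{n}$.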
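The large-sample margin of error $d = z_{\alpha/2}\,\sigma/\sqrt{n}$ can be computed directly, as in the sketch below. It assumes $\sigma$ is known (or well estimated); the function name `margin_of_error` and the 95% default are illustrative choices, and the usage line reuses the intersection example's $\sigma = 1.4$ and $n = 52$.

```python
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """Large-sample margin of error d = z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    return z * sigma / n ** 0.5

# Intersection example: sigma = 1.4 accidents, n = 52 weeks.
print(margin_of_error(sigma=1.4, n=52, confidence=0.95))
```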
Sample Size Calculation
- Before collecting data, we should have some desired margin of error, $d$, and an associated probability, $1-\alpha$.
- Based on these, we can determine the appropriate sample size:
- $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$ (rounded up)
- What does this sample size guarantee?
Example (8.12)
- The standard deviation of the heights of 5-year-old boys is 3.5 inches.
- How many boys must be sampled if we want to be 90% certain that the sample mean is within 0.5 inches of the population mean height?
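The calculation for this example can be sketched as follows, solving $d = z_{\alpha/2}\,\sigma/\sqrt{n}$ for $n$ and rounding up. The function name `required_sample_size` is illustrative, and the sketch relies on the normal approximation for the sample mean.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, d, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)

# Example 8.12: sigma = 3.5 inches, d = 0.5 inches, 90% certainty.
print(required_sample_size(sigma=3.5, d=0.5, confidence=0.90))  # prints 133
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.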
Confidence Intervals for the Mean
- Last day, we introduced a point estimator: a statistic that estimates a population parameter.
- It is often more desirable to present a plausible range for the parameter, based on the data.
- We will call this a confidence interval.
- Ideally, the interval contains the true parameter value.
- In practice, this is not possible to guarantee because of sample-to-sample variation.
- Instead, we compute the interval so that, before sampling, the interval will contain the true value with high probability.
- This high probability is called the confidence level of the interval.
Confidence Interval for $\mu$ for a Normal Population
- Situation:
  - Have a random sample of size $n$ from a normal population
  - Suppose the value of the standard deviation, $\sigma$, is known
  - The value of the population mean, $\mu$, is unknown
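- Last day we saw that $100(1-\alpha)\%$ of sample means will fall in the interval: $\left(\mu - z_{\alpha/2}\,\sigma/\sqrt{n},\ \mu + z_{\alpha/2}\,\sigma/\sqrt{n}\right)$
- Therefore, before sampling, the probability of getting a sample mean in this interval is $1-\alpha$.
- Equivalently, $P\left(\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1-\alpha$.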
- The interval below is called a $100(1-\alpha)\%$ confidence interval for $\mu$: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
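The meaning of the confidence level can be checked by simulation: repeat the sampling many times and count how often the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ captures the true mean. The sketch below is only an illustration; the population values ($\mu = 10$, $\sigma = 1$), the sample size, the number of repetitions, and the function name `coverage` are assumptions chosen for the demo.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, reps, seed=0):
    """Fraction of simulated known-sigma intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # One sample of size n from a normal population, reduced to its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

# Should come out near the nominal 0.90, up to simulation noise.
print(coverage(mu=10.0, sigma=1.0, n=5, confidence=0.90, reps=2000))
```

Any single interval either contains $\mu$ or does not; the confidence level describes the long-run fraction of intervals that do.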
Example
- To assess the accuracy of a laboratory scale, a standard weight known to be 10 grams is weighed 5 times.
- The readings are normally distributed with unknown mean $\mu$ and a known standard deviation of $\sigma$ grams.
- The mean result is $\bar{x}$ grams.
- Find a 90% confidence interval for the mean.
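The interval for this kind of problem can be computed as sketched below. The numeric values for the standard deviation and the mean reading are not available here, so the inputs in the usage line are placeholders chosen purely for illustration, and the function name `z_confidence_interval` is likewise an assumption.

```python
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Known-sigma confidence interval: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width

# Placeholder inputs (NOT the values from the example): xbar = 10.0 g, sigma = 0.01 g.
low, high = z_confidence_interval(xbar=10.0, sigma=0.01, n=5, confidence=0.90)
print(low, high)
```

If the resulting interval does not contain 10 grams, that would suggest the scale is reading systematically high or low.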