Chapter 11: Inference for Distributions
The Practice of Statistics (Yates)
11.1 Inference for the Mean of a Population
Confidence intervals and tests of significance for the mean $\mu$ of a Normal population are based on the sample mean $\bar{x}$. The population standard deviation $\sigma$ is unknown, so it is estimated from the sample data by the sample standard deviation $s$. Standard error of the sample mean: $SE_{\bar{x}} = s/\sqrt{n}$.
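The standard error is simple to compute directly. The following is a minimal sketch using only the Python standard library; the function name `standard_error` and the data values are hypothetical, chosen for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard error of the sample mean, s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)

# Hypothetical data, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
print(f"n = {len(sample)}, standard error = {se:.4f}")
```

Note that `statistics.stdev` divides by n - 1, which matches the sample standard deviation used in the t procedures.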
Assumptions for Inference about a Mean
The data are a simple random sample (SRS) of size $n$ from the population. Observations from the population have a Normal distribution with mean $\mu$ and standard deviation $\sigma$, both of which are unknown.
One-Sample t Statistic and t Distribution
The one-sample t statistic $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ has the t distribution with $n - 1$ degrees of freedom.
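As a rough illustration, the statistic can be computed from raw data as follows. This is a sketch, not a full test procedure; the name `one_sample_t`, the data, and the hypothesized mean of 4 are made up for the example.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (t, degrees of freedom) for the one-sample t statistic."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample standard deviation
    t = (xbar - mu) / (s / math.sqrt(n))   # standardize using the standard error
    return t, n - 1

# Hypothetical data and hypothesized mean, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t, df = one_sample_t(sample, mu=4.0)
print(f"t = {t:.4f} with {df} degrees of freedom")
```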
Facts about the t Distributions
The density curves are similar in shape to the standard Normal curve: symmetric about zero and bell-shaped. Their spread is somewhat greater than that of the standard Normal distribution, with more probability in the tails and less in the center. The extra spread is due to substituting the estimate $s$ for the population standard deviation $\sigma$. As the degrees of freedom $k$ increase, the $t(k)$ density curve approaches the $N(0, 1)$ curve.
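These facts can be checked numerically. The sketch below evaluates the standard formula for the $t(k)$ density and compares it with the $N(0, 1)$ density at the center and in the tail; the helper name `t_pdf` and the chosen values of $k$ are illustrative assumptions.

```python
import math
from statistics import NormalDist

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    # Work on the log scale so large k does not overflow the gamma function.
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))

z = NormalDist()  # standard Normal, N(0, 1)

for k in (2, 10, 30, 1000):
    print(f"k = {k:4d}: center t = {t_pdf(0, k):.4f} vs z = {z.pdf(0):.4f}; "
          f"tail at 3: t = {t_pdf(3, k):.5f} vs z = {z.pdf(3):.5f}")
```

For small $k$ the t density is lower at zero and higher at 3 than the Normal density, and the two sets of values draw together as $k$ grows.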
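One-Sample t Procedures
To test the hypothesis $H_0: \mu = \mu_0$, compute the one-sample t statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$. The P-value comes from the $t(n - 1)$ distribution and is computed against one of the following alternatives: for $H_a: \mu > \mu_0$ it is $P(T \ge t)$; for $H_a: \mu < \mu_0$ it is $P(T \le t)$; for $H_a: \mu \ne \mu_0$ it is $2P(T \ge |t|)$.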
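The test can be sketched end to end in plain Python. Because the standard library has no t distribution, this example gets tail probabilities by integrating the t density numerically with Simpson's rule; that choice, the function names, and the data are all assumptions made for illustration. In practice a table, calculator, or statistics package would supply the P-value.

```python
import math
import statistics

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))

def t_upper_tail(t, k, steps=2000):
    """P(T >= t) for T ~ t(k), using Simpson's rule (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, k) + t_pdf(a, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    central = total * h / 3      # P(0 <= T <= |t|)
    upper = 0.5 - central        # P(T >= |t|), by symmetry about zero
    return upper if t >= 0 else 1 - upper

def one_sample_t_test(sample, mu0, alternative="two-sided"):
    """Return (t, df, P-value) for H0: mu = mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / se
    df = n - 1
    if alternative == "greater":
        p = t_upper_tail(t, df)
    elif alternative == "less":
        p = 1 - t_upper_tail(t, df)
    elif alternative == "two-sided":
        p = 2 * t_upper_tail(abs(t), df)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return t, df, p

# Hypothetical data and null value, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t, df, p = one_sample_t_test(sample, mu0=4.0)
print(f"t = {t:.3f}, df = {df}, two-sided P-value = {p:.3f}")
```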
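Matched Pairs t Procedures
In a matched pairs design, subjects are matched in pairs and each treatment is given to one subject in each pair. The design is also used for before-and-after observations on the same subjects. To analyze the data, apply the one-sample t procedures to the differences observed within each pair.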
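A matched pairs analysis reduces to a one-sample t statistic on the within-pair differences. The sketch below shows that reduction for before-and-after data; the function name `matched_pairs_t` and the measurements are hypothetical.

```python
import math
import statistics

def matched_pairs_t(before, after, mu0=0.0):
    """Return (t, df) for the mean within-pair difference (after - before)."""
    if len(before) != len(after):
        raise ValueError("each subject needs both a before and an after value")
    diffs = [a - b for b, a in zip(before, after)]  # one difference per pair
    n = len(diffs)
    dbar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    t = (dbar - mu0) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical before-and-after measurements on five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 16]
t, df = matched_pairs_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The degrees of freedom are the number of pairs minus one, not the total number of observations minus one.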
Robustness of t Procedures
A confidence interval or significance test is called robust if the confidence level or P-value does not change very much when the assumptions of the procedure are violated. The t procedures are strongly influenced by outliers, and skewness also affects them, so make a plot to check for outliers and skewness. When the population is not Normal, larger sample sizes improve the accuracy of the t procedures (central limit theorem).
Using t Procedures
The SRS assumption is more important than the assumption of a Normal distribution. Sample size less than 15: use t procedures only if the data are close to Normal, without outliers. Sample size at least 15: use t procedures unless there are outliers or strong skewness. Large samples, roughly $n \ge 40$: t procedures can be used even for clearly skewed distributions.
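The guidelines above can be written as a small decision helper. This is only a sketch: the function name and flags are hypothetical, the judgments about outliers and skewness still have to come from a plot of the data, and the cutoff of 40 for a "large" sample is the commonly quoted rule of thumb, treated here as an assumption and exposed as a parameter.

```python
def t_procedures_reasonable(n, close_to_normal, has_outliers,
                            strongly_skewed, large_n=40):
    """Return True if the rule of thumb supports using t procedures.

    Assumes the data already come from an SRS; no sample size fixes a
    badly collected sample.
    """
    if n < 15:
        # Small samples: need data close to Normal and no outliers.
        return close_to_normal and not has_outliers
    if n < large_n:
        # Moderate samples: avoid only outliers or strong skewness.
        return not has_outliers and not strongly_skewed
    # Large samples: usable even when the distribution is clearly skewed.
    return True

# Hypothetical situations, for illustration only.
print(t_procedures_reasonable(10, close_to_normal=True,
                              has_outliers=False, strongly_skewed=False))
print(t_procedures_reasonable(20, close_to_normal=False,
                              has_outliers=False, strongly_skewed=True))
```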
The Power of the t Test
Power measures the ability of a test to detect deviations from the null hypothesis. The power of the one-sample t test is the probability that the test will reject the null hypothesis when the mean has a particular alternative value. To calculate it, assume a fixed level of significance, usually $\alpha = 0.05$.
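A common way to approximate the power of a one-sided t test is to act as if the standard deviation were known and use the Normal distribution for the sample mean. The sketch below follows that approach; it is an approximation rather than an exact power calculation, and the function name, planning values, and sample size are hypothetical. The critical value 1.729 is the tabled upper 0.05 point of $t(19)$.

```python
import math
from statistics import NormalDist

def approx_power_greater(mu0, mu_alt, sigma, n, t_star):
    """Approximate power of the level-alpha test of H0: mu = mu0 vs Ha: mu > mu0.

    t_star is the t(n - 1) critical value for the chosen alpha (from a table).
    sigma is a planning guess for the population standard deviation.
    """
    se = sigma / math.sqrt(n)
    cutoff = mu0 + t_star * se       # reject H0 when xbar >= cutoff
    z = (cutoff - mu_alt) / se       # standardize under the alternative mean
    return 1 - NormalDist().cdf(z)   # P(xbar >= cutoff when mu = mu_alt)

# Hypothetical planning values: n = 20, alpha = 0.05, so t* = 1.729 for t(19).
power = approx_power_greater(mu0=0.0, mu_alt=2.0, sigma=3.0, n=20, t_star=1.729)
print(f"approximate power = {power:.3f}")
```

Increasing the sample size or the distance between the null and alternative means raises the computed power.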
11.2 Comparing Two Means
Two-Sample Problems
The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations. We have a separate sample from each treatment or each population.
Assumptions for Comparing Two Means
We have two SRSs from two distinct populations. The samples are independent: one sample has no influence on the other. We measure the same variable for both samples. Both populations are Normally distributed, and the means and standard deviations of the populations are unknown.
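Two-Sample z Statistic
If the population standard deviations $\sigma_1$ and $\sigma_2$ were known, the two-sample z statistic $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$ would have the standard Normal distribution $N(0, 1)$.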
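For a concrete sense of the two-sample z statistic, here is a minimal sketch that standardizes the difference in sample means using known population standard deviations. The function name, the parameter `delta0` (the hypothesized value of $\mu_1 - \mu_2$), and the summary numbers are assumptions for illustration.

```python
import math
from statistics import NormalDist

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Two-sample z statistic with known population standard deviations."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((xbar1 - xbar2) - delta0) / se

# Hypothetical summary statistics, for illustration only.
z = two_sample_z(xbar1=10.0, xbar2=8.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.4f}, two-sided P-value = {p_two_sided:.4f}")
```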
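Two-Sample t Procedures
The population standard deviations are unknown, so they are replaced by the sample standard deviations $s_1$ and $s_2$. The result is the two-sample t statistic $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$.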
Two-Sample Problems
The two-sample t statistic is used with t critical values in inference. Option 1: use procedures based on the statistic t with critical values from a t distribution with degrees of freedom computed from the data. Option 2: use procedures based on the statistic t with critical values from a t distribution with degrees of freedom equal to the smaller of $n_1 - 1$ and $n_2 - 1$.
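The two options can be compared side by side. The sketch below computes the two-sample t statistic for testing equal means, together with both choices of degrees of freedom. It assumes that "degrees of freedom computed from the data" refers to the usual Welch-Satterthwaite approximation, which the text above does not spell out; the function name and data are hypothetical.

```python
import math
import statistics

def two_sample_t(sample1, sample2):
    """Return (t, df from the data, conservative df) for H0: mu1 = mu2."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1   # s1^2 / n1
    v2 = statistics.variance(sample2) / n2   # s2^2 / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Option 1 (assumed here): Welch-Satterthwaite approximation.
    df_data = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Option 2: the smaller of n1 - 1 and n2 - 1.
    df_conservative = min(n1 - 1, n2 - 1)
    return t, df_data, df_conservative

# Hypothetical samples, for illustration only.
sample1 = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sample2 = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df_data, df_conservative = two_sample_t(sample1, sample2)
print(f"t = {t:.3f}, option 1 df = {df_data:.2f}, option 2 df = {df_conservative}")
```

Option 2 never gives more degrees of freedom than Option 1, so its critical values are at least as large and its conclusions are more conservative.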