1 Ch6. Sampling distribution Dr. Deshi Ye
2/38 Outline: Population and sample; The sampling distribution of the mean ($\sigma$ known); The sampling distribution of the mean ($\sigma$ unknown); The sampling distribution of the variance
3/38 Statistics: Descriptive statistics; Inferential statistics. Remarks: many thanks to Paul Resnick for some slides.
4/38 Inferential Statistics. 1. Involves: estimation, hypothesis testing. 2. Purpose: make inferences about population characteristics. Population?
5/38 Inference Process: Population, Sample, Sample statistic ($\bar{X}$), Estimates & tests
6/38 Key terms. Population: all items of interest. Sample: portion of the population. Parameter: summary measure about the population. Statistic: summary measure about the sample.
7/38 Population and Sample. Population: we refer to a population in terms of its probability distribution or frequency distribution. "Population f(x)" means a population described by a frequency distribution or a probability distribution f(x). A population might be infinite, so that it is impossible to observe all its values; even if it is finite, it may be impractical or uneconomical to observe all of it.
8/38 Sample. Sample: a part of the population. Random samples (why do we need them?): results from a sample are useful only if the sample is in some way “representative”. Negative examples: the performance of a tire tested only on smooth roads; family incomes estimated from data on home owners only.
9/38 Sampling. Representative sample: same characteristics as the population. Random sample: every subset of the population of a given size has an equal chance of being selected.
10/38 Random sample. A set of observations constitutes a random sample of size n from a finite population of size N if its values are chosen so that each subset of n of the N elements of the population has the same probability of being selected.
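The definition above can be checked empirically. The following is a minimal sketch, assuming a hypothetical population of N = 5 labelled items and n = 2 (neither comes from the slides): it draws many samples without replacement and compares how often each subset appears with the value 1/C(N, n) implied by the definition.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb

def draw_random_sample(population, n, rng):
    """Draw one sample of size n without replacement, returned as a sorted tuple."""
    return tuple(sorted(rng.sample(population, n)))

population = [1, 2, 3, 4, 5]   # hypothetical finite population, N = 5
n = 2                          # hypothetical sample size
rng = random.Random(0)         # fixed seed so the run is repeatable
trials = 20000

counts = Counter(draw_random_sample(population, n, rng) for _ in range(trials))

# Every subset of size n should be selected with probability 1 / C(N, n).
subsets = list(combinations(population, n))
expected = 1 / comb(len(population), n)
frequencies = {subset: counts[subset] / trials for subset in subsets}

for subset, freq in frequencies.items():
    print(subset, round(freq, 3), "expected", round(expected, 3))
```

The observed frequencies should all be close to the common expected value; a sampling scheme that favoured some subsets would show up as a visible imbalance here.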
11/38 Discussion: ways of ensuring that the selection of a sample is at least approximately random, for both finite and infinite populations.
12/38 The sampling distribution of the mean ($\sigma$ known). A random sample of n observations is taken and its mean is computed. Another random sample of n observations is taken and its mean is also computed. Probably no two of these means are alike.
13/38 Suppose there’s a population... Population size N = 4. The random variable x is the number of errors in work. Values of x: 1, 2, 3, 4, all equally likely. Estimate based on a sample of two.
14/38 Checking list. What is the experiment corresponding to the random variable X? What is the experiment corresponding to the random variable $\bar{X}$? What is “the sampling distribution of the mean”?
15/38 Population Characteristics. Population distribution: P(x) = 1/4 for x = 1, 2, 3, 4. Summary measures: $\mu = 2.5$, $\sigma^2 = 1.25$ ($\sigma \approx 1.12$).
16/38 All possible samples of size n = 2: 16 samples (sampling with replacement).
17/38 All possible samples of size n = 2: 16 samples and their 16 sample means (sampling with replacement).
18/38 Sampling distribution of all sample means: the 16 sample means and their sampling distribution.
19/38 Comparison of the population distribution and the sampling distribution of the mean.
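The enumeration described on the preceding slides can be reproduced directly. This is a small sketch using the population from slide 13 (values 1, 2, 3, 4, equally likely, samples of size 2 with replacement); the variable names are illustrative and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [Fraction(v) for v in (1, 2, 3, 4)]
n = 2

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Variance with divisor len(values), treating all values as equally likely."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# All 4**2 = 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Sampling distribution of the mean: each distinct mean with its probability.
sampling_distribution = {
    m: Fraction(count, len(sample_means))
    for m, count in sorted(Counter(sample_means).items())
}

for m, p in sampling_distribution.items():
    print(float(m), p)

print("population mean, variance:", mean(population), variance(population))
print("mean and variance of the sample means:", mean(sample_means), variance(sample_means))
```

The sample means have the same mean as the population, and their variance is the population variance divided by n = 2.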
20/38 EX. Suppose that 50 random samples of size n = 10 are to be taken from a population having the discrete uniform distribution. Sampling is with replacement, so to speak, so that we are sampling from an infinite population.
21/38 Sample means We get 50 samples whose means are
22/38 Theorem. If a random sample of size n is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable whose distribution has the mean $\mu$. For samples from infinite populations the variance of this distribution is $\sigma^2/n$. For samples from a finite population of size N without replacement the variance is $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$.
23/38 Central limit theorem. If $\bar{X}$ is the mean of a sample of size n taken from a population having the mean $\mu$ and the finite variance $\sigma^2$, then $Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$ is a random variable whose distribution function approaches that of the standard normal distribution as $n \to \infty$.
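The finite-population case of the theorem can be verified by brute force on a small example. The sketch below reuses the four-value population from slide 13 and draws samples of size 2 without replacement; the helper name `variance_of_mean` is illustrative, and the formula it implements is the standard result $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$.

```python
from fractions import Fraction
from itertools import combinations

def variance_of_mean(sigma2, n, N=None):
    """Variance of the sample mean; N=None means an infinite population."""
    base = Fraction(sigma2) / n
    if N is None:
        return base
    return base * Fraction(N - n, N - 1)   # finite population correction

population = [Fraction(v) for v in (1, 2, 3, 4)]
N, n = len(population), 2

mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All equally likely samples of size n drawn without replacement.
means = [sum(s) / n for s in combinations(population, n)]
mean_of_means = sum(means) / len(means)
enumerated_variance = sum((m - mean_of_means) ** 2 for m in means) / len(means)

print("enumerated:", enumerated_variance)
print("formula:   ", variance_of_mean(sigma2, n, N))
print("infinite population value:", variance_of_mean(sigma2, n))
```

The enumerated variance agrees with the finite-population formula and is smaller than the infinite-population value $\sigma^2/n$, as the correction factor is below 1 whenever n > 1.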
24/38 Central Limit Theorem. As the sample size gets large enough ($n \ge 30$), the sampling distribution of the mean becomes almost normal.
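A quick simulation illustrates the central limit theorem. This is only a sketch under stated assumptions: the population is the discrete uniform one on 1, 2, 3, 4 from slide 13, the sample size is n = 30, and the seed and number of trials are arbitrary choices. It standardizes each sample mean and compares the share falling within ±1.96 with the standard normal value.

```python
import random
from math import sqrt
from statistics import NormalDist

def standardized_means(population, n, trials, rng):
    """Return Z = (xbar - mu) / (sigma / sqrt(n)) for repeated samples."""
    N = len(population)
    mu = sum(population) / N
    sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)
    z_values = []
    for _ in range(trials):
        sample = rng.choices(population, k=n)   # sampling with replacement
        xbar = sum(sample) / n
        z_values.append((xbar - mu) / (sigma / sqrt(n)))
    return z_values

rng = random.Random(1)   # arbitrary seed, for repeatability only
z_values = standardized_means([1, 2, 3, 4], n=30, trials=20000, rng=rng)

empirical = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
normal = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)

print("share of |Z| <= 1.96 in simulation:", round(empirical, 3))
print("standard normal value:             ", round(normal, 3))
```

Because the population is discrete, the two numbers agree only approximately; the gap narrows as n grows.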
25/38 EX. Suppose a 1-gallon can of paint covers on the average $\mu$ square feet with a standard deviation of 31.5 square feet. Question: what is the probability that the mean area covered by a sample of 40 of these 1-gallon cans will be anywhere from 510 to 520 square feet?
26/38 Solution. We shall have to find the normal curve area between $z_1 = \frac{510-\mu}{31.5/\sqrt{40}}$ and $z_2 = \frac{520-\mu}{31.5/\sqrt{40}}$, where $\mu$ is the average coverage per can given in the example. Check the cumulative standard normal distribution table for $F(z_1)$ and $F(z_2)$. Hence, the probability is $F(z_2) - F(z_1)$.
27/38 Another example. You’re an operations analyst for AT&T. Long-distance telephone calls are normally distributed with $\mu = 8$ min and $\sigma = 2$ min. If you select random samples of 25 calls, what percentage of the sample means would be between 7.8 and 8.2 minutes?
28/38 Solution. [Figures: sampling distribution of the sample mean; standardized normal distribution]
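The telephone-call example can be worked numerically. The sketch below uses only the figures stated in the example (mean 8 minutes, standard deviation 2 minutes, samples of 25 calls) and Python's `statistics.NormalDist` in place of the printed normal table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25

# Standard deviation of the sample mean: sigma / sqrt(n) = 2 / 5 = 0.4.
standard_error = sigma / sqrt(n)

# Standardize the two limits: roughly -0.5 and +0.5.
z_low = (7.8 - mu) / standard_error
z_high = (8.2 - mu) / standard_error

std_normal = NormalDist()
probability = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print("z limits:", round(z_low, 2), round(z_high, 2))
print("P(7.8 <= sample mean <= 8.2):", round(probability, 3))
```

So roughly 38% of the sample means would fall between 7.8 and 8.2 minutes. Since the population itself is normal, the sample mean is exactly normal here and no large-sample approximation is needed.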
29/38 6.2 The sampling distribution of the mean ($\sigma$ unknown). If n is large, it doesn’t matter whether $\sigma$ is known or not, as it is reasonable in that case to substitute for it the sample standard deviation s. Question: what if n is small? We need to make the assumption that the sample comes from a normal population.
30/38 Assumption: the population has a normal distribution. If $\bar{X}$ is the mean of a random sample of size n taken from a normal population having the mean $\mu$ and the variance $\sigma^2$, and $S^2 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}$, then $t = \frac{\bar{X}-\mu}{S/\sqrt{n}}$ is a random variable having the t distribution with the parameter $\nu = n-1$.
31/38 t-distribution
32/38 EX. A manufacturer of fuses claims that with a 20% overload, the fuses will blow in 12.4 minutes on the average. To test this claim, a sample of 20 of the fuses was subjected to a 20% overload, and the times it took them to blow had a mean of $\bar{x}$ minutes and a standard deviation of 2.48 minutes. Assume that the data constitute a random sample from a normal population. Question: do they tend to support or refute the manufacturer’s claim?
33/38 Solution. First, we calculate $t = \frac{\bar{x}-12.4}{2.48/\sqrt{20}}$, where $\bar{x}$ is the observed sample mean. Rule to reject the claim: the t value is larger than 2.86 or less than $-2.86$, where $P(t > 2.86) = 0.005$ and $P(t < -2.86) = 0.005$ for $\nu = 19$ degrees of freedom.
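The rejection rule can be sanity-checked by simulation. This is a sketch, not the slide's own computation: it assumes the manufacturer's claim is true (mean 12.4), uses 2.48 as a hypothetical population standard deviation purely to generate data, and counts how often the t statistic for n = 20 falls outside ±2.86. With 19 degrees of freedom that should happen in about 1% of samples.

```python
import random
from math import sqrt

def t_statistic(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    xbar = sum(sample) / n
    s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu0) / (s / sqrt(n))

def rejects(t, cutoff=2.86):
    """Two-sided rule from the slide: reject if t > 2.86 or t < -2.86."""
    return t > cutoff or t < -cutoff

rng = random.Random(2)      # arbitrary seed, for repeatability only
mu0, n = 12.4, 20           # claimed mean and sample size from the example
sigma = 2.48                # hypothetical population sd, used only to simulate
trials = 20000

rejections = 0
for _ in range(trials):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    if rejects(t_statistic(sample, mu0)):
        rejections += 1

rejection_rate = rejections / trials
print("share of samples rejected when the claim is true:", round(rejection_rate, 4))
```

To apply the rule to real data, pass the 20 observed blow times to `t_statistic` with `mu0=12.4` and feed the result to `rejects`.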
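34/38 The sampling distribution of the variance. Theorem 6.4. If $S^2$ is the variance of a random sample of size n taken from a normal population having the variance $\sigma^2$, then $\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{\sigma^2}$ is a random variable having the chi-square distribution with the parameter $\nu = n-1$.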
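Theorem 6.4 can be illustrated by simulation. The sketch below assumes hypothetical values n = 10 and $\sigma = 2$ (not from the slides), computes $(n-1)S^2/\sigma^2$ for many normal samples, and compares the simulated mean and variance with those of a chi-square distribution with $\nu = n-1$ degrees of freedom, which are $\nu$ and $2\nu$.

```python
import random

def chi_square_statistic(sample, sigma):
    """(n - 1) * S^2 / sigma^2 for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s2 / sigma ** 2

rng = random.Random(3)                 # arbitrary seed, for repeatability only
mu, sigma, n = 0.0, 2.0, 10            # hypothetical normal population and sample size
trials = 20000

values = [
    chi_square_statistic([rng.gauss(mu, sigma) for _ in range(n)], sigma)
    for _ in range(trials)
]

sim_mean = sum(values) / trials
sim_var = sum((v - sim_mean) ** 2 for v in values) / (trials - 1)

nu = n - 1
print("simulated mean:", round(sim_mean, 2), "chi-square mean:", nu)
print("simulated variance:", round(sim_var, 2), "chi-square variance:", 2 * nu)
```

Matching the first two moments does not prove the distributional claim, but a mismatch would indicate an error in the statistic or in the degrees of freedom.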
35/38 Chi-square distribution
36/38 F distribution. Theorem. If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$, respectively, taken from two normal populations having the same variance, then $F = \frac{S_1^2}{S_2^2}$ is a random variable having the F distribution with the parameters $\nu_1 = n_1-1$ and $\nu_2 = n_2-1$.
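The variance-ratio statistic can also be simulated. This sketch assumes hypothetical sample sizes $n_1 = 6$ and $n_2 = 21$ and a common standard deviation of 1.5 (none of these come from the slides). It compares the simulated mean of $S_1^2/S_2^2$ with the mean of the F distribution, $\nu_2/(\nu_2-2)$, which exists for $\nu_2 > 2$.

```python
import random

def sample_variance(sample):
    """Sample variance with divisor n - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / (n - 1)

rng = random.Random(4)          # arbitrary seed, for repeatability only
sigma = 1.5                     # hypothetical common standard deviation
n1, n2 = 6, 21                  # hypothetical sample sizes
trials = 20000

ratios = []
for _ in range(trials):
    # The two populations may have different means; only the variances must match.
    s1_sq = sample_variance([rng.gauss(0.0, sigma) for _ in range(n1)])
    s2_sq = sample_variance([rng.gauss(5.0, sigma) for _ in range(n2)])
    ratios.append(s1_sq / s2_sq)

nu2 = n2 - 1
sim_mean = sum(ratios) / trials
theoretical_mean = nu2 / (nu2 - 2)

print("simulated mean of F:", round(sim_mean, 3))
print("nu2 / (nu2 - 2):    ", round(theoretical_mean, 3))
```

The mean of F depends only on the denominator degrees of freedom and is slightly above 1, which the simulation should reflect.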
37/38 F distribution
38/38 Thanks!