Sampling Distributions

Ch04 Sampling Distributions

CHAPTER CONTENTS
4.1 Introduction
4.2 Sampling Distributions Associated with Normal Populations
4.3 Order Statistics
4.4 Large Sample Approximations
4.5 Chapter Summary
4.6 Computer Examples
Projects for Chapter 4

4.1 Introduction
Sampling distributions play a very important role in statistical analysis and decision making. Because a random sample is a set of random variables $X_1, \ldots, X_n$, it follows that a sample statistic that is a function of the sample is also random. We call the probability distribution of a sample statistic its sampling distribution. Sampling distributions provide the link between probability theory and statistical inference.

A population distribution vs. a sampling distribution
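To make the contrast between a population distribution and a sampling distribution concrete, here is a minimal simulation sketch. The exponential population with mean 2, the sample size 30, and the function name `simulate_sample_means` are illustrative assumptions, not taken from the slides. Each simulated sample yields one value of the sample mean, and the collection of those values approximates the sampling distribution of the mean.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` and return the mean of each sample."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Assumed population: exponential with mean 2 (rate 0.5), clearly non-normal.
        sample = [rng.expovariate(0.5) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(n=30, reps=5000)
# The population has mean 2 and standard deviation 2, so the sample means
# should center near 2 with spread near 2 / sqrt(30).
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
```

The population itself is strongly skewed, while the simulated sample means cluster much more tightly and symmetrically around the population mean.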

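Population: Normal/Non-normal. For a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$, $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$, where $\sigma$ is the standard deviation of the population.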

4.1.1 FINITE POPULATION

4.2 Sampling Distributions Associated with Normal Populations
Population: Normal $\Rightarrow \bar{X}$ is normal

Now we introduce some distributions that can be derived from a normal distribution.

4.2.1 CHI-SQUARE DISTRIBUTION
The chi-square distribution is a special case of a gamma distribution with $\alpha = n/2$ and $\beta = 2$. Here $n$ (a positive integer) is the number of degrees of freedom. Mean: $\mu = n$. Variance: $\sigma^2 = 2n$. Ref: Gamma distribution

$S^2$: sample variance

For a random sample of size $n$ from a normal population with variance $\sigma^2$: $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$
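The chi-square result for the sample variance can be checked empirically. The sketch below is a simulation under assumed values (n = 10, sigma = 2, 10000 replications, and the helper name `scaled_sample_variances` are all illustrative). If $(n-1)S^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom, its simulated mean should be near $n-1$ and its variance near $2(n-1)$.

```python
import random
import statistics

def scaled_sample_variances(n, sigma, reps, seed=1):
    """Return `reps` simulated values of (n-1) * S^2 / sigma^2 for normal samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance with divisor n-1
        values.append((n - 1) * s2 / sigma ** 2)
    return values

values = scaled_sample_variances(n=10, sigma=2.0, reps=10000)
print("simulated mean:    ", statistics.fmean(values))     # compare with n-1 = 9
print("simulated variance:", statistics.variance(values))  # compare with 2(n-1) = 18
```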

4.2.2 STUDENT t-DISTRIBUTION
Let the random variables $X_1, \ldots, X_n$ follow a normal distribution with mean $\mu$ and variance $\sigma^2$. If $\sigma$ is known, then we know that $\sqrt{n}(\bar{X} - \mu)/\sigma$ is N(0,1). If $\sigma$ is not known (as is usually the case), then it is routinely replaced by the sample standard deviation $s$. If the sample size is large, one could suppose that $s \approx \sigma$ and apply the Central Limit Theorem to obtain that $\sqrt{n}(\bar{X} - \mu)/s$ is approximately N(0,1). If the random sample is small, then the distribution of $\sqrt{n}(\bar{X} - \mu)/s$ is given by the so-called Student t-distribution (or simply t-distribution) with $n - 1$ degrees of freedom.

This was originally developed by W. S. Gosset in 1908. Because his employers, the Guinness brewery, would not permit him to publish this important work in his own name, he used the pseudonym “Student.” Thus, the distribution is known as the Student t-distribution.

In fact, the standard normal distribution provides a good approximation to the t-distribution for sample sizes of 30 or more.
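One way to see how quickly the t-distribution approaches the standard normal is to compare their densities directly. The following is a rough sketch using only the Python standard library. The helper names `t_pdf` and `max_density_gap`, the grid on [-4, 4], and the chosen degrees of freedom are illustrative assumptions.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the Student t-distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def max_density_gap(df, lo=-4.0, hi=4.0, steps=800):
    """Largest absolute gap between the t density and the N(0,1) density on a grid."""
    std_normal = NormalDist()
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(t_pdf(x, df) - std_normal.pdf(x)) for x in grid)

for df in (2, 5, 29, 100):
    print(f"df = {df:3d}: max density gap = {max_density_gap(df):.4f}")
```

The gap shrinks as the degrees of freedom grow, which is the behavior behind the "30 or more" guideline.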

EXAMPLE 4.2.6 A manufacturer of fuses claims that with 20% overload, the fuses will blow in less than 10 minutes on the average. To test this claim, a random sample of 20 of these fuses was subjected to a 20% overload, and the times it took them to blow had the mean of 10.4 minutes and a sample standard deviation of 1.6 minutes. It can be assumed that the data constitute a random sample from a normal population. Do they tend to support or refute the manufacturer’s claim?
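The numbers in Example 4.2.6 can be worked out as follows. This is a sketch, not the slides' own solution. It computes the t statistic against the claimed mean of 10 minutes and an upper-tail probability with 19 degrees of freedom. The helpers `t_pdf` and `t_upper_tail` are assumptions introduced here, and the tail probability is obtained by Simpson's rule rather than from a t-table.

```python
import math

def t_pdf(x, df):
    """Density of the Student t-distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n, xbar, s, mu0 = 20, 10.4, 1.6, 10.0
t_stat = (xbar - mu0) / (s / math.sqrt(n))   # about 1.118
p_upper = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.3f} with {n - 1} degrees of freedom")
print(f"P(T > t) = {p_upper:.3f}")
```

A sample mean this far above 10 is not unusual when the true mean is 10, so how the result is read as support or refutation depends on the decision criterion adopted.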

EXAMPLE 4.2.7 The human gestation period (the period of time between conception and labor) is approximately 40 weeks (280 days), measured from the first day of the mother’s last menstrual period. For a newborn full-term infant, the length appropriate for gestational age is assumed to be normally distributed with $\mu = 50$ centimeters and $\sigma = 1.25$ centimeters. Compute the probability that a random sample of 20 infants born at full term results in a sample mean greater than 52.5 centimeters.
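A minimal sketch of the calculation in Example 4.2.7, assuming a population mean of 50 cm and standard deviation of 1.25 cm as stated in the example. Because the population is normal, the sample mean is normal with standard error $\sigma/\sqrt{n}$. The tail is computed with `math.erfc` so that a very small probability is not rounded to zero.

```python
import math

mu, sigma, n = 50.0, 1.25, 20
standard_error = sigma / math.sqrt(n)

# Standardize the threshold 52.5 cm on the scale of the sample mean.
z = (52.5 - mu) / standard_error

# P(Z > z) for a standard normal Z, written with erfc for accuracy in the far tail.
p = 0.5 * math.erfc(z / math.sqrt(2))

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(X_bar > 52.5) = {p:.3g}")
```

The threshold lies almost nine standard errors above the mean, so the probability is effectively zero.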

4.2.3 F-DISTRIBUTION
The F-distribution was developed by Fisher to study the behavior of two variances from random samples taken from two independent normal populations. Question of interest: whether the population variances are equal or not, based on the response of the random samples.
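For reference, the standard form of the statistic behind this question is sketched below. The notation is assumed here rather than taken from the slides: $S_1^2$ and $S_2^2$ are the sample variances of independent random samples of sizes $n_1$ and $n_2$ from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$.

```latex
F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F(n_1 - 1,\; n_2 - 1)
```

When the population variances are equal, this reduces to the ratio $S_1^2 / S_2^2$, so values far from 1 cast doubt on equality.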

4.3 Order Statistics

The extreme (i.e. largest) value distribution
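As a small illustration of the largest order statistic: for $n$ independent draws from a population with CDF $F$, the maximum is at most $x$ exactly when every draw is at most $x$, so its CDF is $[F(x)]^n$. The sketch below assumes a Uniform(0, 1) population purely for illustration. The function names are hypothetical.

```python
def cdf_of_maximum(population_cdf, x, n):
    """P(max(X1, ..., Xn) <= x) for n independent draws with the given CDF."""
    return population_cdf(x) ** n

def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution (assumed example population)."""
    return min(max(x, 0.0), 1.0)

# Probability that the largest of 5 uniform draws is at most 0.9, i.e. 0.9**5.
print(cdf_of_maximum(uniform_cdf, 0.9, n=5))
```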

4.4 Large Sample Approximations
If the sample size is large, the normality assumption on the underlying population can be relaxed.

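A useful generalization of Corollary 4.2.2:

Theorem 4.4.1 (Central Limit Theorem). If $X_1, \ldots, X_n$ is a random sample from a population with mean $\mu$ and finite variance $\sigma^2$, then the distribution of $Z_n = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ approaches the standard normal distribution as $n \to \infty$.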

By Corollary 4.2.2, if the random sample came from a normal population, then the sampling distribution of the mean is normal regardless of the size of the sample. By Theorem 4.4.1, regardless of the form of the population distribution, the z-transform of a sample mean $\bar{X}$ will be approximately a standard normal random variable whenever $n$ is large. Even though the sample size required to apply Theorem 4.4.1 depends on the particular distribution of the population, for practical purposes we will consider the sample size to be large enough if $n \ge 30$.

EXAMPLE 4.4.1 The average SAT score for freshmen entering a particular university is 1100 with a standard deviation of 95. What is the probability that the mean SAT score for a random sample of 50 of these freshmen will be anywhere from 1075 to 1110?
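Example 4.4.1 can be sketched with the large-sample approximation. With n = 50, the sample mean is treated as approximately normal with mean 1100 and standard error $95/\sqrt{50}$. The code uses `statistics.NormalDist` from the Python standard library. Table-based answers may differ slightly in the third decimal because of rounded z-values.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1100.0, 95.0, 50

# Approximate sampling distribution of the sample mean (Central Limit Theorem).
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

p = xbar_dist.cdf(1110) - xbar_dist.cdf(1075)
print(f"standard error = {xbar_dist.stdev:.3f}")
print(f"P(1075 <= X_bar <= 1110) is approximately {p:.4f}")
```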

4.4.1 THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
Let $Y$ be a binomial random variable with parameters $n$ and $p$, so that $Y = X_1 + \cdots + X_n$ is a sum of $n$ independent Bernoulli($p$) variables. Because $Y = n\bar{X}$, by the Central Limit Theorem, $Y$ has an approximate normal distribution with mean $\mu = np$ and variance $\sigma^2 = np(1-p)$. Because the calculation of the binomial probabilities is cumbersome for large sample sizes $n$, the normal approximation to the binomial distribution is widely used.

A useful rule of thumb is to consider $n$ large enough for the normal approximation if $np > 5$ and $n(1-p) > 5$. Otherwise, the binomial distribution may be so asymmetric that the normal distribution may not provide a good approximation. Other rules, such as $np \ge 10$ and $n(1-p) > 10$, or $np(1-p) > 10$, are also used in the literature. Because all of these rules are only approximations, for consistency’s sake we will use $np > 5$ and $n(1-p) > 5$ to test for largeness of sample size in the normal approximation to the binomial distribution. If the need arises, we could use the more stringent condition $np(1-p) > 10$.
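The rules of thumb for sample size are easy to encode. The sketch below uses hypothetical helper names (`normal_approx_ok`, `normal_approx_ok_strict`) and simply restates the two conditions adopted in the text. The numeric cases are illustrative.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule used in the text: np > threshold and n(1 - p) > threshold."""
    return n * p > threshold and n * (1 - p) > threshold

def normal_approx_ok_strict(n, p):
    """More stringent alternative mentioned in the text: np(1 - p) > 10."""
    return n * p * (1 - p) > 10

for n, p in [(250, 0.46), (30, 0.5), (20, 0.1)]:
    print(n, p, normal_approx_ok(n, p), normal_approx_ok_strict(n, p))
```

The case n = 30, p = 0.5 passes the basic rule but fails the stricter one, which shows that the two criteria can disagree for moderate sample sizes.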

CORRECTION FOR CONTINUITY FOR THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

EXAMPLE 4.4.2 A study of parallel interchange ramps revealed that many drivers do not use the entire length of parallel lanes for acceleration, but seek, as soon as possible, a gap in the major stream of traffic to merge. At one site on Interstate Highway 75, 46% of drivers used less than one third of the lane length available before merging. Suppose we monitor the merging pattern of a random sample of 250 drivers at this site. (a) What is the probability that fewer than 120 of the drivers will use less than one third of the acceleration lane length before merging? (b) What is the probability that more than 225 of the drivers will use less than one third of the acceleration lane length before merging?
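Example 4.4.2 can be worked with the continuity correction as in the sketch below. It assumes $Y \sim$ Binomial(250, 0.46), approximated by a normal distribution with mean $np$ and variance $np(1-p)$. The exact binomial sums are included only as a cross-check and are not part of the slides. The helper `binom_pmf` is introduced here for that purpose.

```python
import math
from statistics import NormalDist

n, p = 250, 0.46
approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))

def binom_pmf(k, n, p):
    """Exact P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# (a) P(Y < 120) = P(Y <= 119); the continuity correction evaluates at 119.5.
a_approx = approx.cdf(119.5)
a_exact = sum(binom_pmf(k, n, p) for k in range(0, 120))

# (b) P(Y > 225) = P(Y >= 226); the continuity correction evaluates at 225.5.
# erfc keeps the far upper tail from rounding to exactly zero.
z_b = (225.5 - approx.mean) / approx.stdev
b_approx = 0.5 * math.erfc(z_b / math.sqrt(2))
b_exact = sum(binom_pmf(k, n, p) for k in range(226, n + 1))

print(f"(a) normal approx = {a_approx:.4f}, exact = {a_exact:.4f}")
print(f"(b) normal approx = {b_approx:.3g}, exact = {b_exact:.3g}")
```

For part (b) the threshold is roughly 14 standard deviations above the mean, so both the approximate and exact probabilities are negligible.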

4.5 Chapter Summary

4.6 Computer Examples (Optional)

Projects for Chapter 4
