SAMPLING DISTRIBUTION
Introduction In real life calculating parameters of populations is usually impossible because populations are very large. Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.
STATISTIC Let X1, X2,…,Xn be a r.s. of size n from a population and let T(x1,x2,…,xn) be a function which does not depend on any unknown parameters. Then, the r.v. or a random vector Y=T(X1, X2,…,Xn) is called a statistic.
STATISTIC The sample mean is the arithmetic average of the values in a r.s. The sample variance is the statistic defined by The sample standard deviation is the statistic defined by S.
SAMPLING DISTRIBUTION A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.
Sampling Distribution of the Mean An example A die is thrown infinitely many times. Let X represent the number of spots showing on any throw. The probability distribution of X is E(X) = 1(1/6) + 2(1/6) + 3(1/6)+ ………………….= 3.5 V(X) = (1-3.5)2(1/6) + (2-3.5)2(1/6) + …………. …= 2.92 x 1 2 3 4 5 6 p(x) 1/6 1/6 1/6 1/6 1/6 1/6
Throwing a die twice – sample mean Suppose we want to estimate the mean of a population m from the mean of a sample, , of size n = 2. What is the distribution of ?
Throwing a die twice – sample mean
The distribution of when n = 2 1 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6/36 5/36 4/36 3/36 2/36 1/36 E( ) =1.0(1/36)+ 1.5(2/36)+….=3.5 V( ) = (1.0-3.5)2(1/36)+ (1.5-3.5)2(2/36)... = 1.46
Sampling Distribution of the Mean 6
Sampling Distribution of the Mean Notice that is smaller than sx. The larger the sample size the smaller . Therefore, tends to fall closer to m, as the sample size increases. Notice that is smaller than . The larger the sample size the smaller . Therefore, tends to fall closer to m, as the sample size increases.
SAMPLING FROM THE NORMAL DISTRIBUTION Properties of the Sample Mean and Sample Variance Let X1, X2,…,Xn be a r.s. of size n from a N(,2) distribution. Then,
SAMPLING FROM THE NORMAL DISTRIBUTION Let X1, X2,…,Xn be a r.s. of size n from a N(,2) distribution. Then, Most of the time is unknown, so we use:
SAMPLING FROM THE NORMAL DISTRIBUTION In statistical inference, Student’s t distribution is very important.
SAMPLING FROM THE NORMAL DISTRIBUTION Let X1, X2,…,Xn be a r.s. of size n from a N(X,X2) distribution and let Y1,Y2,…,Ym be a r.s. of size m from an independent N(Y,Y2). If we are interested in comparing the variability of the populations, one quantity of interest would be the ratio
SAMPLING FROM THE NORMAL DISTRIBUTION The F distribution allows us to compare these quantities by giving the distribution of If X~Fp,q, then 1/X~Fq,p. If X~tq, then X2~F1,q.
CENTRAL LIMIT THEOREM Random Sample If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of will resemble a normal distribution. Sample Mean Distribution X Random Variable (Population) Distribution Random Sample (X1, X2, X3, …,Xn)
Sampling Distribution of the Sample Mean If X is normal, is normal. If X is non-normal, is approximately normally distributed for sample size greater than or equal to 30.
EXAMPLE 1 The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of 0.3 ounces. Find the probability that a bottle bought by a customer will contain more than 32 ounces. Solution The random variable X is the amount of soda in a bottle. 0.7486 x = 32 m = 32.2
EXAMPLE 1 (contd.) Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle. Solution Define the random variable as the mean amount of soda per bottle. 0.9082 m = 32.2 0.7486 x = 32
Sampling Distribution of a Proportion The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs. To estimate the population proportion ‘p’ we use the sample proportion. The number of successes The estimate of p = p ^ = X n
Sampling Distribution of a Proportion ^ Since X is binomial, probabilities about can be calculated from the binomial distribution. Yet, for inference about we prefer to use normal approximation to the binomial whenever this approximation is appropriate. p ^
Approximate Sampling Distribution of a Sample Proportion From the laws of expected value and variance, it can be shown that E( ) = p and V( )=p(1-p)/n If both np ≥ 5 and n(1-p) ≥ 5, then Z is approximately standard normally distributed.
EXAMPLE A state representative received 52% of the votes in the last election. One year later the representative wanted to study his popularity. If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?
EXAMPLE (contd.) Solution The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 and n(1-p) = 300(1-.52) = 144 (both greater than 5)