Published by Fay Webster. Modified over 8 years ago.
1
Sampling and Sampling Distributions
2
Sampling Distribution Basics Sample statistics (the mean and standard deviation are examples) vary from sample to sample. Sample statistics are computed from observations drawn at random from a population and, as such, are random variables themselves. A sampling distribution is simply the probability distribution of a sample statistic.
3
Sampling Distributions Generally we do not know the mean or variance of a random variable, and often the purpose of sampling is to estimate parameters (mean, variance, etc.) of a population. We use samples because: –The population is too large for a census; –It is too expensive to conduct a census; and/or –The units must be destroyed in order to test the variable(s) of interest, i.e., destructive testing.
4
Definitions A parameter is a numerical descriptive measure of a population. It is calculated from the observations in the population. A sample statistic is a numerical descriptive measure of a sample. It is calculated from the observations in the sample.
5
Sample Statistics Sample mean (used to estimate the population mean - a parameter); Sample median; Sample variance (used to estimate the population variance - another parameter); Sample standard deviation (derived from the sample variance and used to estimate the population standard deviation - another parameter).
6
Example We want to estimate the population mean: –Two possible sample statistics: the sample mean, $\bar{x}$, and the sample median, $m$. –Which one should be used? For example, toss a fair die three times and let x be the number of dots showing on the up face. Suppose 2, 2, and 6 come up. The expected value (of the population) is $E(x) = (1+2+3+4+5+6)/6 = 3.5$. The sample mean is $\bar{x} = (2+2+6)/3 \approx 3.33$, while the sample median is $m = 2$. Which is closer to the true mean (expected value)?
7
Example, cont. –What if we had sample measurements of 3, 4, and 6? The expected value (of the population) is still $E(x) = 3.5$. The sample mean is $\bar{x} = (3+4+6)/3 \approx 4.33$, while the sample median is $m = 4$. Now which is closer to the true mean (expected value)?
8
Sampling Statistics Since sample statistics are random variables, they must be compared on the basis of their probability distributions: the collection of values, and the probability associated with each, that a statistic would take if the sampling experiment were repeated a very large number of times.
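For the three-toss die example, the "very large number of repetitions" can be replaced by listing every possible sample. The following is a minimal sketch, assuming a fair die and independent tosses; the helper name `centre_and_variance` is made up for illustration. It builds the full sampling distribution of the sample mean and of the sample median and compares their centres and spreads.

```python
from itertools import product

# All 6**3 = 216 equally likely outcomes of three tosses of a fair die.
samples = list(product(range(1, 7), repeat=3))

sample_means = [sum(s) / 3 for s in samples]
sample_medians = [sorted(s)[1] for s in samples]  # middle value of three

def centre_and_variance(values):
    """Mean and variance of a list of equally likely values."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

mean_of_means, var_of_means = centre_and_variance(sample_means)
mean_of_medians, var_of_medians = centre_and_variance(sample_medians)

print(f"sample mean:   centre {mean_of_means:.3f}, variance {var_of_means:.3f}")
print(f"sample median: centre {mean_of_medians:.3f}, variance {var_of_medians:.3f}")
```

Both statistics are centred on 3.5 here, but the sample mean has the smaller variance, which is one reason to prefer it in this setting even though the median happens to land closer in some individual samples.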
9
Definitions The sampling distribution for a sample statistic (calculated from a sample of n measurements) is the probability distribution for the statistic; or The sampling distribution is a function that gives the probability of every possible value of a sample statistic for a specified population and sample size.
10
More Definitions A point estimator of a population parameter is a rule or formula that tells us how to use the sample data to create a single number that can be used as an estimate of the population parameter. If a sample statistic has a sampling distribution with a mean equal to the population parameter the statistic is intended to estimate, the statistic is said to be an unbiased estimator of the parameter.
11
And More Definitions If the mean of the sampling distribution is not equal to the parameter, the statistic is said to be a biased estimator of the parameter.
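The biased/unbiased distinction can be checked exactly on a small case. The sketch below is an illustration under stated assumptions, not something taken from the slides: it uses two independent tosses of a fair die and compares two candidate estimators of the population variance, one dividing by n - 1 and one dividing by n. The choice of divisors and the helper name `variance` are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Population mean and variance of one toss of a fair die.
mu = Fraction(sum(faces), 6)
sigma2 = sum((Fraction(f) - mu) ** 2 for f in faces) / 6

def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

n = 2
samples = list(product(faces, repeat=n))  # all 36 equally likely samples

# Mean of each estimator's sampling distribution.
mean_s2_unbiased = sum(variance(s, n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(variance(s, n) for s in samples) / len(samples)

print("population variance:        ", sigma2)
print("mean of estimator (n - 1):  ", mean_s2_unbiased)
print("mean of estimator (n):      ", mean_s2_biased)
```

The n - 1 version has a sampling distribution whose mean equals the population variance, so it is unbiased in the sense defined above; the divide-by-n version is centred below the parameter and is therefore biased.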
12
Sampling Distribution of the Sample Mean Often we are interested in making an inference about the mean of some population, $\mu$. The sample mean $\bar{x}$ is a good choice as the estimator for $\mu$.
13
Point Estimates $\bar{x}$ estimates $\mu$; $s$ estimates $\sigma$.
14
Variability among Samples [Figure: values of 23.5 mpg and 27.5 mpg marked on an axis running from 23 to 29 mpg.]
15
Normal Distribution for the Mean Confidence intervals assume that the sample means are normally distributed. [Figure: Useful Distribution Revisited]
16
The Mean and Standard Deviation of the Sampling Distribution of $\bar{x}$ Regardless of the shape of the population relative frequency distribution: –The mean of the sampling distribution of $\bar{x}$ will equal $\mu$, the mean of the sampled population: $\mu_{\bar{x}} = \mu$. –The standard deviation of the sampling distribution of $\bar{x}$ will equal $\sigma$, the standard deviation of the sampled population, divided by the square root of the sample size n: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ (often referred to as the standard error of the mean).
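A quick simulation can make these two properties concrete. This is a rough sketch under assumptions of my own choosing: the population is exponential with rate 1 (so its mean and standard deviation are both 1 and its shape is clearly non-normal), the sample size is 25, and the seed and repetition count are arbitrary.

```python
import random
from math import sqrt

random.seed(0)  # arbitrary seed so the run is repeatable

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0
n = 25          # sample size (illustrative)
reps = 20_000   # number of repeated samples (illustrative)

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

centre = sum(sample_means) / reps
spread = sqrt(sum((m - centre) ** 2 for m in sample_means) / reps)

print(f"mean of sample means: {centre:.3f}  (mu = {mu})")
print(f"sd of sample means:   {spread:.3f}  (sigma/sqrt(n) = {sigma / sqrt(n):.3f})")
```

The simulated centre and spread should sit close to the population mean and to the population standard deviation divided by the square root of n, even though the population itself is skewed.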
17
Standard Error of the Mean A statistic that measures the variability of your estimate is the standard error of the mean. It differs from the sample standard deviation: the sample standard deviation is a measure of the variability of the data, whereas the standard error of the mean is a measure of the variability of sample means. Standard error of the mean: $s_{\bar{x}} = s/\sqrt{n}$
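As a small illustration of the difference between the two quantities, the sketch below computes both from one sample. The mpg readings are made-up values chosen only for the example; they are not data from the presentation.

```python
from math import sqrt
from statistics import stdev

# Hypothetical fuel-economy readings in mpg (made-up values, illustration only).
mpg = [23.5, 27.5, 25.1, 26.0, 24.4, 25.8, 26.7, 24.9]

s = stdev(mpg)                  # sample standard deviation (divisor n - 1)
std_error = s / sqrt(len(mpg))  # standard error of the mean

print(f"sample standard deviation:  {s:.3f}")
print(f"standard error of the mean: {std_error:.3f}")
```

The standard error is smaller than the sample standard deviation by a factor of the square root of the sample size, reflecting that sample means vary less than individual observations.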
18
Example Let x be a normally distributed random variable with a mean of 89 and a standard deviation of 12: –What is the probability that the mean of a sample of size n=19 will be between 85 and 93? –What is the probability that the mean of a sample of size n=40 will exceed 91?
19
Answer to First Part
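The worked answer did not survive in this transcript, so the following is a sketch of how the first part could be checked rather than the presenter's own solution. It assumes the sample mean of n = 19 draws is normal with mean 89 and standard deviation 12 divided by the square root of 19, and uses Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 89, 12, 19

# Sampling distribution of the mean: normal, mean mu, sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# P(85 < xbar < 93) as a difference of two cumulative probabilities.
p_between = xbar_dist.cdf(93) - xbar_dist.cdf(85)
print(f"P(85 < xbar < 93) = {p_between:.4f}")
```

Without rounding the z-value, this comes out at about 0.85; a table lookup with z rounded to two decimals may differ in the third or fourth decimal place.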
20
Answer to Second Part
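The worked answer is also missing here. A minimal check of the second part, under the same assumption that the sample mean is normal with standard deviation 12 divided by the square root of n, now with n = 40:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 89, 12, 40

# Sampling distribution of the mean for samples of size 40.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# Upper-tail probability: P(xbar > 91) = 1 - P(xbar <= 91).
p_exceeds = 1 - xbar_dist.cdf(91)
print(f"P(xbar > 91) = {p_exceeds:.4f}")
```

The unrounded result is about 0.15.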
21
Example The population of orders for printing jobs at a print shop is approximately normal with a mean of 200 pages and a standard deviation of 40 pages. The shop is almost out of paper and it has five orders that must be finished before a shipment of paper can be expected. If the shop has 1,200 sheets of paper left, what is the probability that the five orders will not exhaust the stock of paper? Hint: Find $P(\bar{x} < 240)$, since five orders use fewer than 1,200 sheets in total exactly when they average fewer than 240 pages each.
22
Answer
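The answer slide is blank in this transcript. One way to check the print-shop example is sketched below, assuming the five orders are independent draws from the stated normal population and that one page uses one sheet. It computes the probability two equivalent ways: through the mean of the five orders and through their total.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, orders, sheets = 200, 40, 5, 1200

# Route 1: the mean of five orders must stay below 1200 / 5 = 240 pages.
p_mean = NormalDist(mu, sigma / sqrt(orders)).cdf(sheets / orders)

# Route 2: the total of five independent orders is normal with
# mean 5 * mu and standard deviation sigma * sqrt(5).
p_total = NormalDist(orders * mu, sigma * sqrt(orders)).cdf(sheets)

print(f"P(stock is not exhausted) = {p_mean:.4f}")
```

Both routes give the same value, roughly 0.987, because dividing the total by 5 rescales the mean and the standard deviation by the same factor.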
23
Example Let x be a random variable with a mean of 1,200 and a standard deviation of 20: –What is the probability that the mean of a sample of size 80 will exceed 1,202? –What is the probability that the mean of a sample of size 50 will be less than 1,202? –If the probability that the mean of a sample of size n will exceed 1,201 is 0.25, what must n equal?
24
Answers Part 1: 0.1867. Part 2: 0.7611. Part 3: n = 180.
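These answers can be reproduced approximately in code. The sketch below assumes the sample mean is treated as normal (x itself is not stated to be normal, so this leans on the sample sizes being large) and does not round z. The stated answers appear consistent with z-values rounded to two decimals from a table, so the unrounded results differ slightly: about 0.186, about 0.760, and n of about 182 rather than 180.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 1200, 20

# Part 1: P(xbar > 1202) for n = 80.
p1 = 1 - NormalDist(mu, sigma / sqrt(80)).cdf(1202)

# Part 2: P(xbar < 1202) for n = 50.
p2 = NormalDist(mu, sigma / sqrt(50)).cdf(1202)

# Part 3: P(xbar > 1201) = 0.25 means (1201 - mu) / (sigma / sqrt(n)) = z,
# where z is the 75th percentile of the standard normal distribution.
z = NormalDist().inv_cdf(0.75)
n_unrounded = (z * sigma / (1201 - mu)) ** 2

# Same calculation with z rounded to 0.67, as a two-decimal table would give.
n_table = (0.67 * sigma / (1201 - mu)) ** 2

print(f"Part 1: {p1:.4f}")
print(f"Part 2: {p2:.4f}")
print(f"Part 3: n = {n_unrounded:.1f} unrounded, {n_table:.1f} with z = 0.67")
```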
25
Central Limit Theorem If a random sample of n observations is selected from a population, then when n is sufficiently large the sampling distribution of $\bar{x}$ will be approximately a normal distribution. Typically, a sample size of $n \ge 30$ is considered large enough. The larger the sample size n, the better the normal approximation.
26
Normality and the Central Limit Theorem To satisfy the assumption of normality, you can do one of the following: verify that the population distribution is approximately normal, or apply the central limit theorem. The central limit theorem states that the distribution of sample means is approximately normal, regardless of the population distribution’s shape, if the sample size is large enough. “Large enough” is usually approximately 30 observations. It is more if the data are heavily skewed, and fewer if the data are symmetric.
27
Central Limit Theorem, Illustrated
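The original illustration is not available in this transcript. As a stand-in, the simulation below sketches the same idea under assumptions of my own: the population is exponential with rate 1 (strongly right-skewed), and skewness is used as a rough numerical measure of how far the distribution of sample means is from the symmetric normal shape. The helper names, seed, and repetition count are arbitrary.

```python
import random
from math import sqrt

random.seed(1)  # arbitrary seed so the run is repeatable

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m = sum(values) / len(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_means(n, reps=20_000):
    """Means of `reps` samples of size n from an exponential(1) population."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30)}
for n, sk in skew_by_n.items():
    print(f"n = {n:2d}: skewness of sample means = {sk:.2f}")
```

The skewness of the sample means shrinks toward zero as n grows, which is the behaviour the theorem describes: the distribution of the mean looks more nearly normal for larger samples, and a heavily skewed population needs a larger n to get there.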
28
Sampling Distribution of the Proportion We are often interested in making an inference about the proportion of some population, p. Examples: –Proportion of freshmen who graduate from Virginia Tech in four years. –Proportion of defective items in a lot. –Proportion of a set of loans that will become nonperforming.
29
The Sample Proportion and Standard Deviation of the Number of Successes The sample proportion $\hat{p}$ is the value of the random variable x (the number of successes) divided by the sample size: $\hat{p} = x/n$. The standard deviation of the sampling distribution is: $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
30
Normal Approximation to the Sampling Distribution of the Proportion Rules: Z-value for the sampling distribution of $\hat{p}$: $z = (\hat{p} - p)/\sqrt{p(1-p)/n}$
31
Example If a sample of size 100 is taken from a population of size 1000 and the population contains 300 successes: –What is the probability that the sample proportion of successes will be 0.35 or more? –What is the probability that the sample proportion of successes will be between 0.25 and 0.45?
32
Answers Part a: Part b:
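The numerical answers are missing from this transcript. The sketch below shows one way parts a and b could be computed with the normal approximation, taking p = 300/1000 = 0.30 and n = 100. It assumes no continuity correction and no finite-population correction, since the transcript mentions neither; the presenter's own answers may have been computed differently.

```python
from math import sqrt
from statistics import NormalDist

p = 300 / 1000   # population proportion of successes
n = 100          # sample size

# Normal approximation to the sampling distribution of the sample proportion.
phat_dist = NormalDist(p, sqrt(p * (1 - p) / n))

part_a = 1 - phat_dist.cdf(0.35)                   # P(phat >= 0.35)
part_b = phat_dist.cdf(0.45) - phat_dist.cdf(0.25)  # P(0.25 <= phat <= 0.45)

print(f"Part a: {part_a:.4f}")
print(f"Part b: {part_b:.4f}")
```

Under these assumptions part a is about 0.14 and part b about 0.86.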
33
Example An advertising campaign for a new perfume has a goal of reaching 50% of the women in the target group. Suppose a national sample of 300 women from the target group is drawn to see how the campaign is working. 129 women in the sample can recall seeing an ad or commercial for the new perfume, a sample proportion of 129/300 = 0.43. If the population proportion were 0.50, what is the probability of observing a sample proportion of 0.43 or less in a sample of 300?
34
Answer
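The answer slide is blank in this transcript. A minimal check using the normal approximation to the sample proportion, with p = 0.50 and n = 300 and no continuity correction (an assumption, since the transcript does not say which method was used):

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.50, 300
observed = 129 / n  # sample proportion, 0.43

std_error = sqrt(p * (1 - p) / n)
z = (observed - p) / std_error

# Lower-tail probability of a sample proportion this small or smaller.
prob = NormalDist().cdf(z)
print(f"z = {z:.2f}, P(phat <= 0.43) = {prob:.4f}")
```

The probability is well under 1%, so a sample proportion of 0.43 would be unusual if the campaign had really reached half of the target group.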
35
From Here To Inference The primary function of getting a sampling distribution is to produce a statistical inference. Probability distributions allow us to make probability statements about values of a random variable. Thus, knowledge of the population and its parameters allows us to use the probability distribution to make probability statements about individual members of the population.
36
From Here To Inference (cont.) With sampling distributions, knowledge of the parameters and some information about the distribution allow us to make probability statements about a sample statistic. In applying both probability distributions and sampling distributions, we must know the value of relevant parameters, a highly unlikely circumstance. In the real world, parameters are almost always unknown because they represent descriptive measurements about extremely large populations. Statistical inference addresses this problem; now we will assume that most population parameters are unknown.