Sampling Distributions
Parameter A number that describes the population. Symbols we will use for parameters include: μ – mean; σ – standard deviation; π – proportion (p is also used); α – y-intercept of LSRL; β – slope of LSRL
Statistic A number that can be computed from sample data without making use of any unknown parameter. Symbols we will use for statistics include: x̄ – mean; s – standard deviation; p̂ – proportion; a – y-intercept of LSRL; b – slope of LSRL
Identify the boldface values as parameter or statistic A carload lot of ball bearings has mean diameter cm. This is within the specifications for acceptance of the lot by the purchaser. By chance, an inspector chooses 100 bearings from the lot that have mean diameter cm. Because this is outside the specified limits, the lot is mistakenly rejected.
Why do we take samples instead of taking a census? A census is not always accurate. Censuses are difficult or impossible to do. Censuses are very expensive to do.
distribution A distribution describes all the values that a variable can take and how often it takes each one.
sampling distribution The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
Sampling Distribution of Means Consider the population – the length of fish (in inches) in my pond – consisting of the values 2, 7, 10, 11, 14. What is the mean and standard deviation of this population? μ = 8.8, σ ≈ 4.0694
Let’s take samples of size 2 (n = 2) from this population. How many combinations of samples of size 2 are possible? ₅C₂ = 10. Find all 10 of these samples and record the sample means. What is the mean and standard deviation of the sample means? μ_x̄ = 8.8, σ_x̄ ≈ 2.4920
Repeat this procedure with sample size n = 3. How many samples of size 3 are possible? ₅C₃ = 10. Find all of these samples and record the sample means. What is the mean and standard deviation of the sample means? μ_x̄ = 8.8, σ_x̄ ≈ 1.6613
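The enumeration asked for on these two slides can be checked with a short script. This is a minimal sketch, not part of the original slides; the function name `sample_means` is an illustrative choice, and `pstdev` is used because every possible sample is listed, so the sample means form a complete population.

```python
from itertools import combinations
from statistics import mean, pstdev

# Fish lengths (inches) from the pond example
population = [2, 7, 10, 11, 14]

def sample_means(values, n):
    """Means of every possible sample of size n drawn without replacement."""
    return [mean(sample) for sample in combinations(values, n)]

results = {}
for n in (2, 3):
    means = sample_means(population, n)
    # (number of samples, mean of the sample means, SD of the sample means)
    results[n] = (len(means), mean(means), pstdev(means))
    print(n, results[n])
```

Both sample sizes give a mean of 8.8 for the sample means, while their standard deviation falls from about 2.49 (n = 2) to about 1.66 (n = 3).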
What do you notice? The mean of the sampling distribution EQUALS the mean of the population: μ_x̄ = μ (x̄ is an unbiased estimator of μ). As the sample size increases, the standard deviation of the sampling distribution decreases: as n ↑, σ_x̄ ↓
Activity
unbiased A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
Thought to Ponder: In real life, is it possible to create a sampling distribution for every study? How many times can we afford to take a sample?
General Properties for Sample Means Rule 1: μ_x̄ = μ. Rule 2: σ_x̄ = σ/√n. This rule is approximately correct as long as no more than 10% of the population is included in the sample.
General Properties for Sample Means Rule 3: When the population distribution is normal, the sampling distribution of x̄ is also normal for any sample size n.
General Properties for Sample Means Rule 4: Central Limit Theorem. When n is sufficiently large, the sampling distribution of x̄ is approximated by a normal curve, even when the population distribution is not itself normal. As a rule of thumb, the CLT can be applied when n ≥ 30.
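Rules 1, 2 and 4 can be explored with a small simulation. The sketch below is an illustration rather than anything from the slides: it assumes a right-skewed exponential population with mean 1 and standard deviation 1, and the function name, seed and number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

def simulated_sample_means(n, reps=5000):
    """Means of `reps` samples of size n from an exponential population (mean 1, SD 1)."""
    return [mean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

for n in (2, 30):
    means = simulated_sample_means(n)
    # Simulated center and spread, next to the value sigma / sqrt(n) from Rule 2
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(1 / sqrt(n), 3))
```

Plotting a histogram of `means` for each n (not done here) is one way to see the skew fade as n grows.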
If n is large or the population distribution is normal, then z = (x̄ − μ)/(σ/√n) has approximately a standard normal distribution.
Sampling Distribution of Proportions Activity: Reese’s Pieces. This is a proportion activity where the sample proportion (p̂) of orange pieces to the total is calculated based on a sample that we can vary. es3/ReesesPieces.html Take multiple samples to see the trends (shape, center and spread). Increase sample size, and repeat.
What do you notice? The mean of the proportion sampling distribution EQUALS the proportion of the population (the simulated mean settles there as the number of samples increases): μ_p̂ = p. As the sample size increases, the standard deviation of the sampling distribution decreases: as n ↑, σ_p̂ ↓
General Properties of Sample Proportions Rule 1: μ_p̂ = p (or π), so p̂ is an unbiased estimator. Rule 2: σ_p̂ = √(p(1 − p)/n). This rule is approximately correct as long as no more than 10% of the population is included in the sample.
General Properties for Sample Proportions Rule 3: As n increases, the sampling distribution of p̂ becomes approximately normal. You need to check the Normal condition before you perform Normal calculations: np ≥ 10 and n(1 − p) ≥ 10
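The three rules for sample proportions fit in one small helper. This is a sketch only; the function name and the example values of p and n are hypothetical and do not come from the slides.

```python
from math import sqrt

def sample_proportion_properties(p, n):
    """Mean and SD of p-hat, plus whether the Normal condition holds."""
    mean_p_hat = p                                  # Rule 1
    sd_p_hat = sqrt(p * (1 - p) / n)                # Rule 2
    normal_ok = n * p >= 10 and n * (1 - p) >= 10   # Rule 3 condition
    return mean_p_hat, sd_p_hat, normal_ok

# Illustrative values only
print(sample_proportion_properties(0.5, 25))  # np = n(1 - p) = 12.5, condition met
print(sample_proportion_properties(0.5, 10))  # np = n(1 - p) = 5, condition not met
```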
Review of Sample Restrictions Central Limit Theorem – if n ≥ 30, the sampling distribution of the mean can be assumed to be approximately Normal. Normal condition for the sampling distribution of proportions – np ≥ 10 and n(1 − p) ≥ 10
Review of Sample Restrictions (cont’d) Sample sizes must be less than 10% of the population. When we sample without replacement, the probabilities change from one draw to the next (the draws are not independent), but the standard deviation formula assumes the same probability each time (independence). Keeping the sample small in comparison to the population lets us treat the draws as approximately independent.
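To see why the 10% restriction matters, the formula σ/√n can be compared with the exact standard deviation for sampling without replacement. The exact version uses the finite population correction √((N − n)/(N − 1)), a standard result that the slides do not mention. The sketch below applies both to the fish pond population and to a hypothetical population of 1000; the function names are illustrative.

```python
from math import sqrt
from statistics import pstdev

def sd_simple(sigma, n):
    """sigma / sqrt(n), which treats the draws as independent."""
    return sigma / sqrt(n)

def sd_without_replacement(sigma, n, N):
    """Exact SD of the sample mean for n draws without replacement from N items."""
    return sigma / sqrt(n) * sqrt((N - n) / (N - 1))

fish = [2, 7, 10, 11, 14]
sigma = pstdev(fish)

# n = 2 is 40% of the pond, so the two formulas disagree noticeably
print(sd_simple(sigma, 2), sd_without_replacement(sigma, 2, len(fish)))

# Hypothetical population of N = 1000: n = 50 is 5%, so they nearly agree
print(sd_simple(sigma, 50), sd_without_replacement(sigma, 50, 1000))
```

For the pond, the simple formula gives about 2.88 while the exact value is about 2.49, which is the standard deviation found by listing all 10 samples of size 2.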
To Solve Sampling Distributions of Proportions
State: State the probability, e.g. P(p̂ ≥ 0.70)
Plan: Determine if all sample requirements are met to use Normal calculations
Do: the math
Conclude: State the answer in the proper context, e.g. Approximately 90% of all SRSs of sample size xxx will have p̂ ≥ 0.70
EX) The army reports that the distribution of head circumference among soldiers is approximately normal with mean 22.8 inches and standard deviation of 1.1 inches. a) What is the probability that a randomly selected soldier’s head will have a circumference that is greater than 23.5 inches? P(X > 23.5) =.2623
b) What is the probability that a random sample of five soldiers will have an average head circumference that is greater than 23.5 inches? What normal curve are you now working with? Do you expect the probability to be more or less than the answer to part (a)? Explain. P(x̄ > 23.5) = 0.0774
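Parts (a) and (b) of the head circumference example can be reproduced with Python's `statistics.NormalDist`. This is a sketch for checking the arithmetic, not the method used on the slides (which appear to rely on a calculator); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 22.8, 1.1, 5

# (a) One soldier: the population curve N(22.8, 1.1)
single = NormalDist(mu, sigma)
p_single = 1 - single.cdf(23.5)

# (b) Mean of five soldiers: same center, SD shrinks to sigma / sqrt(n)
mean_of_five = NormalDist(mu, sigma / sqrt(n))
p_mean = 1 - mean_of_five.cdf(23.5)

print(round(p_single, 4), round(p_mean, 4))
```

The curve in part (b) has a standard deviation of about 0.49 inches, so a sample mean above 23.5 is much less likely than a single measurement above 23.5.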
Ex 2) A polling organization asks an SRS of 1500 first year college students how far away their home is. Suppose that 35% of all first-year students (population) actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?
State problem: P(0.33 ≤ p̂ ≤ 0.37)
Plan: Sample size must be less than 10% of the population – this holds as long as there are more than 15,000 first-year students – Check!! Can we assume Normal Distribution? np = 525 ≥ 10? Check!! n(1 − p) = 975 ≥ 10? Check!!
Do: σ_p̂ = √(p(1 − p)/n) = √(0.35 × 0.65/1500) ≈ 0.0123; Normalcdf(0.33, 0.37, 0.35, 0.0123) ≈ 0.896
Conclude: Approximately 89.6% of all SRSs of size 1500 will give a result within 2 percentage points of the true population value.
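The Plan and Do steps of the polling example can be sketched in a few lines. `NormalDist` stands in for the calculator's Normalcdf here, which is an assumption about tooling rather than part of the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.35, 1500

# Plan: Normal condition for proportions
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# Do: SD of p-hat, then the area between 0.33 and 0.37
sd_p_hat = sqrt(p * (1 - p) / n)
dist = NormalDist(p, sd_p_hat)
prob = dist.cdf(0.37) - dist.cdf(0.33)

print(normal_ok, round(sd_p_hat, 4), round(prob, 3))
```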
Suppose a team of biologists has been studying the Pinedale children’s fishing pond. This group of biologists has determined that the length of trout in the pond has a normal distribution with mean of 10.2 inches and standard deviation of 1.4 inches. Let x represent the length of a single trout taken at random from the pond. What is the probability that a single trout taken at random from the pond is between 8 and 12 inches long? P(8 < x < 12) = 0.8427
What is the probability that the mean length of five trout taken at random is between 8 and 12 inches long? P(8 < x̄ < 12) = 0.9978. Do you expect the probability to be more or less than the probability for a single trout? Explain. What sample mean would be at the 95th percentile? (Assume n = 5) x̄ ≈ 11.23 inches
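The trout calculations, including the 95th percentile, can be checked as follows. This is a sketch using `statistics.NormalDist`; `inv_cdf` plays the role of an inverse-normal calculator function, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10.2, 1.4, 5

single = NormalDist(mu, sigma)                  # length of one trout
mean_of_five = NormalDist(mu, sigma / sqrt(n))  # mean length of five trout

p_single = single.cdf(12) - single.cdf(8)
p_mean = mean_of_five.cdf(12) - mean_of_five.cdf(8)

# Sample mean with 95% of the sampling distribution below it
percentile_95 = mean_of_five.inv_cdf(0.95)

print(round(p_single, 4), round(p_mean, 4), round(percentile_95, 2))
```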
A soft-drink bottler claims that, on average, cans contain 12 oz of soda. Let x denote the actual volume of soda in a randomly selected can. Suppose that x is normally distributed with σ = 0.16 oz. Sixteen cans are selected and have a mean of 12.1 oz. What is the probability that the average of 16 cans will exceed 12.1 oz? P(x̄ > 12.1) = 0.0062. Do you think the bottler’s claim is correct? No. A sample mean this large is not likely to happen by chance alone, yet the sample did have this mean, so I do not think the claim that the average is 12 oz is correct.
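The probability for the soda example follows from the same two rules (center μ, spread σ/√n). A minimal sketch, assuming the bottler's claimed mean of 12 oz is true; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean, sigma, n = 12.0, 0.16, 16

# Sampling distribution of the mean volume of 16 cans if the claim holds
mean_of_16 = NormalDist(claimed_mean, sigma / sqrt(n))
p_exceeds = 1 - mean_of_16.cdf(12.1)

print(round(p_exceeds, 4))
```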
A hot dog manufacturer asserts that one of its brands of hot dogs has an average fat content of 18 grams per hot dog with standard deviation of 1 gram. Consumers of this brand would probably not be disturbed if the mean was less than 18 grams, but would be unhappy if it exceeded 18 grams. An independent testing organization is asked to analyze a random sample of 36 hot dogs. Suppose the resulting sample mean is 18.4 grams. Does this result indicate that the manufacturer’s claim is incorrect? Yes, a sample mean this high is not likely to happen by chance alone. What if the sample mean was 18.2 grams, would you think the claim was incorrect? No, a sample mean of 18.2 grams could plausibly happen by chance alone.
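The slide gives only Yes/No answers for the hot dog example; the sketch below computes the probabilities behind them. It assumes the manufacturer's claim (mean 18 g, standard deviation 1 g) is true and relies on the Central Limit Theorem, since n = 36 ≥ 30 and the population shape is not stated. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean, sigma, n = 18.0, 1.0, 36

# Approximate sampling distribution of the mean fat content (CLT, n >= 30)
mean_of_36 = NormalDist(claimed_mean, sigma / sqrt(n))

p_at_least_18_4 = 1 - mean_of_36.cdf(18.4)  # z = 2.4
p_at_least_18_2 = 1 - mean_of_36.cdf(18.2)  # z = 1.2

print(round(p_at_least_18_4, 4), round(p_at_least_18_2, 4))
```

A sample mean of 18.4 g or more would occur in under 1% of samples if the claim were true, while 18.2 g or more would occur in roughly 11.5% of samples, which is why only the first result casts doubt on the claim.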