Sampling Distributions Day 1 and Day 2 9.1
Many investigations and research projects try to draw conclusions about how the values of some variable x are distributed in a population. Often, attention is focused on a single characteristic of that distribution. Examples include:
1. x = fat content (in grams) of a quarter-pound hamburger, with interest centered on the mean fat content μ of all such hamburgers.
2. x = fuel efficiency (in miles per gallon) for a 2003 Honda Accord, with interest focused on the variability in fuel efficiency as described by σ, the standard deviation for the fuel efficiency population distribution.
3. x = time to first recurrence of skin cancer for a patient treated using a particular therapy, with attention focused on p, the proportion of such individuals whose first recurrence is within 5 years of the treatment.
Parameter: A number that describes the ____________. This number is typically unknown. Statistic: A number that describes the ________. We use this number to ___________ the ______________.
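Notation (fill in the symbol for each):
Mean: Population (Parameter) ________ ; Sample (Statistic) ________
Standard Deviation: Population (Parameter) ________ ; Sample (Statistic) ________
Proportion: Population (Parameter) ________ ; Sample (Statistic) ________
Standard deviation of the proportion: Population (Parameter) ________ ; Sample (Statistic) ________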
Is the boldfaced number a parameter or a statistic? Use the proper notation to describe the number.
The Bureau of Labor Statistics announces that last month it interviewed all members of the labor force in a sample of 50,000 households; 4.5% of the people interviewed are unemployed.
The ball bearings in a large container have mean diameter 1.35 centimeters. This is within the specification for acceptance of the container by the purchaser. By chance, an inspector chooses 100 bearings from the container that have mean diameter 1.37 cm.
Sampling Distribution: The distribution of all values taken by the statistic in _____ ___________ ______of the ________ ________from the _____________ _____________
Sampling Variability: The ______________ between each ____________ of samples of the ___________size. If I compare many different samples and the statistic is very __________ in each one, then the ___________ _____________ is _______. If I compare many different samples and the statistic is very ___________ in each one, then the ______________ ____________is ___________.
Unbiased: When the statistic is __________to the _________ value of the parameter Unbiased Estimator: The unbiased _____________
Look at the following four histograms. (Use the bull’s-eye example for reference.) In these histograms, what represents variability? In these histograms, what represents bias?
[Figure: four histograms, one per case below]
a) ________ bias, ________ variability (High bias, high variability)
b) ________ bias, ________ variability (Low bias, low variability)
c) ________ bias, ________ variability (Low bias, high variability)
d) ________ bias, ________ variability (High bias, low variability)
How sampling works:
1. Take a _______ number of samples from the ________ population.
2. Calculate the sample _________ or sample _______________ for each sample.
3. Make a _______________ of the values of the statistics.
4. Examine the ________________.
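The four steps above can be sketched as a short simulation. This is only an illustrative sketch, not part of the original notes: the population proportion 0.6, the sample size 10, the number of samples, and the helper name `simulate_sample_proportions` are all assumptions chosen for the example.

```python
import random
from collections import Counter


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample proportion.

    Assumes a very large population in which a proportion p has the trait,
    so each sampled individual has the trait with probability p.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Steps 1 and 2: take many samples and compute the statistic for each one.
p_hats = simulate_sample_proportions(p=0.6, n=10, num_samples=1000)

# Step 3: make a (text) histogram of the values of the statistic.
counts = Counter(p_hats)
for value in sorted(counts):
    print(f"{value:.1f} | {'*' * (counts[value] // 10)}")

# Step 4: examine the distribution (here, only its center).
mean_p_hat = sum(p_hats) / len(p_hats)
print(f"mean of the sample proportions: {mean_p_hat:.3f}")
```

Changing `seed` gives a different set of samples, which is one way to see sampling variability directly.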
Facts about Samples: If I chose a different sample, it would still represent the same population. A different sample ________ _________produces different _____________. A statistic can be _____________ and still have high ______________. To avoid this, ___________ the size of the sample. ___________ samples give smaller spread.
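The claim that larger samples give smaller spread can be checked with a small simulation. The sketch below is an assumption-laden illustration: the population proportion 0.6, the sample sizes 10, 40, and 160, and the helper name `spread_of_sample_proportions` are hypothetical choices, not values from the notes.

```python
import random
import statistics


def spread_of_sample_proportions(p, n, num_samples=2000, seed=1):
    """Return the standard deviation of num_samples simulated sample proportions."""
    rng = random.Random(seed)
    p_hats = [
        sum(1 for _ in range(n) if rng.random() < p) / n
        for _ in range(num_samples)
    ]
    return statistics.pstdev(p_hats)


# Hypothetical population proportion and sample sizes, chosen for illustration.
spreads = {n: spread_of_sample_proportions(p=0.6, n=n) for n in (10, 40, 160)}
for n, sd in spreads.items():
    print(f"n = {n:3d}: spread (standard deviation) of the statistic = {sd:.3f}")
```

In this setup, each fourfold increase in sample size should cut the spread roughly in half, while the center of the distribution stays near the same value.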
Example #1: Classify each underlined number as a parameter or statistic. Give the appropriate notation for each. a. Forty-two percent of today’s 15-year-old girls will get pregnant in their teens.
b. The National Center for Health Statistics reports that the mean systolic blood pressure for males 35 to 44 years of age is 128 and the standard deviation is 15. The medical director of a large company looks at the medical records of 72 executives in this age group and finds that the mean systolic blood pressure for these executives is 126.07.
Example #2: Suppose you have a population in which 60% of the people approve of gambling.
You want to take many samples of size 10 from this population to observe how the sample proportion who approve of gambling varies in repeated samples.
b. Describe the design of a simulation using the partial random digits table below to estimate the sample proportion who approve of gambling. Label how you will conduct the simulation. Then carry out five trials of your simulation. What is the average of the sample proportions? How close is it to 60%?
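One possible design for this simulation is sketched below. Because the random digits table is not reproduced here, the sketch substitutes digits from a seeded generator; that substitution, the labeling (digits 0 to 5 mean "approves", digits 6 to 9 mean "does not approve"), and the name `run_trial` are assumptions for illustration, and other valid labelings exist.

```python
import random

# Stand-in for the random digits table, which is not reproduced here:
# a seeded generator produces the digits instead (an assumption for this sketch).
rng = random.Random(2024)


def run_trial(sample_size=10):
    """One trial: read sample_size digits; digits 0-5 mean 'approves' (60%)."""
    digits = [rng.randrange(10) for _ in range(sample_size)]
    approve = sum(1 for d in digits if d <= 5)
    return digits, approve / sample_size


# Carry out five trials, as the exercise asks.
trials = [run_trial() for _ in range(5)]
for digits, p_hat in trials:
    print("".join(map(str, digits)), "-> sample proportion =", p_hat)

average = sum(p_hat for _, p_hat in trials) / len(trials)
print("average of the five sample proportions:", round(average, 2))
```

With only five trials of size 10, the average can land noticeably above or below 0.60; that gap is the sampling variability the exercise asks about.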
c. The sampling distribution of $\hat{p}$ is the distribution of $\hat{p}$ from all possible SRSs of size 10 from this population. What would be the mean of this distribution if this process were repeated 100 times?
d. If you used samples of size 20 instead of size 10, which sampling distribution would give you a better estimate of the true proportion of people who approve of gambling? Explain your answer. e. Make a histogram of the sample distribution. Describe the graph.
Sampling Proportions
Sampling Distribution of $\hat{p}$: If our sample is an SRS (simple random sample) of size n, then the following statements describe the sampling model for $\hat{p}$: (1) The mean is exactly p. (2) The standard deviation is $\sqrt{p(1-p)/n}$.
Sampling Distribution of a Sample Proportion: (1) The mean is exactly _________. (2) The standard deviation is $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.
Rule of Thumb #1: You can only use $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$ if the population is ________ the ___________ __________, that is, when $N \ge 10n$. A census should be impractical!
Rule of Thumb #2: Only use the _____________ approximation of the sampling distribution of $\hat{p}$ when: $np \ge 10$ and $n(1-p) \ge 10$
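The two rules of thumb can be combined into one small check. This sketch assumes the common textbook thresholds (population at least 10 times the sample size; np and n(1 − p) both at least 10); the function name `sampling_model_for_p_hat` and the example numbers are hypothetical.

```python
import math


def sampling_model_for_p_hat(p, n, population_size):
    """Summarize the sampling model for the sample proportion.

    Returns the mean, the standard deviation (None if the population is
    smaller than 10n), and whether the Normal approximation is reasonable.
    """
    mean = p
    # Rule of Thumb #1: the formula needs a population at least 10 times the sample.
    sd_formula_ok = population_size >= 10 * n
    sd = math.sqrt(p * (1 - p) / n) if sd_formula_ok else None
    # Rule of Thumb #2: expected successes and failures are both at least 10.
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok


# Hypothetical numbers: p = 0.6, samples of 10 vs. 100 from a population of 50,000.
print(sampling_model_for_p_hat(0.6, 10, 50_000))
print(sampling_model_for_p_hat(0.6, 100, 50_000))
```

With these numbers, a sample of 10 fails the second rule (only about 6 expected successes), while a sample of 100 satisfies both.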
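Conclusion: If p is the population proportion and $\hat{p}$ is the sample proportion from an SRS of size n, then $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$, and the sampling distribution of $\hat{p}$ is approximately Normal ONLY if $np \ge 10$ and $n(1-p) \ge 10$.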
So, to calculate a z-score for $\hat{p}$: $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
Example #1: Suppose you are going to roll a fair six-sided die 60 times and record $\hat{p}$, the proportion of times that a 1 or a 2 is showing.
a. Where should the distribution of the 60 $\hat{p}$-values be centered?
b. What is the standard deviation of the sampling distribution of $\hat{p}$, the proportion of all rolls of the die that show a 1 or a 2 out of the 60 rolls?
c. Describe the shape of the sampling distribution of $\hat{p}$. Justify your answer.
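The arithmetic behind the die example can be checked with a few lines of code. This is one way to work the numbers, assuming the standard results for a sample proportion (mean p, standard deviation √(p(1 − p)/n)) and the usual "at least 10 expected successes and failures" check for shape; the variable names are illustrative.

```python
import math

# Rolling a fair die: P(1 or 2) = 2/6, and the die is rolled n = 60 times.
p = 2 / 6
n = 60

center = p                               # mean of the sampling distribution
spread = math.sqrt(p * (1 - p) / n)      # standard deviation of the sample proportion

# Shape check: expected counts of successes (about 20) and failures (about 40).
expected_successes = n * p
expected_failures = n * (1 - p)
approximately_normal = expected_successes >= 10 and expected_failures >= 10

print(f"center = {center:.4f}")
print(f"standard deviation = {spread:.4f}")
print(f"approximately Normal? {approximately_normal}")
```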
Example #2: According to government data, 22% of American children under the age of 6 live in households with incomes less than the official poverty level. A study of learning in early childhood chooses an SRS of 300 children.
a. What is the probability that more than 20% of the sample are from poverty households?
b. How large a sample would be needed to guarantee that the standard deviation of $\hat{p}$ is no more than 0.01? Explain.
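Both parts of this example can be worked numerically. The sketch below is one possible approach, not the official solution: it assumes the Normal approximation is appropriate (here np = 66 and n(1 − p) = 234), computes the Normal probability with the error function instead of a table, and uses exact fractions for part b to avoid rounding trouble. The helper name `normal_cdf` is an assumption.

```python
import math
from fractions import Fraction


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


p = 0.22          # population proportion from the problem statement
n = 300           # sample size from the problem statement

# Part a: P(sample proportion > 0.20) using the Normal approximation.
sd = math.sqrt(p * (1 - p) / n)
z = (0.20 - p) / sd
prob_more_than_20 = 1 - normal_cdf(z)
print(f"sd = {sd:.4f}, z = {z:.2f}, probability is about {prob_more_than_20:.3f}")

# Part b: smallest n with sqrt(p(1 - p)/n) <= 0.01, i.e. n >= p(1 - p) / 0.01^2.
# Exact fractions keep the boundary case from being pushed up by float rounding.
p_exact = Fraction(22, 100)
target_sd = Fraction(1, 100)
required_n = math.ceil(p_exact * (1 - p_exact) / target_sd**2)
print(f"required sample size: {required_n}")
```

A table-based answer for part a may differ slightly in the last decimal place, because z is usually rounded to two decimals before the lookup.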