Sampling Distributions


Statistics vs. Parameters: A statistic is a numerical value computed from a sample. A parameter is a numerical value associated with a population. We would like to know the parameter, but in most cases the population is too large to measure directly, so we estimate the parameter with an appropriate statistic computed from the sample.

Quick Review: p = population proportion; p-hat = sample proportion; μ = population mean; x-bar = sample mean. Empirical rule: for variables with a normal (bell-shaped) distribution, ~68% of the values fall within +/- 1 standard deviation of the mean, and ~95% of the values fall within +/- 2 standard deviations of the mean.

Sampling Distribution of the Sample Proportion. Situation 1: A survey is undertaken to determine the proportion of PSU students who engage in under-age drinking. The survey asks 200 randomly selected under-age students (assume no problems with bias). Suppose the true population proportion of those who drink is 60%, or p = .6, and let p-hat be the proportion in the sample who drink.

Repeated Samples: Imagine repeating this survey many times, and each time we record the sample proportion of those who have engaged in under-age drinking. What would the sampling distribution of p-hat look like? [Table: samples 1, 2, 3, 4, 5, … 150,000, each of size n = 200, listed with their sample proportions.] p-hat is a random variable assigning a value to each sample!
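The repeated-survey idea can be tried out with a short simulation. This is only a sketch: it uses 2,000 repetitions rather than the 150,000 on the slide (an assumption made to keep the run short), and the helper name `sample_proportion` is made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 200      # students per survey (from the slide)
p = 0.6      # true population proportion (from the slide)
reps = 2000  # assumption: far fewer than the slide's 150,000 repetitions

def sample_proportion(n, p):
    """Simulate one survey and return the proportion of 'yes' answers."""
    yes = sum(1 for _ in range(n) if random.random() < p)
    return yes / n

# One p-hat per simulated survey
p_hats = [sample_proportion(n, p) for _ in range(reps)]

mean_p_hat = statistics.mean(p_hats)
sd_p_hat = statistics.stdev(p_hats)
print(f"mean of p-hat:           {mean_p_hat:.4f}")
print(f"std. deviation of p-hat: {sd_p_hat:.4f}")
```

The simulated values of p-hat should cluster around p = .6, which is what the histogram of the sampling distribution displays.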

Histogram of p-hat for 150k samples.

Sampling Distribution of p-hat Derived from the Binomial Distribution: Let X be the number of respondents who say they engage in under-age drinking. What is the probability distribution of X? X is binomial with n = 200 and p = .6, so we can calculate the probability of each possible outcome (0-200). [Plot: binomial probabilities for X = 0 to 200.]

Sampling Dist. of p-hat: Since p-hat is simply X/n, the sampling distribution of p-hat is the binomial distribution of X with its values divided by n. That is, P(p-hat = k/n) = P(X = k) for k = 0, 1, ..., n.
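As a rough illustration of this relationship, the exact distribution of p-hat for Situation 1 can be tabulated from the binomial probabilities. The variable names are illustrative only.

```python
import math

n, p = 200, 0.6  # Situation 1 values

# P(X = k) for X ~ Binomial(n, p); the value p-hat = k/n has the same probability.
dist = [(k / n, math.comb(n, k) * p**k * (1 - p) ** (n - k)) for k in range(n + 1)]

total = sum(prob for _, prob in dist)                            # should be 1
mean = sum(x * prob for x, prob in dist)                         # should equal p
sd = math.sqrt(sum((x - mean) ** 2 * prob for x, prob in dist))  # spread of p-hat

print(f"total probability: {total:.6f}")
print(f"mean of p-hat:     {mean:.4f}")
print(f"sd of p-hat:       {sd:.4f}")
```

The mean and standard deviation computed this way are exact, not simulated, so they can be compared directly with p and with √(p(1-p)/n).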

Normal Approximation for Sample Proportions: The sampling distribution of p-hat is approximately normal with mean p and standard deviation √(p(1-p)/n) if the following conditions are satisfied: (1) A random sample is selected from the population. Even if the sample is not perfectly random, it will be okay as long as it is free from bias. (2) The sample must be large enough: np and n(1-p) MUST both be greater than 5, and should be greater than 10.
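One way to get a feel for how good the approximation is: check the sample-size condition for Situation 1, then compare an exact binomial probability with its normal counterpart. The cutoff of 110 "yes" answers (p-hat = .55) is a hypothetical value chosen only for illustration, and no continuity correction is applied.

```python
import math
from statistics import NormalDist

n, p = 200, 0.6  # Situation 1 values

# Sample-size condition from the slide (using the stricter threshold of 10)
conditions_met = n * p > 10 and n * (1 - p) > 10

# Normal approximation for p-hat: mean p, standard deviation sqrt(p(1-p)/n)
sd = math.sqrt(p * (1 - p) / n)
approx = NormalDist(mu=p, sigma=sd)

k_cut = 110         # hypothetical cutoff: at most 110 of 200 say yes
cutoff = k_cut / n  # the same cutoff expressed as a proportion (0.55)

# Exact P(X <= k_cut) from the binomial distribution
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_cut + 1))
# Normal approximation to P(p-hat <= cutoff), no continuity correction
normal = approx.cdf(cutoff)

print(f"conditions met: {conditions_met}")
print(f"exact binomial: {exact:.4f}")
print(f"normal approx.: {normal:.4f}")
```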

Example: Problem 9.11 Recent studies have shown that about 20% of American adults fit the medical definition of being obese. A large medical clinic would like to estimate what percent of their patients are obese, so they take a random sample of 100 patients and find that 18 percent are obese. Suppose in truth, the same percentage holds for the patients of the medical clinic as for the general population, 20%. Give a numerical value of each of the following….

Problem 9.11 Cont.: The population proportion of obese patients in the medical clinic: p = .2. The proportion of obese patients in the sample of 100 patients: p-hat = 18/100 = 0.18. The standard error of p-hat: √(p-hat(1 - p-hat)/n) = √(.18 × .82/100) = 0.0384. The mean of the sampling distribution of p-hat: p = .2. The standard deviation of the sampling distribution of p-hat: √(p(1-p)/n) = √(.2 × .8/100) = .04.
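The arithmetic for Problem 9.11 can be reproduced in a few lines. The point of the sketch is the distinction between the standard error, which plugs in the sample value p-hat, and the standard deviation, which uses the true p.

```python
import math

n = 100           # patients sampled
p = 0.20          # assumed true proportion for the clinic
p_hat = 18 / 100  # observed sample proportion

# Standard error: uses p-hat, because p is normally unknown
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
# Standard deviation of the sampling distribution: uses the true p
standard_deviation = math.sqrt(p * (1 - p) / n)

print(f"standard error:     {standard_error:.4f}")
print(f"standard deviation: {standard_deviation:.4f}")
```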

Sampling Distribution of the Sample Mean. Situation 2: The height of women age 20 to 30 is normally distributed (bell-shaped) with a mean of 65 inches and a standard deviation of 3 inches. A random sample of 200 women was taken and the sample mean recorded. Now IMAGINE taking MANY samples of size 200 from the population of women. For each sample we record x-bar. What is the sampling distribution of x-bar?

Histograms for the Distribution of X and X-Bar. Original population of women: X = height of a random woman. Distribution of sample means: X-bar = mean of a random sample of size 200.

For Normal Data: Consider a normally distributed random variable X with mean μ and standard deviation σ. The sampling distribution of the sample mean for samples of size n is normal with mean μ and standard deviation σ/√n. What about for skewed or non-normal data?
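Before turning to skewed data, the normal case from Situation 2 can be checked by simulation. This is a sketch under stated assumptions: 2,000 repeated samples is an arbitrary choice, and the comparison value σ/√n is the standard result for the spread of a sample mean.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 65, 3  # height of women age 20 to 30, in inches (Situation 2)
n = 200            # women per sample
reps = 2000        # assumption: number of repeated samples

# Draw many samples of size n and record each sample mean (x-bar)
x_bars = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.mean(x_bars)
sd_of_means = statistics.stdev(x_bars)
theory_sd = sigma / math.sqrt(n)

print(f"mean of x-bar: {mean_of_means:.3f}")
print(f"sd of x-bar:   {sd_of_means:.3f} (sigma/sqrt(n) = {theory_sd:.3f})")
```

The sample means vary far less than individual heights do, which is why the histogram of X-bar is so much narrower than the histogram of X.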

CD Data from the Class Survey. Situation 3: The CD data are clearly right skewed. Suppose our population looked something like this. Let us take repeated samples from this population and see what the distribution of the sample mean looks like.

Suppose we take repeated samples of size 4, 8, 16, and 32. [Histograms of the sample means, one panel each for n = 4, n = 8, n = 16, and n = 32.]

Statistics From Skewed Data: Using that CD sample as the population, µ = 87.6, σ = 87.8. The sample means from the previous slide had the following summary statistics:

Sample Size    Mean    Std. Deviation
N = 4          86.6    43.2
N = 8          86.8    30.9
N = 16         86.7    21.9
N = 32                 15.6

Note that the mean remains roughly constant, and the std. deviation decreases as the sample size increases!
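The shrinking standard deviations in the table can be compared with σ/√n, the standard formula for the spread of a sample mean. A small sketch using the σ and the observed values from the slide:

```python
import math

sigma = 87.8  # population standard deviation from the slide

# Observed std. deviations of the sample means, by sample size (from the table)
observed = {4: 43.2, 8: 30.9, 16: 21.9, 32: 15.6}

# Predicted spread of the sample mean for each sample size
predicted = {n: sigma / math.sqrt(n) for n in observed}

for n in observed:
    print(f"n = {n:2d}: observed {observed[n]:5.1f}, sigma/sqrt(n) = {predicted[n]:5.1f}")
```

The observed and predicted values agree closely, even though the population is strongly skewed.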

Conclusions and Conditions for the Sample Mean: For non-normal data, the sampling distribution of the sample mean is approximately normal with mean μ and standard deviation σ/√n. Condition: this holds if the sample size is large enough; n greater than 30 is usually sufficient.
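A simulation gives a feel for this result on skewed data. The class CD data are not available here, so this sketch substitutes an exponential population with mean 87.6 as a stand-in. That is purely an assumption for illustration: it is right skewed and its standard deviation equals its mean, which happens to be close to the slide's σ = 87.8.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

pop_mean = 87.6  # stand-in population: exponential, so its sd is also 87.6
reps = 2000      # assumption: number of repeated samples per sample size

sd_by_n = {}
for n in (4, 8, 16, 32):
    # Sample means of n draws from the right-skewed stand-in population
    means = [
        statistics.mean(random.expovariate(1 / pop_mean) for _ in range(n))
        for _ in range(reps)
    ]
    sd_by_n[n] = statistics.stdev(means)
    print(f"n = {n:2d}: sd of sample means = {sd_by_n[n]:5.1f}, "
          f"sigma/sqrt(n) = {pop_mean / math.sqrt(n):5.1f}")
```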

What next? We have shown that the sampling distribution of the sample proportion and the sampling distribution of the sample mean are both approximately normal under certain conditions. Now we can use what we know about normal distributions to draw conclusions about p-hat and x-bar! Situation 4 demonstrates how to use the sampling distribution of p-hat to draw conclusions.

Situation 4: A certain antibiotic is known to cure 85% of strep bacteria infections. A scientist wants to make sure the drug does not lose its potency over time. He treats 100 strep patients with a 1-year-old supply of the antibiotic. Let p-hat be the proportion of individuals who are cured. ASSUME the drug has NOT lost potency, and answer the following questions: (1) What is the sampling distribution of p-hat? (2) If we repeated this study many times, we would expect 95% of the p-hat values to fall within what interval? (3) What is the probability that more than 90% in the sample are cured? (4) Suppose the scientist observed a cure rate of only 75%. Would he be justified in concluding the 1-year-old drug is less effective?

1. What is the sampling distribution of p-hat? Since both np = 85 and n(1-p) = 15 are greater than 10, and if we assume the sample is random/representative, the sampling distribution of p-hat is approximately normal with mean p = .85 and standard deviation √(p(1-p)/n) = √(.85 × .15/100) = .036.

2. If we repeated this study many times, we would expect 95% of the p-hat values to fall within what interval? The empirical rule states that for a normally distributed variable ~95% of the values fall within +/- 2 standard deviations of the mean. So 95% of the p-hat values should fall within .85 +/- 2 × .036; that is, there is about a 95% probability that the proportion cured will be between 78% and 92%.
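The interval calculation can be sketched as follows. The code keeps the unrounded standard deviation, so the endpoints can differ from the slide's hand calculation in the third decimal place.

```python
import math

n, p = 100, 0.85  # Situation 4, assuming the drug has not lost potency

# Standard deviation of p-hat
sd = math.sqrt(p * (1 - p) / n)

# Empirical rule: about 95% of values fall within 2 standard deviations of the mean
low = p - 2 * sd
high = p + 2 * sd

print(f"sd of p-hat: {sd:.3f}")
print(f"approximate 95% interval: {low:.2f} to {high:.2f}")
```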

3. What is the probability that more than 90% in the sample are cured? In other words, what is P(p-hat > .9)? First calculate a z-score: Z-score = [value - mean]/StdDev = [.9 - .85]/.036 = 1.4. Then P(p-hat > .9) = P(Z > 1.4) = 1 - P(Z < 1.4) = 1 - .9192 = .0808.
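The same tail probability can be computed without a z-table, using `statistics.NormalDist` from the Python standard library. Because the sketch does not round the standard deviation or the z-score, the result can differ slightly from the table-based .0808.

```python
import math
from statistics import NormalDist

n, p = 100, 0.85  # Situation 4

# Approximate sampling distribution of p-hat
sd = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=sd)

z = (0.90 - p) / sd                       # z-score for a 90% cure rate
prob_above = 1 - sampling_dist.cdf(0.90)  # P(p-hat > .9)

print(f"z-score:       {z:.2f}")
print(f"P(p-hat > .9): {prob_above:.4f}")
```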

4. Suppose the scientist observed a cure rate of only 75%. Would he be justified in concluding the 1-year-old drug is less effective? In other words, assuming the cure rate is actually 85%, what is the chance he would observe a sample proportion less than or equal to 75%? What is P(p-hat ≤ .75)? Z-score = [.75 - .85]/.036 = -2.80, so P(p-hat ≤ .75) = P(Z < -2.80) = .0026. A result this low would be very unlikely if the drug were still 85% effective, so an observed cure rate of 75% would be strong evidence that the drug is less effective. We will see some examples of how to use the sampling distribution of the sample mean in class activities, but the idea is similar.
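For comparison, the lower-tail probability can be computed both from the normal approximation and directly from the binomial distribution. The exact binomial figure is not on the slide; it is added here only as a cross-check.

```python
import math
from statistics import NormalDist

n, p = 100, 0.85  # Situation 4, assuming the drug has not lost potency

# Normal approximation to P(p-hat <= .75)
sd = math.sqrt(p * (1 - p) / n)
normal_prob = NormalDist(mu=p, sigma=sd).cdf(0.75)

# Exact P(X <= 75) for X ~ Binomial(100, 0.85), where X counts cured patients
exact_prob = sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(76)
)

print(f"normal approximation: {normal_prob:.4f}")
print(f"exact binomial:       {exact_prob:.4f}")
```

Both probabilities are small, so either calculation supports the same conclusion about the observed 75% cure rate.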