Sampling Distributions

Slides:



Advertisements
Similar presentations
Chapter 6 Sampling and Sampling Distributions
Advertisements

Terminology A statistic is a number calculated from a sample of data. For each different sample, the value of the statistic is a uniquely determined number.
McGraw-Hill Ryerson Copyright © 2011 McGraw-Hill Ryerson Limited. Adapted by Peter Au, George Brown College.
Chapter 7 Introduction to Sampling Distributions
Chapter 7 Sampling and Sampling Distributions
Sampling Distributions
Part III: Inference Topic 6 Sampling and Sampling Distributions
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution and Other Continuous Distributions.
Chapter 7 ~ Sample Variability
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 7 Sampling Distributions.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 6 Sampling Distributions.
AP Statistics Chapter 9 Notes.
Chap 6-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 6 Introduction to Sampling.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 6-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 7 Sampling Distributions.
Chapter 7: Sample Variability Empirical Distribution of Sample Means.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 7 Sampling Distributions.
Chapter 7 Sampling Distributions Statistics for Business (Env) 1.
Chapter 7 Statistical Inference: Estimating a Population Mean.
Chapter 7: Sampling Distributions Section 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.
 The Sampling Distribution of the Sample Mean  The Sampling Distribution of the Sample Proportion Chapter 6 Sampling Distributions( 样本分布 )
1 CHAPTER 6 Sampling Distributions Homework: 1abcd,3acd,9,15,19,25,29,33,43 Sec 6.0: Introduction Parameter The "unknown" numerical values that is used.
Chapter 5 Sampling Distributions. The Concept of Sampling Distributions Parameter – numerical descriptive measure of a population. It is usually unknown.
Introduction to Inference Sampling Distributions.
Sampling Distributions Chapter 18. Sampling Distributions A parameter is a number that describes the population. In statistical practice, the value of.
Chapter 9 Day 2. Warm-up  If students picked numbers completely at random from the numbers 1 to 20, the proportion of times that the number 7 would be.
Sampling and Sampling Distributions. Sampling Distribution Basics Sample statistics (the mean and standard deviation are examples) vary from sample to.
Sampling Distribution of the Sample Mean
Chapter 6 The Normal Distribution and Other Continuous Distributions
Sampling Distributions
Sampling Distributions Chapter 18
Where Are You? Children Adults.
Sampling and Sampling Distributions
Chapter 6: Sampling Distributions
Chapter 7 Sampling and Sampling Distributions
Chapter 8: Estimating with Confidence
Sampling Distributions
Normal Distribution and Parameter Estimation
Normal Probability Distributions
Chapter 7 Sampling and Sampling Distributions
Sampling Distributions and Estimation
Chapter 6: Sampling Distributions
Sampling Distributions
Sampling Distributions
Chapter 5 Sampling Distributions
Chapter 8: Estimating with Confidence
Chapter 7 Sampling Distributions.
Sampling Distribution of a Sample Proportion
Chapter 5 Sampling Distributions
Continuous Probability Distributions
Descriptive Statistics: Numerical Methods
Chapter 7 Sampling Distributions
Chapter 7 Sampling Distributions.
Chapter 9 Hypothesis Testing.
Chapter 5 Sampling Distributions
Warmup To check the accuracy of a scale, a weight is weighed repeatedly. The scale readings are normally distributed with a standard deviation of
Chapter 7 Sampling Distributions.
Chapter 8: Estimating with Confidence
CHAPTER 15 SUMMARY Chapter Specifics
Sampling Distributions
Chapter 8 Confidence Intervals.
Chapter 7 Sampling Distributions.
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Introduction to Sampling Distributions
Advanced Algebra Unit 1 Vocabulary
Chapter 7 Sampling Distributions.
Chapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence
Chapter 5: Sampling Distributions
Presentation transcript:

Sampling Distributions Chapter 7 Sampling Distributions

Sampling Distributions 7.1 The Sampling Distribution of the Sample Mean 7.2 The Sampling Distribution of the Sample Proportion

Sampling Distribution of the Sample Mean The sampling distribution of the sample mean x is the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N

Example 7.1: The Risk Analysis Case Population of the percent returns from six grand prizes In order, the values of prizes (in $,000) are 10, 20, 30, 40, 50 and 60 Label each prize A, B, C, …, F in order of increasing value The mean prize is $35,000 with a standard deviation of $17,078

Example 7.1: The Risk Analysis Case Continued Any one prize of these prizes is as likely to be picked as any other of the six Uniform distribution with N = 6 Each stock has a probability of being picked of 1/6 = 0.1667

Example 7.1: The Risk Analysis Case #3

Example 7.1: The Risk Analysis Case #4 Now, select all possible samples of size n = 2 from this population of size N = 6 Select all possible pairs of prizes How to select? Sample randomly Sample without replacement Sample without regard to order

Example 7.1: The Risk Analysis Case #5 Result: There are 15 possible samples of size n = 2 Calculate the sample mean of each and every sample

Example 7.1: The Risk Analysis Case #6

Observations Although the population of N = 6 grand prizes has a uniform distribution, … … the histogram of n = 15 sample mean prizes: Seems to be centered over the sample mean return of 35,000, and Appears to be bell-shaped and less spread out than the histogram of individual returns

Example: Sampling All Stocks Population of returns of all 1,815 stocks listed on NYSE for 1987 See Figure on next slide The mean rate of return m was –3.5% with a standard deviation s of 26% Draw all possible random samples of size n=5 and calculate the sample mean return of each Sample with a computer

Example: Sampling All Stocks

Results from Sampling All Stocks Observations Both histograms appear to be bell-shaped and centered over the same mean of –3.5% The histogram of the sample mean returns looks less spread out than that of the individual returns Statistics Mean of all sample means: µx = µ = -3.5% Standard deviation of all possible means:

General Conclusions If the population of individual items is normal, then the population of all sample means is also normal Even if the population of individual items is not normal, there are circumstances when the population of all sample means is normal (Central Limit Theorem)

General Conclusions Continued The mean of all possible sample means equals the population mean That is, m = mx The standard deviation sx of all sample means is less than the standard deviation of the population That is, sx < s Each sample mean averages out the high and the low measurements, and so are closer to m than many of the individual population measurements

The Empirical Rule The empirical rule holds for the sampling distribution of the sample mean 68.26% of all possible sample means are within (plus or minus) one standard deviation sx of m 95.44% of all possible observed values of x are within (plus or minus) two sx of m 99.73% of all possible observed values of x are within (plus or minus) three sx of m

Properties of the Sampling Distribution of the Sample Mean If the population being sampled is normal, then so is the sampling distribution of the sample mean, x The mean sx of the sampling distribution of x is mx = m That is, the mean of all possible sample means is the same as the population mean

Properties of the Sampling Distribution of the Sample Mean #2 The variance s2x of the sampling distribution of x is That is, the variance of the sampling distribution of x is Directly proportional to the variance of the population Inversely proportional to the sample size

Properties of the Sampling Distribution of the Sample Mean #3 The standard deviation sx of the sampling distribution of x is That is, the standard deviation of the sampling distribution of x is Directly proportional to the standard deviation of the population Inversely proportional to the square root of the sample size

Notes The formulas for s2x and sx hold if the sampled population is infinite The formulas hold approximately if the sampled population is finite but if N is much larger (at least 20 times larger) than the n (N/n ≥ 20) x is the point estimate of m, and the larger the sample size n, the more accurate the estimate Because as n increases, sx decreases as 1/√n Additionally, as n increases, the more representative is the sample of the population So, to reduce sx, take bigger samples!

Example 7.2: Car Mileage Case Population of all midsize cars of a particular make and model Population is normal with mean µ and standard deviation σ Draw all possible samples of size n Then the sampling distribution of the sample mean is normal with mean µx = µ and standard deviation In particular, draw samples of size: n = 5 n = 50

Effect of Sample Size So, the different possible sample means will be more closely clustered around m if n = 50 than if n =5

Example 7.2: Car Mileage Statistical Inference Recall from Chapter 3 mileage example, x = 31.56 mpg for a sample of size n=50 and s = 0.8 If the population mean µ is exactly 31, what is the probability of observing a sample mean that is greater than or equal to 31.56?

Example 7.2: Car Mileage Statistical Inference #2 Calculate the probability of observing a sample mean that is greater than or equal to 31.6 mpg if µ = 31 mpg Want P(x > 31.5531 if µ = 31) Use s as the point estimate for s so that

Example 7.2: Car Mileage Statistical Inference #3 Then But z = 4.96 is off the standard normal table The largest z value in the table is 3.99, which has a right hand tail area of 0.00003

Probability That x > 31.56 When µ = 31

Example 7.2: Car Mileage Statistical Inference #4 z = 4.96 > 3.99, so P(z ≥ 4.84) < 0.00003 If m = 31 mpg, fewer than 3 in 100,000 of all samples have a mean at least as large as observed Have either of the following explanations: If m is actually 31 mpg, then very unlucky in picking this sample Not unlucky, m is not 31 mpg, but is really larger Difficult to believe such a small chance would occur, so conclude that there is strong evidence that m does not equal 31 mpg Also, m is, in fact, actually larger than 31 mpg

Central Limit Theorem Now consider sampling a non-normal population Still have: mx=m and sx=s/n Exactly correct if infinite population Approximately correct if population size N finite but much larger than sample size n But if population is non-normal, what is the shape of the sampling distribution of the sample mean? The sampling distribution is approximately normal if the sample is large enough, even if the population is non-normal (Central Limit Theorem)

The Central Limit Theorem #2 No matter what is the probability distribution that describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean mx=m and standard deviation sx=s/n Further, the larger the sample size n, the closer the sampling distribution of the sample mean is to being normal In other words, the larger n, the better the approximation

The Central Limit Theorem #3 Random Sample (x1, x2, …, xn) as n  large Sampling Distribution of Sample Mean (nearly normal) Population Distribution (m, s) (right-skewed) X

Example: Effect of Sample Size The larger the sample size, the more nearly normally distributed is the population of all possible sample means Also, as the sample size increases, the spread of the sampling distribution decreases

How Large? How large is “large enough?” If the sample size is at least 30, then for most sampled populations, the sampling distribution of sample means is approximately normal Here, if n is at least 30, it will be assumed that the sampling distribution of x is approximately normal If the population is normal, then the sampling distribution of x is normal regardless of the sample size

Example: Central Limit Theorem Simulation

Unbiased Estimates A sample statistic is an unbiased point estimate of a population parameter if the mean of all possible values of the sample statistic equals the population parameter x is an unbiased estimate of m because mx=m In general, the sample mean is always an unbiased estimate of m The sample median is often an unbiased estimate of m but not always

Unbiased Estimates Continued The sample variance s2 is an unbiased estimate of s2 That is why s2 has a divisor of n – 1 and not n However, s is not an unbiased estimate of s Even so, the usual practice is to use s as an estimate of s

Minimum Variance Estimates Want the sample statistic to have a small standard deviation All values of the sample statistic should be clustered around the population parameter Then, the statistic from any sample should be close to the population parameter

Minimum Variance Estimates Continued Given a choice between unbiased estimates, choose one with smallest standard deviation The sample mean and the sample median are both unbiased estimates of m The sampling distribution of sample means generally has a smaller standard deviation than that of sample medians

Finite Populations If a finite population of size N is sampled randomly and without replacement, must use the “finite population correction” to calculate the correct standard deviation of the sampling distribution of the sample mean If N is less than 20 times the sample size, that is, if N < 20  n Otherwise

Finite Populations Continued The finite population correction is The standard error is

Stock Example In the example of sampling n = 2 stock percent returns from N = 6 stocks Here, N = 3 • n, so to calculate σx, must use the finite population correction Note that the finite population correction is substantially less than one

Sampling Distribution of the Sample Proportion The probability distribution of all possible sample proportions is the sampling distribution of the sample proportion If a random sample of size n is taken from a population then the sampling distribution of the sample proportion is Approximately normal, if n is large Has a mean that equals ρ Has standard deviation Where ρ is the population proportion and p̂ is the sampled proportion

Example 7.4: The Cheese Spread Case Must decide the proportion of customers who would stop buying the spread if a new spout is used is less than 10 percent Randomly selected 1,000 customers Of those, 63 said they would stop p̂ = 0.063 which is less than 10 percent

Example 7.4: The Cheese Spread Case Continued If ρ=0.10, we can assume the sampling distribution of p̂ is normally distributed nρ = 1,000(.10) = 100 > 5 n(1-ρ) = 1,000(.90) = 900 > 5

Example 7.4: The Cheese Spread Case #3 If we believe ρ=0.10, we must believe that we have observed a sample proportion that can be described as a 5 in 100,000 chance It follows that we have extremely strong evidence that ρ does not equal 0.10 and is less than 0.10