1 The (“Sampling”) Distribution for the Sample Mean*

Presentation transcript:

1 The (“Sampling”) Distribution for the Sample Mean*

2 Distribution of Sample Means A quantitative population of N units with parameters: mean μ, standard deviation σ. A random sample of n units from the population. Statistic: the sample mean x̄.

3 Distribution of Sample Means Statistic: the sample mean x̄. This statistic is an unbiased point estimate (correct on average) of the parameter μ.

4 20 Times Rule / 5% Rule (same thing) If the population size (N) is at least 20 times the sample size (n), that is, N / n ≥ 20 or n / N ≤ 0.05, then the standard deviation of the sample mean is (essentially) σ/√n.

5 Distribution of the Sample Mean Given: A variable with population that is not Normally distributed with mean μ and standard deviation σ. A random sample of size n. Result: The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n, when the population size is at least 20 times n.

6 Example Rolls of paper leave a factory with weights that are Normal with mean μ = 1493 lbs and standard deviation σ = 12 lbs.

7 Finding probabilities What is the probability a roll weighs over 1500 lbs? ANS: Z = (1500 − 1493)/12 ≈ 0.58, so the probability is about 0.28 (about 28% of rolls exceed 1500 lbs).
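The single-roll probability can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not from the slides.

```python
from statistics import NormalDist

# Single-roll weights: Normal with mean 1493 lbs and SD 12 lbs (from the slide).
roll_weight = NormalDist(mu=1493, sigma=12)

# Z score for a 1500 lb roll.
z_single = (1500 - 1493) / 12

# P(weight > 1500) is the upper-tail area beyond 1500.
p_single_over_1500 = 1 - roll_weight.cdf(1500)

print(f"z = {z_single:.2f}, P(weight > 1500) = {p_single_over_1500:.4f}")
```

The printed probability should agree with the slide's "about 28%".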

8 New Question A truck transports 8 rolls at a time. The legal weight limit for the truck is 12,000 lbs. What is the probability 8 rolls have total weight exceeding this limit? Since 12000/8 = 1500, the question could also be phrased: What is the probability 8 rolls have (sample) mean weight exceeding 1500? The bad news: The answer is not the single-roll probability of about 0.28. The good news: It’s not that tough.

9 Distribution of the Sample Mean Given: A variable with population that is Normally distributed with mean μ and standard deviation σ. A random sample of size n. (N/n ≥ 20) Result: The sample mean has Normal distribution with mean μ and standard deviation σ/√n. (Review of previous slide.)

10 Example - continued Rolls (single rolls) of paper leave a factory with weights that are Normal with mean μ = 1493 lbs and standard deviation σ = 12 lbs. If n = 8 rolls are randomly selected, what is the probability their sample mean weight exceeds 1500? The distribution of sample means is Normal.

11 Finding probabilities Find the probability the sample mean is over 1500 lbs. Here we’re using the same mean, but a standard deviation reduced to 12/√8 ≈ 4.24. ANS: Z = (1500 − 1493)/4.24 ≈ 1.65, so the probability is about 0.0495.

12 Interpreting the Result The probability the sample mean for 8 rolls exceeds 1500 lbs is 0.0495. For 4.95% of all possible samples of 8 rolls, the sample mean exceeds 1500 lbs. Equivalent: There is a 0.0495 probability that the total weight will exceed 8 × 1500 = 12,000 lbs. We’re working towards using the sample mean as an estimate of the population mean.
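The truck calculation can be sketched in a few lines. This assumes only what the slides state (Normal roll weights with mean 1493 and standard deviation 12, samples of n = 8); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1493, 12, 8

# Standard deviation of the sample mean: sigma / sqrt(n).
sd_of_mean = sigma / sqrt(n)

# The sample mean of 8 rolls is Normal with the same mean and the smaller SD.
mean_of_8 = NormalDist(mu=mu, sigma=sd_of_mean)

# P(sample mean > 1500), equivalently P(total weight > 12,000 lbs).
p_mean_over_1500 = 1 - mean_of_8.cdf(1500)

print(f"SD of sample mean = {sd_of_mean:.2f}")
print(f"P(sample mean > 1500) = {p_mean_over_1500:.4f}")
```

The only change from the single-roll calculation is the standard deviation passed to `NormalDist`.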

13 The Picture Weights of single rolls. Sample mean weights for samples of 8 rolls.

14 The Picture About 28% of all rolls are > 1500 lbs

15 The Picture About 5% of all samples of 8 rolls have mean > 1500 lbs

16 Example Survival times have a right skewed distribution with mean μ = 13 months and standard deviation σ = 12 months. What can we say about the distribution of sample mean survival times for samples of n patients? As n gets larger, the distribution gets closer to Normal.

17 Single values: SD = 12.0. Sample mean, n = 4: SD = 6.0. Sample mean, n = 16: SD = 3.0. Sample mean, n = 64: SD = 1.5.
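These standard deviations follow from dividing σ = 12 by √n. A small sketch of the arithmetic (names are illustrative):

```python
from math import sqrt

sigma = 12.0  # SD of single survival times, in months (from the slide)

# SD of the sample mean for each sample size; n = 1 is a single value.
sd_of_mean = {n: sigma / sqrt(n) for n in (1, 4, 16, 64)}

for n, sd in sd_of_mean.items():
    print(f"n = {n:2d}: SD of sample mean = {sd:.1f}")
```

Each fourfold increase in n halves the standard deviation of the sample mean.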

18 Distribution of the Sample Mean Given: A variable with population that is not Normally distributed with mean μ and standard deviation σ. A random sample of size n. Result: The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n. Assume the population size is at least 20 times n.

19 Distribution of the Sample Mean Given: A variable with population that is not Normally distributed with mean μ and standard deviation σ. A random sample of size n. Result: The sample mean has generally unknown distribution with mean μ and standard deviation σ/√n.

20 Distribution of the Sample Mean Given: A variable with population that is not Normally distributed with mean μ and standard deviation σ. A random sample of size n, where n is sufficiently large. Result: The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n. This is the Central Limit Theorem (CLT).

21 What is “Sufficiently Large?” Your book says “generally n at least 30.” If the population is fairly symmetric without outliers, considerably less than 30 will do the trick. If the population is highly skewed, or not unimodal, considerably more than 30 may be required. If the population is Normal then sample size is not a concern: The sample mean is Normal. You may use the “30” rule if you recognize that it’s not that black and white, and that for Normal populations, n = 1 is “sufficiently large.”
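A simulation can illustrate how "sufficiently large" depends on skew. The sketch below assumes a hypothetical right-skewed population: a gamma distribution chosen only so that its mean is 13 and its standard deviation is 12, matching the survival-time example. The slides do not specify the actual distribution, so the gamma model, the seed, and the function names are all assumptions.

```python
import random
from statistics import fmean, pstdev

# Hypothetical right-skewed population (an assumption, not from the slides):
# gamma distribution with mean 13 and SD 12.
MEAN, SD = 13.0, 12.0
SHAPE = (MEAN / SD) ** 2   # gamma shape parameter
SCALE = SD ** 2 / MEAN     # gamma scale parameter


def skewness(values):
    """Skewness (mean cubed z-score); 0 for a symmetric distribution."""
    m = fmean(values)
    s = pstdev(values, mu=m)
    return fmean(((v - m) / s) ** 3 for v in values)


def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n and return their sample means."""
    return [
        fmean(rng.gammavariate(SHAPE, SCALE) for _ in range(n))
        for _ in range(reps)
    ]


rng = random.Random(0)
results = {}
for n in (1, 4, 16, 64):
    means = simulate_sample_means(n, 5000, rng)
    results[n] = (pstdev(means), skewness(means))
    print(f"n = {n:2d}: SD = {results[n][0]:5.2f}, skewness = {results[n][1]:.2f}")
```

Under this assumed model the standard deviation of the sample means should track 12/√n, and the skewness should shrink toward 0 as n grows, which is the sense in which the distribution "gets closer to Normal."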

22 Example The Census Bureau reports the average age at death for female Americans is 79.7 years, with standard deviation 14.5 years: μ = 79.7 years, σ = 14.5 years. What can we say about the distribution of sample means for samples of size 7? It has mean 79.7 years. It has standard deviation 14.5/√7 ≈ 5.48 years. Is the distribution Normal?

23 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. Within 1 s.d.:

24 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95)

25 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95) ≈ 68%

26 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95) ≈ 68%. Within 2 s.d.s: (50, 110) ≈ 95%

27 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95) ≈ 68%. Within 2 s.d.s: (50, 110) ≈ 95%. Above 110:

28 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95) ≈ 68%. Within 2 s.d.s: (50, 110) ≈ 95%. Above 110: ≈ 2.5%

29 Example Distribution of longevity: μ ≈ 80, σ ≈ 15. If Normal: Within 1 s.d.: (65, 95) ≈ 68%. Within 2 s.d.s: (50, 110) ≈ 95%. Above 110: ≈ 2.5%. 1 in 40 ??? No way! The distribution is not Normal.
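The "1 in 40" figure comes from the 68/95 rule of thumb. As a rough check, the sketch below computes the exact tail area for a Normal model with mean 80 and standard deviation 15 (the rounded values used on these slides); the variable names are illustrative.

```python
from statistics import NormalDist

# What a Normal model with mean 80 and SD 15 would imply for longevity.
longevity_if_normal = NormalDist(mu=80, sigma=15)

# Share of women who would live past 110 (2 SDs above the mean).
p_above_110 = 1 - longevity_if_normal.cdf(110)

print(f"P(age at death > 110) under the Normal model = {p_above_110:.4f}")
```

The exact tail is a little under the rule-of-thumb 2.5%, but the conclusion is the same: a Normal model would put far too many women above age 110.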

30 Example The Normal shouldn’t be used here (why not?)

31 Example The Normal shouldn’t be used here (why not?) The distribution of age at death is not Normal. It is quite left skewed. The sample size is not sufficiently large. (At least 30 by your book, although for this situation your instructor would probably buy into as low as 20.) So the Central Limit Theorem can’t be applied. The sample mean doesn’t have approximate Normal distribution.

32 Example What can we say about the distribution of sample means for samples of size 7? It has mean 79.7 years. It has standard deviation 14.5/√7 ≈ 5.48 years. Is the distribution Normal? NO!

33 Example μ = 79.7 years, σ = 14.5 years. I looked at a few recent obituaries in the Oswego Daily News (online):

34 Example This sample has mean 71.0, a difference of 8.7 from 79.7. Can we compute a Z score for 71.0? Should we? Why not? Z = (71.0 − 79.7)/5.48 = −8.7/5.48 = −1.59. This suggests 71.0 (8.7 below 79.7) is somewhat, but not extremely, unusually low: 71.0 is 1.59 standard deviations below 79.7.

35 Example Should we use the Table to obtain probabilities from Z scores (such as our Z = –1.59)? NO. If not, how could we get the probability of a result within 8.7 of 79.7? Using either (a) a huge database of longevities: simulate many (all possible) samples of size 7 and determine what proportion of samples give a mean no more than 8.7 from 79.7; or (b) a mathematical model for the longevities: either determine the model for sample means using calculus, or approximate it using numerical methods. Preferred method: much more compact; faster to work with; essentially identical results.
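The simulation approach can be sketched as follows. The real longevity data are not available here, so this uses a hypothetical left-skewed model: a ceiling age of 110 minus a gamma-distributed gap, with parameters chosen only so that the mean is 79.7 and the standard deviation is 14.5. The ceiling, the gamma model, the seed, and the function names are all assumptions for illustration, and the resulting proportion depends on them.

```python
import random
from statistics import fmean

MU, SIGMA = 79.7, 14.5   # population mean and SD from the slides
CEILING = 110.0          # hypothetical upper age, chosen for illustration

# Gap below the ceiling is gamma distributed with mean (CEILING - MU) and
# SD SIGMA, which makes the longevities left skewed. (The model can, very
# rarely, produce unrealistic values; it is only a stand-in.)
GAP_MEAN = CEILING - MU
SHAPE = (GAP_MEAN / SIGMA) ** 2
SCALE = SIGMA ** 2 / GAP_MEAN


def draw_longevity(rng):
    """One age at death from the hypothetical left-skewed model."""
    return CEILING - rng.gammavariate(SHAPE, SCALE)


def proportion_within(n, distance, reps, rng):
    """Proportion of simulated samples whose mean is within `distance` of MU."""
    hits = 0
    for _ in range(reps):
        sample_mean = fmean(draw_longevity(rng) for _ in range(n))
        if abs(sample_mean - MU) <= distance:
            hits += 1
    return hits / reps


rng = random.Random(1)
p_within = proportion_within(n=7, distance=8.7, reps=20000, rng=rng)
print(f"Proportion of samples of 7 with mean within 8.7 of {MU}: {p_within:.3f}")
```

With a real database of longevities, `draw_longevity` would be replaced by a random draw from the recorded ages.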

36 Example What is the distribution of the sample mean of samples of size n = 48? Even though age at death is left skewed, with n = 48 (large enough) the Central Limit Theorem applies, and the sample mean has approximate Normal distribution.

37 Example I looked at 41 more recent obituaries (total of 48): more data.

38 Example Mean Median Mode

39 Example Means for samples of 48 U.S. longevities: approximately Normal. My sample: the sample mean, 77.52, is (79.7 – 77.52) = 2.18 from the population mean. What is the probability that a random sample of 48 U.S. women’s deaths gives a sample mean within 2.18 of 79.7? 2.18 below 79.7 is 77.52; 2.18 above 79.7 is 81.88.

40 Example Below or above: Z = ±2.18/2.08 = ±1.04. Probability = (area below Z = 1.04) − (area below Z = −1.04) = 0.704

41 Example Find the probability that a random sample of 48 U.S. women’s deaths gives a sample mean within 2.18 of 79.7. Probability = 0.704. About 30% (that’s almost 1 in 3) of all samples of 48 deaths give a sample mean more than 2.18 from 79.7.
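With n = 48 the Normal approximation is reasonable, so the probability can be computed directly. This sketch uses the rounded values μ = 79.7 and σ = 14.5 from the slides; small differences from the slide's 0.704 are to be expected from rounding. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 79.7, 14.5, 48
distance = 2.18  # how far the Oswego sample mean (77.52) is from 79.7

# Approximate Normal model for the sample mean, by the Central Limit Theorem.
sd_of_mean = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu=mu, sigma=sd_of_mean)

# Probability the sample mean lands between 77.52 and 81.88.
p_within = sample_mean_dist.cdf(mu + distance) - sample_mean_dist.cdf(mu - distance)
p_beyond = 1 - p_within

print(f"SD of sample mean = {sd_of_mean:.2f}")
print(f"P(within 2.18 of 79.7)    = {p_within:.3f}")
print(f"P(more than 2.18 from it) = {p_beyond:.3f}")
```

The second probability is the "about 30%" quoted on the slide.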

42 Example Give two explanations that account for the 2.18 year difference between the data on Oswego longevity (which were lower on average) and the U.S. longevity parameter of 79.7. Explanation 1: Women in Oswego do not live as long on average as they do nationwide. That is: μ_Oswego < 79.7

43 Example Give two explanations that account for the 2.18 year difference between the data on Oswego longevity (which were lower on average) and the U.S. longevity parameter of 79.7. Explanation 2: Sampling variability (sampling “error”): μ_Oswego = 79.7. About 30% of all samples of 48 women yield a mean 2.18 or more from 79.7. That isn’t so uncommon. Our data aren’t very inconsistent with the national result.

44 Sampling Without Replacement What to do if the sample size is more than 5% of the population size… N = population size, n = sample size. The 20 Times / 5% Rule: N / n ≥ 20, or n / N ≤ 0.05.

45 Distribution of Sample Means The distribution of the sample mean has > mean (“unbiased”) > standard deviation > shape closer to Normal (but not necessarily Normal)

46 Word Lengths – Gettysburg Address N = 268 words:Mean length  = Standard Deviation  = Not Normal. Right skewed. Can’t use Table A2.

47 Distribution of Sample Means: n = 5 Sample means from samples of size n = 5 have > mean > standard deviation > shape closer to Normal (but not Normal – a bit right skewed)

48 Distribution of Sample Means : n = 5 The mean of this distribution is The standard deviation of this distribution is The shape is close to Normal (but not Normal – there’s right skew).

49 Distribution of Sample Means : n = 10 Sample means from samples of size n = 10 have > mean > standard deviation > shape closer to Normal (but not exactly Normal – a bit right skewed)

50 The mean of this distribution is The standard deviation of this distribution is The shape is quite close to Normal (just a little right skew – not enough to fuss over). Distribution of Sample Means : n = 10

51 n = 5 n = 10 Awful close to  1

52 n = 5 n = 10 Almost the same.

53 n = 100 Not so close to  1 Not almost the same.
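The formulas on the last few slides did not survive in this transcript. One plausible reading, offered here as an assumption, is that the quantity "close to 1" is the standard finite population correction factor √((N − n)/(N − 1)), which multiplies σ/√n when sampling without replacement. The sketch below evaluates that factor for the N = 268 words of the Gettysburg Address; the function name is illustrative.

```python
from math import sqrt


def finite_population_correction(N, n):
    """Standard correction factor for sampling n of N units without replacement."""
    return sqrt((N - n) / (N - 1))


N = 268  # words in the Gettysburg Address (from the slides)
factors = {n: finite_population_correction(N, n) for n in (5, 10, 100)}

for n, factor in factors.items():
    share = n / N
    print(f"n = {n:3d}: n/N = {share:.3f}, correction factor = {factor:.3f}")
```

For n = 5 and n = 10 the sample is under 5% of the population and the factor is very near 1, so σ/√n alone is adequate. For n = 100 the factor is well below 1 and the correction matters.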

54 Distribution of the Sample Mean Given: A variable with population that is distributed with mean μ and standard deviation σ (PARAMETERS). A random sample of size n. Results 1 and 2: The sample mean (STATISTIC) has distribution with the same mean and a smaller standard deviation.

55 Distribution of the Sample Mean Given: A variable with population that is distributed with mean μ and standard deviation σ. A random sample of size n. Result 3: The sample mean has distribution with a shape that is closer to Normal.