Chapter 7 Sampling Distributions Statistics for Business (Env) 1.



Sampling Distributions
7.1 The Sampling Distribution of the Sample Mean
7.2 Central Limit Theorem
7.3 Standard Error and Statistical Inference 2

The sampling process A sample should be representative of the entire population, yet it is not expected to be identical to the population. 3

Sampling distribution Suppose that we draw all possible samples of size n from a given population. Suppose further that we compute a statistic (e.g., a mean, IQR, standard deviation) for each sample. The probability distribution of this statistic is called a sampling distribution. 4

5 Sampling error is the discrepancy, or amount of error, between a sample statistic and its corresponding population parameter. The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

The sampling distribution 6

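Two different questions. Data distribution: $P(X > 70)$, the probability that a single score exceeds 70. Distribution of sample means: $P(\bar{X} > 70)$, the probability that a sample mean exceeds 70. 7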


Variability of a Sampling Distribution The variability of a sampling distribution depends on three factors: – N: The number of objects in the population. – n: The number of objects in the sample. – The way that the random sample is chosen. 9

Sampling without replacement. If the population size is much larger than the sample size, then the sampling distribution has roughly the same sampling error whether we sample with or without replacement (without replacement, a population element can be selected only once). On the other hand, if the sample represents a significant fraction (say, 1/10) of the population size, the sampling error will be noticeably smaller when we sample without replacement. 10

Methods of Probability Sampling. The sampling error is the difference between a sample statistic (e.g. $\bar{X}$) and its corresponding population parameter (e.g. $\mu$). The sampling distribution of the sample mean is the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N. 11

Example 1: A population that consists of only 4 scores: 2, 4, 6, 8. Mean = 5.

13 TABLE 7.1 All the possible samples of n = 2. Notice that the table lists random samples. This requires sampling with replacement, so it is possible to select the same score twice.

14 FIGURE 7.2 The distribution of sample means for n = 2. The distribution shows the 16 sample means from Table 7.1. Mean of the sample means = 5.
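The construction behind Table 7.1 and Figure 7.2 can be sketched in a few lines of Python. This is only an illustration: it assumes the four scores are 2, 4, 6 and 8 (the fourth score is inferred from the stated mean of 5), and the variable names are made up for the sketch.

```python
from itertools import product
from collections import Counter

# Assumed population: the fourth score (8) is inferred from the stated mean of 5.
population = [2, 4, 6, 8]
population_mean = sum(population) / len(population)

# Sampling WITH replacement: every ordered pair of scores is a possible sample.
samples = list(product(population, repeat=2))
sample_means = [sum(s) / len(s) for s in samples]

# Frequency of each sample mean, i.e. the distribution of sample means.
frequency = Counter(sample_means)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("number of samples:", len(samples))
for value in sorted(frequency):
    print(f"sample mean {value}: f = {frequency[value]}")
print("mean of the sample means:", mean_of_sample_means)
print("population mean:", population_mean)
```

Listing ordered pairs is what makes the count 16; treating (2, 4) and (4, 2) as the same sample would change the frequencies.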

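15 Sampling without replacement: $\binom{4}{2} = \frac{4!}{2!\,2!} = 6$ possible samples. [Table: sample number, score #1, score #2, sample mean; frequency distribution of the sample means (X, f).] Mean of the sample means = 5.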

Example 2: A law firm has five partners. At their weekly partners meeting each reported the number of hours they billed clients for their services last week. If two partners are selected randomly, how many different samples are possible? 16

Example 2: Sampling without replacement. 5 objects taken 2 at a time give a total of $\binom{5}{2} = 10$ different samples. 17

Example 2: As a sampling distribution 18

Compute the mean of the sample means and compare it with the population mean. Notice that the mean of the sample means is exactly equal to the population mean. 19

Example 3: Take another population: 3, 6, 9, 12, 15. Population size N = 5, sample size n = 2, mean = 9, variance = 18, SD = $\sqrt{18} \approx 4.24$. The number of possible samples which can be drawn without replacement is $\binom{5}{2} = 10$. 20

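Variance of the sample means (sampling without replacement) = $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} = \frac{18}{2} \cdot \frac{3}{4} = 6.75$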
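As a check on Example 3, the 10 samples can be listed by brute force and the variance of their means compared with the finite-population formula $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$. The sketch below is illustrative only; the helper names `pop_mean` and `pop_variance` are not from the slides.

```python
from itertools import combinations
from math import comb

population = [3, 6, 9, 12, 15]
N, n = len(population), 2

def pop_mean(values):
    """Arithmetic mean of a complete list of values."""
    return sum(values) / len(values)

def pop_variance(values):
    """Population variance (divide by the number of values, not by one less)."""
    m = pop_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Sampling WITHOUT replacement: unordered pairs of distinct elements.
samples = list(combinations(population, n))
sample_means = [pop_mean(s) for s in samples]

variance_by_enumeration = pop_variance(sample_means)
variance_by_formula = pop_variance(population) / n * (N - n) / (N - 1)

print("number of samples:", len(samples), "=", comb(N, n))
print("mean of sample means:", pop_mean(sample_means))
print("variance of sample means (enumeration):", variance_by_enumeration)
print("variance of sample means (formula):", variance_by_formula)
```

The factor $(N-n)/(N-1)$ is the finite population correction; it is why the variance here is smaller than $\sigma^2/n = 9$.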

Example 4: Sampling All Stocks
Population of returns of all 1,815 stocks listed on NYSE for 1987
– See Figure on next slide
– The mean rate of return $\mu$ was -3.5% with a standard deviation $\sigma$ of 26%
Draw all possible random samples of size n = 5 and calculate the sample mean return of each
– Sample with a computer
– See Figure on next slide 22

Example: Sampling All Stocks 23

Results from Sampling All Stocks
Observations
– Both histograms appear to be bell-shaped and centered over the same mean of -3.5%
– The histogram of the sample mean returns looks less spread out than that of the individual returns
Statistics
– Mean of all sample means: $\mu_{\bar{x}} = \mu = -3.5\%$
– Standard deviation of all possible means: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 26/\sqrt{5} \approx 11.63\%$ 24

25 Examples above demonstrate the construction of the distribution of sample means for a relatively simple, specific situation. In most cases, however, it will not be possible to list all the samples and compute all the possible sample means. Therefore, it is necessary to develop the general characteristics of the distribution of sample means that can be applied in any situation. Fortunately, these characteristics are specified in a mathematical proposition known as the central limit theorem. This important and useful theorem serves as a cornerstone for much of inferential statistics.

General Conclusions
1. If the population of individual items is normal, then the population of all sample means is also normal.
2. Even if the population of individual items is not normal, there are circumstances when the population of all sample means is approximately normal (Central Limit Theorem). 26

27 Central Limit Theorem: For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n will have a mean of $\mu$ and a standard deviation of $\sigma/\sqrt{n}$, and will approach a normal distribution as n becomes sufficiently large. The value of this theorem comes from two simple facts. First, it describes the distribution of sample means for any population, no matter what shape, or mean, or standard deviation. Second, the distribution of sample means “approaches” a normal distribution very rapidly. By the time the sample size reaches n = 30, the distribution is almost perfectly normal.

Central Limit Theorem. If the sample size is large enough ($n \ge 30$), then the sample mean approximately follows a normal distribution: $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, where the second parameter is the standard deviation. This theorem also implies that the variance of the sample mean is the population variance divided by n, that is $\sigma^2/n$ (for large n). Averages are less variable than individual observations. 28

Sample Means. Sample means follow the normal distribution under either of two conditions: the population itself follows the normal distribution, OR the sample size is large enough ($n \ge 30$). 29

[Figure: distribution of the data (normal distribution) compared with the distribution of all possible sample means, both centered at $\mu$.] The distribution of sample means is less spread out. 30

31 The standard deviation of the distribution of sample means is called the standard error of $\bar{X}$, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. The standard error measures the standard amount of difference between $\bar{X}$ and $\mu$ that is reasonable to expect simply by chance. It should be intuitively reasonable that the size of a sample should influence how accurately the sample represents its population. Specifically, a large sample should be more accurate than a small sample. In general, as the sample size increases, the error between the sample mean and the population mean should decrease. This rule is also known as the law of large numbers.

32 The law of large numbers states that the larger the sample size (n), the more probable it is that the sample mean will be close to the population mean. The standard error provides a way to measure the “average” or standard distance between a sample mean and the population mean.

33 The distribution of sample means for random samples as the size n increases
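How quickly the distribution of sample means narrows can be seen by tabulating the standard error $\sigma/\sqrt{n}$ for a few sample sizes. The sketch below is illustrative: the value $\sigma = 100$ and the function name `standard_error` are assumptions chosen for the example, not values taken from this slide.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means for samples of size n."""
    return sigma / sqrt(n)

sigma = 100.0  # assumed population standard deviation, for illustration only

for n in (1, 4, 25, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):6.2f}")
```

Because the sample size sits under a square root, quadrupling n only halves the standard error.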

34 Example 5: The population of scores on the SAT forms a normal distribution with mean = 500 and sd = 100. If you take a random sample of n = 25 students, what is the probability that the sample mean would be greater than 540? You can restate this probability question as: Out of all the possible sample means, what proportion has values greater than 540? We need to determine the distribution of the sample mean with n = 25. We know: 1. The distribution is normal because the population of SAT scores is normal. 2. The distribution has a mean of 500 because the population mean is 500. 3. The distribution has a standard error of $100/\sqrt{25} = 20$.

35 The distribution of sample means for n = 25. Samples were selected from a normal population with mean = 500 and sd = 100. The next step is to use a z-score to locate the exact position of $\bar{X} = 540$ in the distribution.

36 The value 540 is located above the mean by 40 points, which is exactly 2 standard deviations (in this case, exactly 2 standard errors). Thus, the z-score for 540 is $z = (540 - 500)/20 = +2.00$. Because this distribution of sample means is normal, you can use the unit normal table to find the probability associated with z > 2.00. The table indicates that 0.0228 of the distribution is located in the tail beyond z = 2.00. Our conclusion is that it is very unlikely, p = 0.0228 (2.28%), to obtain a random sample of n = 25 students with an average SAT score greater than 540.
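The table lookup in Example 5 can be reproduced with Python's standard library, which provides a normal distribution class. This is a sketch of the same calculation rather than part of the original slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 100, 25      # population mean, population sd, sample size
cutoff = 540                     # sample mean of interest

standard_error = sigma / sqrt(n)             # 100 / 5
z = (cutoff - mu) / standard_error           # position of 540 in standard errors

# Distribution of sample means: normal, centered at mu, spread = standard error.
distribution_of_means = NormalDist(mu=mu, sigma=standard_error)
p_greater = 1 - distribution_of_means.cdf(cutoff)

print("standard error:", standard_error)
print("z-score:", z)
print("P(sample mean > 540):", round(p_greater, 4))
```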

Example 6: Suppose the mean selling price of a gallon of gasoline in the U.S. is $1.30. Further, assume the population standard deviation $\sigma$ is $0.28. What is the probability that the mean of a sample of 35 gasoline stations is between $1.22 and $1.38? 37

Example 6 cont. The z-values corresponding to $1.22 and $1.38 are -1.69 and 1.69. From the table for the standard normal distribution, $P(-1.69 < z < 1.69) \approx 0.909$. We would expect about 91% of the sample means to be within $0.08 of the population mean. 38
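Example 6 asks for the probability of an interval rather than a tail, so the calculation subtracts two cumulative probabilities. The following sketch repeats it with the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.30, 0.28, 35    # population mean, population sd, sample size
low, high = 1.22, 1.38           # interval for the sample mean

standard_error = sigma / sqrt(n)
z_low = (low - mu) / standard_error
z_high = (high - mu) / standard_error

# Probability that the sample mean falls between the two prices.
distribution_of_means = NormalDist(mu=mu, sigma=standard_error)
p_between = distribution_of_means.cdf(high) - distribution_of_means.cdf(low)

print("standard error:", round(standard_error, 4))
print("z-values:", round(z_low, 2), round(z_high, 2))
print("P(1.22 < sample mean < 1.38):", round(p_between, 3))
```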

Example 7 Assume that a school district has 10,000 sixth graders. In this district, the average weight of a sixth grader is 80 pounds, with a standard deviation of 20 pounds. Suppose you draw a random sample of 50 students. What is the probability that the average weight of a sampled student will be less than 75 pounds? 39

Example 7 cont. The standard deviation of the sampling distribution can be computed using the following formula: $\sigma_{\bar{x}} = \sigma \cdot \sqrt{1/n} = 20 \cdot \sqrt{1/50} = 20 \cdot 0.1414 = 2.83$. The sampling distribution of the mean is normally distributed with a mean of 80 and a standard deviation of 2.83. From the standard normal table: $P(z < (75-80)/2.83) = P(z < -1.77) = 0.038$.
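Example 7 draws 50 students from a finite population of 10,000, so it is a convenient place to check how little sampling without replacement matters when the population is much larger than the sample. The sketch below computes the probability with and without the finite population correction $\sqrt{(N-n)/(N-1)}$; the correction is an addition for comparison and is not used on the slide.

```python
from math import sqrt
from statistics import NormalDist

N, n = 10_000, 50        # population size, sample size
mu, sigma = 80, 20       # population mean and sd of weights (pounds)
cutoff = 75

# Standard error as computed on the slide (no correction).
standard_error = sigma * sqrt(1 / n)

# Same quantity with the finite population correction applied (for comparison).
correction = sqrt((N - n) / (N - 1))
standard_error_corrected = standard_error * correction

p_less = NormalDist(mu, standard_error).cdf(cutoff)
p_less_corrected = NormalDist(mu, standard_error_corrected).cdf(cutoff)

print("standard error:", round(standard_error, 2))
print("corrected standard error:", round(standard_error_corrected, 2))
print("P(sample mean < 75):", round(p_less, 3))
print("P(sample mean < 75), corrected:", round(p_less_corrected, 3))
```

With a sample that is only 0.5% of the population, the two answers agree to about three decimal places.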

The Central Limit Theorem. [Diagram: a random sample $(x_1, x_2, \ldots, x_n)$ is drawn from a right-skewed population distribution with parameters $(\mu, \sigma)$; as n becomes large, the sampling distribution of the sample mean $\bar{X}$ is nearly normal.] 41

Example: Central Limit Theorem Simulation 42

Histogram of Population - Bimodal Distribution: population = 16,000 (mean and std dev values missing)
Sampling Distribution (from a bimodal population), n = 2: number of samples = 4000; mean = 4.977; std dev = 3.017 43

Sampling Distribution (from a bimodal population), n = 3: number of samples = 4000; mean = 4.946; std dev = 2.425
Sampling Distribution, n = 30: number of samples = 4000; mean = 5.032; std dev = 0.722 44
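A simulation of this kind is easy to reproduce. The sketch below is not the simulation from the slides: the bimodal population is a made-up mixture of two normal distributions, the seed is arbitrary, and only the population size (16,000), the number of samples (4,000) and the sample sizes (2, 3, 30) follow the slides.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # arbitrary seed so the run is repeatable

# Hypothetical bimodal population: half the values near 1, half near 9.
population = [
    random.gauss(1.0, 1.0) if random.random() < 0.5 else random.gauss(9.0, 1.0)
    for _ in range(16_000)
]
mu, sigma = fmean(population), pstdev(population)

def draw_sample_means(n, num_samples=4000):
    """Means of num_samples random samples of size n, drawn with replacement."""
    return [fmean(random.choices(population, k=n)) for _ in range(num_samples)]

results = {}
for n in (2, 3, 30):
    means = draw_sample_means(n)
    results[n] = (fmean(means), pstdev(means), sigma / sqrt(n))
    print(f"n = {n:2d}: mean of means = {results[n][0]:.3f}, "
          f"std dev of means = {results[n][1]:.3f}, "
          f"sigma/sqrt(n) = {results[n][2]:.3f}")
```

In each row the simulated standard deviation of the sample means should sit close to $\sigma/\sqrt{n}$, and a histogram of the n = 30 means would look close to a bell curve even though the population has two peaks.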

45 STANDARD ERROR AND STATISTICAL INFERENCE: Standard error as a measure of chance. Most inferential statistics are used in the context of a research study. Typically, the researcher begins with a general question about how a treatment will affect the individuals in a population. For example: Will the drug affect blood pressure? Will the hormone affect growth? Will the special training affect students’ reading scores?

46

47 The question for the researcher is how to interpret the 4-point difference. Specifically, there are two possible explanations: 1. The treatment may have caused the scores in the sample to be 4 points higher. 2. The 4-point difference may be sampling error. Remember, a sample mean is not expected to be exactly the same as the population mean. Perhaps the treatment has no effect at all, and the 4-point difference has occurred just by chance. The standard error can help the researcher decide between these two alternatives. In particular, the standard error tells exactly how much difference is reasonable to expect just by chance. For example, if the standard error is only 1 point, then the researcher could conclude that the observed difference (4 points) is much larger than would be expected by chance. In this case, it would be reasonable to conclude that the treatment has caused the difference.
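The comparison described on this slide amounts to expressing the observed difference in units of the standard error. The sketch below uses the slide's illustrative numbers (a 4-point difference and a standard error of 1 point) and additionally assumes that the distribution of sample means is normal, which the slide does not state for this case.

```python
from statistics import NormalDist

observed_difference = 4.0   # sample mean minus population mean, in points
standard_error = 1.0        # illustrative value used on the slide

# How many standard errors separate the sample mean from the population mean.
z = observed_difference / standard_error

# Probability of a difference at least this large, in either direction,
# if chance alone were operating (assumes a normal distribution of sample means).
p_by_chance = 2 * (1 - NormalDist().cdf(z))

print("difference in standard errors:", z)
print("probability by chance alone:", p_by_chance)
```

A difference of four standard errors is far out in the tail, which is why the slide treats the treatment as the more reasonable explanation.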

48 The standard error is reported in scientific journals in two ways. It may be reported in a table along with the sample means (see Table 7.2). Alternatively, the standard error may be reported in graphs.

49 Figure 7.8 illustrates the use of a bar graph to display information about the sample mean and the standard error. Note that the mean is represented by the height of the bar, and the standard error is depicted on the graph by brackets at the top of each bar. Each bracket extends 1 standard error above and 1 standard error below the sample mean.

50 Figure 7.9 shows how sample means and standard error are displayed on a line graph.

Summary: Sampling Methods and the Central Limit Theorem
ONE Explain why sometimes sampling is the only feasible way to learn about a population.
TWO Define and construct a sampling distribution of the sample mean.
THREE Explain and apply the central limit theorem.
FOUR Standard error and statistical inference. 51