Distribution of the Sample Means

Distribution of the Sample Means Topics: Essentials; Distributions; Sampling Error; Distribution of the Sample Means; Properties of the Distribution: 1) Mean 2) Std. Dev. 3) Central Limit Theorem; A large sample in statistics; Example: Calculating the mean and standard error of the sample means; Additional Topic

Essentials: Distribution of Sample Means (A distribution unlike others) Be able to explain what the Distribution of Sample Means represents. Know the three characteristics of this distribution. Be able to use a set of data to demonstrate the calculation of the mean and standard deviation of this distribution. Know what counts as a statistically large sample.

Some Common Distribution Shapes CHAPTER 6: NORMALLY DISTRIBUTED VARIABLES (6.1) We want to recall from our work in Ch. 2 some common distribution shapes. At any given time in statistics, and in the world in general, we can observe an enormous variety of variables. Many differ in the shape of the distribution that they form. For example: the generation of random numbers would follow a uniform distribution. Our exam scores and project scores thus far have followed a left-skewed distribution. Some variables, though (actually many, especially naturally occurring ones), follow what is called a Normal Distribution. You may be more familiar with the term “bell curve.” This is a special distribution, and probably the most important distribution in statistics. In statistics, we call variables whose distributions have this shape Normally Distributed Variables. We call the distribution shape a Normal Curve.

Distribution of the Sample Means Sampling Error: the difference between the sample measure and the population measure due to the fact that a sample is not a perfect representation of the population; equivalently, the error resulting from using a sample to estimate a population characteristic. Recall: so far, we’ve been talking about variables (which we’ve usually called x) and the distribution of these variables. We’ve stated that the distribution of x has a mean we call mu and a standard deviation we call sigma. We’ve also stated that we can use a sample to acquire information about a population, and that this is most often preferable, since an entire census is often impossible. This, however, poses a problem, since the sample provides data for only a portion of the entire population. We cannot expect one sample to give us perfectly accurate information about the population of interest. There is a certain amount of error that will result simply because we are sampling. As you will hopefully recall from earlier in the course, this type of error is called sampling error. For example: the Census Bureau publishes figures on the mean income of U.S. households. In 1993, the figure published was $41,428. This figure is the sample mean (x-bar) income of the 60,000 households sampled, NOT the population mean mu of all U.S. households, so we may ask ourselves: how accurate are such estimates likely to be? Is the estimate within $1,000, $5,000, etc.? In order to answer this question, we would need to know the distribution of all possible sample means that could be obtained by sampling the incomes of 60,000 households. This distribution is called the distribution of the sample mean. Let’s look at an example.

Distribution of the Sample Means Distribution of the Sample Means: a distribution obtained by using the means computed from random samples of a specific size taken from a population. Distribution of the Sample Mean, x-bar: the distribution of all possible sample means for a variable x and a given sample size. Recall: so far, we’ve been talking about variables (which we’ve usually called x) and the distribution of these variables, in other words, how they vary about the mean. In addition to knowing how individual data values vary about the mean for a population, we are sometimes also interested in knowing about the distribution of the means of samples taken from a population. For example, suppose a researcher selects 100 samples of a given size from a large population and computes the mean for each of the 100 samples. The values of these 100 means constitute a sampling distribution of sample means. If the samples are randomly selected, the sample means will, for the most part, be somewhat different from the population mean mu. These differences are caused by sampling error.
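
The researcher's procedure described above can be sketched in a few lines of Python. This is only an illustration: the population (the integers 1 to 1000), the sample size of 25, and the helper name simulate_sample_means are assumptions made for the example, not part of the original slides.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples (with replacement) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical population for illustration only: the integers 1..1000.
population = list(range(1, 1001))
mu = statistics.mean(population)

# 100 samples of size 25, one mean per sample.
sample_means = simulate_sample_means(population, sample_size=25, num_samples=100)

# Sampling error of each sample: x-bar minus mu.
sampling_errors = [m - mu for m in sample_means]

print("population mean:", mu)
print("mean of the 100 sample means:", statistics.mean(sample_means))
print("largest absolute sampling error:", max(abs(e) for e in sampling_errors))
```

Each of the 100 values in sample_means is one point in an (approximate) sampling distribution of sample means; almost none of them will equal mu exactly.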

Properties of the Distribution of Sample Means The mean of the sample means will be the same as the population mean. The standard error of the sample means will be smaller than the standard deviation of the population, and it will be equal to the population standard deviation divided by the square root of the sample size.

Standard Error vs. Standard Deviation Standard error of mean versus standard deviation. ... Put simply, the standard error of the sample mean is an estimate of how far the sample mean is likely to be from the population mean, whereas the standard deviation of the sample is the degree to which individuals within the sample differ from the sample mean.

A Third Property of the Distribution of Sample Means A third property of the distribution of the sample means concerns the shape of the distribution, and is explained by the Central Limit Theorem.

The Central Limit Theorem As the sample size n increases, the shape of the distribution of the sample means taken from a population with mean $\mu$ and standard deviation $\sigma$ will approach a normal distribution. This distribution will have mean $\mu$ and standard deviation $\sigma / \sqrt{n}$. We can use the Central Limit Theorem to answer questions about sample means in the same way that the normal distribution can be used to answer questions about individual values. The only difference is that a new formula must be used to obtain z-scores.
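
The slide does not show the new z-score formula. The standard form, which follows from the mean and standard deviation stated above, is $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. A minimal Python sketch, using made-up numbers (mu = 100, sigma = 15, n = 36) and hypothetical function names chosen only for illustration:

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z-score of a sample mean: divide by the standard error, not by sigma."""
    return (x_bar - mu) / (sigma / sqrt(n))

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar), assuming the sample means are normally distributed."""
    return NormalDist(mu, sigma / sqrt(n)).cdf(x_bar)

# Hypothetical values, not from the slides.
mu, sigma, n = 100, 15, 36

z = z_for_sample_mean(105, mu, sigma, n)       # (105 - 100) / (15 / 6)
p = prob_sample_mean_below(105, mu, sigma, n)  # area to the left of that z

print("z =", z)
print("P(x-bar < 105) =", round(p, 4))
```

For an individual value the denominator would be sigma alone; for a sample mean it is sigma divided by the square root of n, which is why the same 5-point gap gives a larger z here.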

Two Important Things to Remember When Using The Central Limit Theorem When the original variable is normally distributed, the distribution of the sample means will be normally distributed, for any sample size n. When the distribution of the original variable departs from normality, a sample size of 30 or more is needed to use the normal distribution to approximate the distribution of the sample means. The larger the sample, the better the approximation will be.
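
One way to see the second point is to simulate sample means from a clearly non-normal population and watch the skew shrink as n grows. The sketch below assumes an exponential population (strongly right skewed), 5,000 repetitions, and a simple moment-based skewness measure; all of these choices are illustrative and not from the slides.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulated_sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (assumed)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable

means_n2 = simulated_sample_means(2, 5000, rng)
means_n30 = simulated_sample_means(30, 5000, rng)

skew_n2 = skewness(means_n2)
skew_n30 = skewness(means_n30)

print("skewness of sample means, n = 2: ", round(skew_n2, 2))
print("skewness of sample means, n = 30:", round(skew_n30, 2))
```

The sample means for n = 30 should come out much closer to symmetric than those for n = 2, although some skew remains; the approximation improves further as n grows.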

An Example Suppose I give an 8-point quiz to a small class of four students. The results of the quiz were 2, 6, 4, and 8. We will assume that the four students constitute the population.

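The Mean and Standard Deviation of the Population (the four scores) The mean of the population is: $\mu = \frac{2 + 6 + 4 + 8}{4} = 5$. The standard deviation of the population is: $\sigma = \sqrt{\frac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = \sqrt{5} \approx 2.236$.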

Distribution of Quiz Scores A graph of the distribution of quiz scores.

All Possible Samples of Size 2 Taken With Replacement

SAMPLE  MEAN    SAMPLE  MEAN
2,2     2       6,2     4
2,4     3       6,4     5
2,6     4       6,6     6
2,8     5       6,8     7
4,2     3       8,2     5
4,4     4       8,4     6
4,6     5       8,6     7
4,8     6       8,8     8

Frequency Distribution of the Sample Means

MEAN  f
2     1
3     2
4     3
5     4
6     3
7     2
8     1

Shows the number of times each mean occurred.
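
The listing of samples and the frequency count can be reproduced mechanically. The following Python sketch enumerates every ordered pair of quiz scores (sampling with replacement) and tallies the means; the variable names are chosen for illustration.

```python
from collections import Counter
from itertools import product

scores = [2, 6, 4, 8]  # the four quiz scores, treated as the population

# Every ordered sample of size 2 drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(scores, repeat=2))

# One mean per sample.
means = [(a + b) / 2 for a, b in samples]

# How many times each mean occurs.
frequency = Counter(means)

print("number of samples:", len(samples))
for mean in sorted(frequency):
    print(mean, frequency[mean])
```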

Distribution of the Sample Means Note the shape here.

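The Mean of the Sample Means Denoted $\mu_{\bar{x}}$. In our example: $\mu_{\bar{x}} = \frac{\sum \bar{x}}{16} = \frac{80}{16} = 5$, where the sum runs over all 16 sample means. So $\mu_{\bar{x}} = \mu$, which in this case = 5.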

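The Standard Error of the Sample Means Denoted $\sigma_{\bar{x}}$. In our example: $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - 5)^2}{16}} = \sqrt{\frac{40}{16}} \approx 1.581$, where the sum runs over all 16 sample means. This is the same as the population standard deviation divided by $\sqrt{2}$: $\frac{2.236}{\sqrt{2}} \approx 1.581$. Comment on sampling without replacement, which is more the norm, and the Finite Population Correction Factor.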
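
Both properties can be checked numerically on the quiz-score population. The sketch below also illustrates the closing comment about sampling without replacement. The correction factor used, $\sqrt{(N - n)/(N - 1)}$, is the usual textbook form of the Finite Population Correction Factor; the slides name the factor but do not give a formula, so treat that part as an assumption.

```python
from itertools import combinations, product
from math import isclose, sqrt
from statistics import fmean, pstdev

scores = [2, 6, 4, 8]   # the population of four quiz scores
N, n = len(scores), 2   # population size and sample size

mu = fmean(scores)      # population mean
sigma = pstdev(scores)  # population standard deviation (divides by N)

# Sampling WITH replacement: all 16 ordered samples.
means_wr = [fmean(s) for s in product(scores, repeat=n)]
mean_of_means = fmean(means_wr)
se_wr = pstdev(means_wr)

# Sampling WITHOUT replacement: the 6 distinct pairs.
means_wor = [fmean(s) for s in combinations(scores, n)]
se_wor = pstdev(means_wor)

# Assumed textbook form of the finite population correction factor.
fpc = sqrt((N - n) / (N - 1))

print("mean of sample means:", mean_of_means, "population mean:", mu)
print("with replacement:    ", se_wr, "vs sigma/sqrt(n):", sigma / sqrt(n))
print("without replacement: ", se_wor, "vs corrected:", sigma / sqrt(n) * fpc)
```

With replacement, the standard error matches sigma divided by the square root of n. Without replacement from such a small population, the spread of the sample means is smaller, and the correction factor accounts for the difference.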

Additional Topics:

Calculating the Standard Error
STANDARD ERROR CALCULATION Procedure:
Step 1: Calculate the mean (total of all measurements divided by the number of measurements).
Steps 2 – 6 use the definition formula to calculate the standard deviation.
Step 2: Calculate each measurement's deviation from the mean (mean minus the individual measurement).
Step 3: Square each deviation from the mean. Squared negatives become positive.
Step 4: Sum the squared deviations (add up the numbers from Step 3).
Step 5: Divide the sum from Step 4 by one less than the sample size (n - 1, that is, the number of measurements minus one).
Step 6: Take the square root of the number from Step 5. That gives you the "standard deviation (S.D.)."
Step 7: Divide the standard deviation by the square root of the sample size (n). That gives you the “standard error”.
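
The seven steps translate directly into code. A minimal Python sketch follows; the function name standard_error is chosen here for illustration, and the four quiz scores are reused as if they were a sample rather than a population.

```python
from math import sqrt

def standard_error(measurements):
    """Estimate the standard error of the mean from one sample, step by step."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")

    mean = sum(measurements) / n                   # Step 1: the mean
    deviations = [mean - x for x in measurements]  # Step 2: deviation of each value
    squared = [d ** 2 for d in deviations]         # Step 3: square each deviation
    total = sum(squared)                           # Step 4: sum of squared deviations
    variance = total / (n - 1)                     # Step 5: divide by n - 1
    sd = sqrt(variance)                            # Step 6: sample standard deviation
    return sd / sqrt(n)                            # Step 7: standard error

# The quiz scores from the earlier example, treated here as a sample.
print(standard_error([2, 6, 4, 8]))
```

Because Step 5 divides by n - 1, this is an estimate computed from a single sample. It will generally differ from the population-based value, sigma divided by the square root of n, used in the quiz example.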

Heights of Five Starting Players on a Men’s Basketball Team (inches) Demonstration showing that increasing the sample size yields better estimates of the population value. Using these five players, let’s obtain the distribution of the sample means for samples of size n = 2. This means that we need to determine all possible samples of size 2 from this population of five players. How many samples of size 2 are there? Recall $\binom{5}{2} = 10$. Since there are only 10 possible samples of size 2, we can list them quite easily. Let’s look, but before we do, let’s calculate the mean height mu (why mu as opposed to x-bar?) of our five starting players. Mu = 400/5 = 80

Possible Samples of Size n = 2 From a Population of Size N = 5 Here we see the 10 possible samples of size 2 in the first column. The heights for each member of the sample are listed in the second column. The mean of each sample is listed in the third column. We can make some simple but significant observations about sampling error here, when the mean height of a random sample of 2 players is used to estimate the population mean height. Now, we’ve said before that it is unlikely that a sample will produce exactly the same mean as the population (remember, here it is 80). We see, in fact, that in only 1/10 or 10% of the samples does the sample mean x-bar equal the population mean mu. This is an example of sampling error. Now let’s look at what would happen if we used samples of size n = 4.

Possible Samples of Size n = 4 From a Population of Size N = 5 There are 5 possible samples of size 4. Here, none of the samples of size 4 has a mean equal to the population mean of 80, but in general the sample means are all closer to the population mean than were the sample means of size 2. Looking at what is going on here using dotplots will help make this clearer.

Dot plots of the Sampling Distributions for Various Sample Sizes (N = 5) Explain what the dotplots represent. Notice what happens here: as sample size increases, the sample means cluster closer around the population mean. Thus, the larger the sample size, the smaller the sampling error tends to be in estimating a population mean mu by a sample mean x-bar. Recall from earlier in the course: bias (consistent over- or under-representation) and precision (how scattered or spread out the values of the sample statistic are). As we increase sample size, we increase precision; the mean of a random sample is an unbiased estimate of mu at any sample size. Let’s look at the dotplot for samples of size 4: we see that exactly 4 of the 5 possible samples have means within one inch of the population mean. So the probability is 4/5 or 80% that the sampling error made in estimating mu by x-bar will be 1 inch or less. Probability that the sampling error is 1 inch or less, by sample size: n = 1: 2/5 = 40%; n = 2: 3/10 = 30%; n = 3: 5/10 = 50%; n = 4: 4/5 = 80%; n = 5: 100%.
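
The counts behind these percentages can be reproduced by enumerating every possible sample. The transcript does not include the table of heights, so the five values in this Python sketch are an assumed set, chosen only because they total 400 inches and agree with the fractions quoted in the slides; the actual heights may differ.

```python
from fractions import Fraction
from itertools import combinations

# ASSUMED heights (inches): not given in the transcript, chosen to total 400.
heights = [76, 78, 79, 81, 86]
mu = Fraction(sum(heights), len(heights))  # population mean, 400/5

def share_within(n, tolerance=1):
    """Count samples of size n (without replacement) whose mean is within
    `tolerance` inches of mu. Returns (hits, number_of_samples)."""
    means = [Fraction(sum(s), n) for s in combinations(heights, n)]
    hits = sum(1 for m in means if abs(m - mu) <= tolerance)
    return hits, len(means)

for n in range(1, 6):
    hits, total = share_within(n)
    print(f"n = {n}: {hits}/{total} = {hits / total:.0%}")
```

Exact fractions are used for the means so that a sample mean sitting exactly one inch from mu is counted reliably, without floating-point rounding issues.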