INTRODUCTION TO ECONOMIC STATISTICS Topic 7 The Central Limit Theorem These slides are copyright © 2010 by Tavis Barr. This work is licensed under a Creative.

Presentation transcript:

INTRODUCTION TO ECONOMIC STATISTICS Topic 7 The Central Limit Theorem These slides are copyright © 2010 by Tavis Barr. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. See for further information.

The Central Limit Theorem ● Making Data Conform to Probability Theory ● The Law of Large Numbers ● The Central Limit Theorem – Known Population Mean and Variance – Known Population Mean, Unknown Variance

From Probability to Statistics ● Probability theory tells us how samples behave when we know population parameters ● Such problems are unusual because we don't usually know these parameters ● How can we get the variables in our sample data to act like the random variables in probability theory?

The i.i.d. assumption ● We make two assumptions: observations in a sample are independent and identically distributed ● This property is abbreviated as i.i.d. (lower case). An i.i.d. sample is sometimes called a random sample.

Independent Observations ● Observations are independent when knowing the value of one observation in a sample does not tell us anything about the value of other observations in that sample ● Some exceptions: – Time series data: Values in nearby years are correlated – Panel data (people, states, stores, etc., followed over time): Characteristics of individuals are persistent over time

Identical Distribution ● Observations are identically distributed if they are all draws from a random variable with the same distribution and parameters – Again, panel data is an exception ● We don't necessarily know what the distribution is (Normal, binomial, etc., or something unusual); we just assume that it's always the same one.

How to achieve an i.i.d. sample? ● One method: A probability sample. Each member of the population is observed with equal probability ● If the population comes from a single distribution, then the sample will be a set of i.i.d. observations

How to achieve an i.i.d. sample? ● Whether the population comes from a single distribution can be a matter of perspective – Many populations have subgroups (region, demographic group, etc.) – We might look at subgroups separately if: ● Differences are systematic ● We have good information on the sub-group level ● We have enough information about each sub-group

How to Construct a Probability Sample? ● One method: Assign a number to every member in the population, write them out in random order, and pick every 10th or 20th or 50th member – For example, if everyone has a phone number, pick phone numbers at random – Or, if every student has a Social Security number, pick Social Security numbers at random
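The "random order, then every 10th member" recipe can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the population of 1,000 ID numbers, and the step of 10 are all assumptions made for the example, not part of the slides.

```python
import random

def systematic_sample(population, step, seed=None):
    """Put the population in random order, then keep every `step`-th member."""
    rng = random.Random(seed)      # seed is optional; it only makes runs repeatable
    shuffled = list(population)
    rng.shuffle(shuffled)
    # Positions step, 2*step, 3*step, ... in the shuffled list (1-based counting).
    return shuffled[step - 1::step]

members = range(1, 1001)           # hypothetical population: ID numbers 1..1000
sample = systematic_sample(members, step=10, seed=0)
print(len(sample), "members selected")
```

Because the order is random before the every-10th rule is applied, each member has the same chance of being picked.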

How to Construct a Probability Sample? ● Stratified Sampling: – Divide population into groups, choose any given group with probability proportionate to group size – Construct a probability sample within each group – Choose a sample size for each group in proportion to its size in the population ● Examples: – Divide country into area codes and phone exchanges, sample evenly within exchanges – Divide country into colleges, sample evenly within college in proportion to college size
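A minimal sketch of proportional stratified sampling, assuming three hypothetical colleges of 600, 300, and 100 students and a total sample of 50. The function name `stratified_sample` and the data are made up for illustration.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample within each group, sized in proportion to the group."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    picked = {}
    for name, members in strata.items():
        # Group's share of the sample equals its share of the population.
        # Rounding can make the total differ slightly from total_n.
        n_group = round(total_n * len(members) / population_size)
        picked[name] = rng.sample(members, n_group)
    return picked

# Hypothetical population: student ID numbers grouped by college.
colleges = {
    "A": list(range(600)),
    "B": list(range(300)),
    "C": list(range(100)),
}
picked = stratified_sample(colleges, total_n=50, seed=1)
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)
```

With these group sizes the allocation works out to 30, 15, and 5 students.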

Central Limit Theorem ● It turns out that if our samples are independent and identically distributed, we can predict the behavior of large samples. ● The law of large numbers and the central limit theorem are two of the basic ways of doing this

Central Limit Theorem – Assumptions ● We need two assumptions for the Law of Large Numbers and the Central Limit Theorem to work: 1. The sample is i.i.d. 2. The variable that the sample is drawn from has a finite population mean and variance ● Infinite means and variances can create problems in physics; not so much in business and economics. The i.i.d. assumption is more important.

Law of Large Numbers ● The law of large numbers says that if these two assumptions are satisfied, then the sample mean approaches the population mean with probability one as the sample becomes infinitely large. ● This is of limited practical use because it doesn't tell us how close the sample mean gets and how big the sample has to be.
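A small simulation can make the law of large numbers concrete. The sketch below assumes a fair six-sided die (population mean 3.5) as the random variable; the die, the seed, and the sample sizes are illustrative choices, not taken from the slides.

```python
import random

rng = random.Random(42)   # fixed seed so repeated runs give the same draws

def sample_mean(n):
    """Mean of n i.i.d. rolls of a fair die (population mean is 3.5)."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

for n in (10, 1000, 100000):
    print(n, sample_mean(n))
```

The printed means should drift toward 3.5 as n grows, but nothing in the output says how fast, which is the gap the Central Limit Theorem fills.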

Central Limit Theorem Background ● The Central Limit Theorem assumes that we're looking at a variable with population mean μ and population variance σ². ● If a sample is a set of draws from a random variable, then the sample mean, X̄, is an arithmetic function of those draws. ● So it's essentially a draw from a slightly different random variable

Central Limit Theorem Background ● If we think of the sample mean as a variable, then we call its mean the expected value and its standard deviation the standard error. ● The Central Limit Theorem has the same two requirements as the Law of Large Numbers (random sample; finite mean and variance). Additionally, as a common rule of thumb, the sample size should be at least 30 for the Normal approximation to be reliable.

Central Limit Theorem – Result The Central Limit Theorem states that if these assumptions are satisfied, then: 1. The sample mean is approximately Normally distributed, regardless of the distribution of the original variable 2. The sample mean has the same expected value as the population mean, i.e., E(X̄) = μ 3. The standard error of the sample mean is σ/√n
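The second and third parts of the result can be checked by simulation. The sketch below assumes an Exponential(1) variable (mean 1, standard deviation 1, strongly skewed) and 5,000 repeated samples of size 36; the distribution, seed, and sizes are arbitrary choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(7)    # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0      # mean and standard deviation of an Exponential(1) variable
n, reps = 36, 5000        # sample size and number of repeated samples (both arbitrary)

# Each entry is the mean of one i.i.d. sample of size n.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print("mean of sample means:", statistics.fmean(sample_means))
print("sd of sample means:  ", statistics.stdev(sample_means))
print("sigma / sqrt(n):     ", sigma / math.sqrt(n))
```

The first printed number should sit close to μ and the second close to σ/√n, even though the underlying variable is far from Normal.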

Example of Central Limit Theorem ● According to the New York Post, there were 0.44 pedestrian deaths per day in New York City in 2006 Source: dozien.htm ● Pedestrian deaths are likely to follow a Poisson distribution. In this case, the standard deviation would be....

Example of Central Limit Theorem ● There were 0.44 pedestrian deaths per day in New York City in 2006, with a standard deviation of 0.66 ● Suppose we choose a sample of 49 random days in 2006 ● What is the probability that the average death rate on those 49 days is 0.5 or lower? – Remember: X̄ ~ N(μ, σ/√n)
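One way to work this example is with `statistics.NormalDist` from the Python standard library, using only the numbers on the slide (mean 0.44, standard deviation 0.66, n = 49). The variable names are just labels chosen for this sketch.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 0.44, 0.66, 49

# Under a Poisson model the standard deviation is sqrt(mean): sqrt(0.44) is about 0.66.
poisson_sd = sqrt(mu)

se = sigma / sqrt(n)               # standard error of the sample mean: 0.66 / 7
z = (0.5 - mu) / se                # z-score of a sample mean of 0.5
p = NormalDist(mu, se).cdf(0.5)    # P(sample mean <= 0.5)

print(f"standard error = {se:.4f}, z = {z:.2f}, probability = {p:.3f}")
```

The z-score comes out near 0.64, so the probability is roughly 0.74.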

Example of Central Limit Theorem ● The mean annual rainfall in Williamsburg, Virginia is 4.19 inches, and the standard deviation is Source: ● Suppose that we survey Williamsburg on 36 random days ● What is the probability that the average rainfall on those 36 days is at least 5 inches? – Remember: X̄ ~ N(μ, σ/√n)

Example of Central Limit Theorem ● Suppose we produce soda. Our quality control engineer claims that our bottles of soda have a mean contents of 2000 ml and a standard deviation of 2 ml. ● We take a sample of 100 bottles. How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less? – The sample mean will be Normally distributed. It will have an expected value of 2000 and a standard error of 2/√100 = 0.2 – So we want to know the probability that a Normally distributed variable with mean 2000 and standard deviation 0.2 is less than 1999.5 – This is the same as the probability that a standard normal variable is less than (1999.5 – 2000)/0.2 = -2.5.
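The soda calculation can be finished numerically. This sketch assumes the threshold is 1999.5 ml, which is the value implied by the z-score of -2.5 with mean 2000 and standard error 0.2.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2000.0, 2.0, 100
threshold = 1999.5                 # assumed: the value implied by z = -2.5

se = sigma / sqrt(n)               # 2 / 10 = 0.2
z = (threshold - mu) / se          # about -2.5
p = NormalDist().cdf(z)            # lower-tail probability of a standard normal

print(f"z = {z:.2f}, P(sample mean <= {threshold}) = {p:.4f}")
```

A lower-tail probability of about 0.006 means such a low sample mean would be rare if the engineer's claim were true.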

Another Example of CLT ● Suppose we know that the mean marital age of men in the U.S. is 24.8 years and the standard deviation is 2.5 years. ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – Sample mean will be a Normal variable with mean 24.8 and standard deviation 2.5/√60 = 2.5/7.75 = 0.32 – What is the probability that a Normal variable with mean 24.8 and standard deviation 0.32 is at least 25.1? – Same as the probability that a standard normal is at least (25.1 – 24.8)/0.32 = 0.3/0.32 = 0.94 – From the table, P(z < 0.94) is 0.826 – So P(z > 0.94) = 1 – 0.826 = 0.174
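The same upper-tail calculation without a printed table, as a quick check. It uses only the numbers from the slide; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 24.8, 2.5, 60

se = sigma / sqrt(n)               # about 0.32
z = (25.1 - mu) / se               # z-score of a sample mean of 25.1
p = 1 - NormalDist().cdf(z)        # upper-tail probability, P(sample mean >= 25.1)

print(f"standard error = {se:.4f}, z = {z:.3f}, probability = {p:.3f}")
```

Carrying full precision gives a z-score near 0.93 and a probability near 0.176, slightly different from a table lookup at z = 0.94 because the standard error is not rounded to 0.32 first.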

What if we don't know σ? ● Sometimes we know the population mean, but not the population standard deviation ● In this case, we can substitute the sample standard deviation, s, for the population standard deviation. ● Then, the result is that the sample mean is Normally distributed with expected value μ and standard error s/√n

Example with unknown σ ● According to the United States Statistical Abstract, the average American consumed pounds of red meat per day in Source: ● A random sample of 300 Americans finds that the average person consumed 0.4 pounds on that day, with a standard deviation of ● How probable is a sample mean this large or larger? (Remember: X̄ ~ N(μ, σ/√n), with s substituted for σ)

Example with unknown σ ● According to the BJS, the average length of a prison sentence in 2004 was 57 months. Source: ● In a random sample of 200 prisoners, the average sentence is 60 months and the standard deviation is 25 months. ● What is the probability of obtaining a sample mean of between 54 and 60 months? (Remember: X̄ ~ N(μ, σ/√n), with s substituted for σ)
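For an interval such as "between 54 and 60 months", the probability is the difference of two cumulative probabilities. The sketch below plugs the sample standard deviation s = 25 in for the unknown σ, as the preceding slide suggests; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, s, n = 57.0, 25.0, 200         # population mean, sample standard deviation, sample size

se = s / sqrt(n)                   # estimated standard error, about 1.77
dist = NormalDist(mu, se)          # approximate distribution of the sample mean
p = dist.cdf(60) - dist.cdf(54)    # P(54 <= sample mean <= 60)

print(f"standard error = {se:.3f}, probability = {p:.3f}")
```

Both endpoints lie about 1.7 standard errors from 57, so the interval captures roughly 91% of the probability.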

Example with unknown σ ● Suppose a company claims that its light bulbs last an average of a thousand hours. ● We take a sample of 500 light bulbs. The average bulb in the sample lasts 950 hours, and the sample standard deviation is 100 hours. ● What is the probability of observing a sample mean this small? – Here μ = 1000, σ = ?, n = 500, X̄ = 950, s = 100

Example with unknown σ ● Recap: – Population mean (μ) of 1000, population standard deviation (σ) unknown – Sample size (n) 500, sample mean (X̄) 950, sample standard deviation (s) 100 ● What is the probability of X̄ this small or smaller? – X̄ is Normal with mean 1000, std error 100/√500 = 100/22.36 = 4.47 – P(X̄ < 950) is the same as P(z < [950 – 1000]/4.47), i.e., P(z < -11.19)
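A z-score this far out is beyond any printed Normal table, so the tail probability has to be computed. The sketch below uses `math.erfc`, which stays accurate in the extreme tail; writing the lower-tail probability as 0.5·erfc(−z/√2) is a standard identity rather than something from the slides.

```python
from math import erfc, sqrt

mu, s, n = 1000.0, 100.0, 500
x_bar = 950.0

se = s / sqrt(n)                   # about 4.47
z = (x_bar - mu) / se              # about -11.18 at full precision

# P(Z < z) for a standard normal, written with erfc to avoid rounding to zero.
p = 0.5 * erfc(-z / sqrt(2))

print(f"standard error = {se:.2f}, z = {z:.2f}, probability = {p:.2e}")
```

The probability is effectively zero, so a sample mean of 950 would be extraordinarily unlikely if bulbs really lasted 1000 hours on average.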