INTRODUCTION TO ECONOMIC STATISTICS Topic 7 The Central Limit Theorem These slides are copyright © 2010 by Tavis Barr. This work is licensed under a Creative.


1 INTRODUCTION TO ECONOMIC STATISTICS Topic 7 The Central Limit Theorem These slides are copyright © 2010 by Tavis Barr. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. See http://creativecommons.org/licenses/by-sa/3.0/ for further information.

2 The Central Limit Theorem ● Making Data Conform to Probability Theory ● The Law of Large Numbers ● The Central Limit Theorem – Known Population Mean and Variance – Known Population Mean, Unknown Variance

3 From Probability to Statistics ● Probability theory tells us how samples behave when we know population parameters ● Such problems are unusual because we don't usually know these parameters ● How can we get the variables in our sample data to act like the random variables in probability theory?

4 The i.i.d. assumption ● We make two assumptions: observations in a sample are independent, and they are identically distributed ● This property is abbreviated as i.i.d. (lower case). An i.i.d. sample is sometimes called a random sample.

5 Independent Observations ● Observations are independent when knowing the value of one observation in a sample does not tell us anything about the value of other observations in that sample ● Some exceptions: – Time series data: Values in nearby years are correlated – Panel data (people, states, stores, etc., followed over time): Characteristics of individuals are persistent over time

6 Identical Distribution ● Observations are identically distributed if they are all draws from a random variable with the same distribution and parameters – Again, panel data is an exception ● We don't necessarily know what the distribution is (Normal, binomial, etc., or something unusual); we just assume that it's always the same one.

7 How to achieve an i.i.d. sample? ● One method: A probability sample. Each member of the population is observed with equal probability ● If the population comes from a single distribution, then the sample will be a set of i.i.d. observations

8 How to achieve an i.i.d. sample? ● Whether the population comes from a single distribution can be a matter of perspective – Many populations have subgroups (region, demographic group, etc.) – We might look at subgroups separately if: ● Differences are systematic ● We have good information on the sub-group level ● We have enough information about each sub-group

9 How to Construct a Probability Sample? ● One method: Assign a number to every member in the population, write them out in random order, and pick every 10th or 20th or 50th member – For example, if everyone has a phone number, pick phone numbers at random – Or, if every student has a Social Security number, pick Social Security numbers at random
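The "random order, then every k-th member" method above can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the population of 1,000 numbered members, and the step of 10 are assumptions made for the example, not part of the slides.

```python
import random

def systematic_sample(population, step, seed=None):
    """Write the population out in random order, then keep every `step`-th member."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(population)
    rng.shuffle(shuffled)
    # Positions step, 2*step, 3*step, ... (1-based) in the shuffled list
    return shuffled[step - 1::step]

members = range(1, 1001)               # hypothetical population: IDs 1..1000
sample = systematic_sample(members, step=10, seed=0)
print(len(sample))                     # 100 members, one in every ten
```

Because the list is shuffled first, every member has the same 1-in-10 chance of being picked.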

10 How to Construct a Probability Sample? ● Stratified Sampling: – Divide population into groups, choose any given group with probability proportionate to group size – Construct a probability sample within each group – Choose a sample size for each group in proportion to its size in the population ● Examples: – Divide country into area codes and phone exchanges, sample evenly within exchanges – Divide country into colleges, sample evenly within college in proportion to college size
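A minimal sketch of the proportional-allocation step (sample size for each group in proportion to its size), assuming three hypothetical colleges labelled "A", "B" and "C" and a total sample of 100. The function name and the rounding rule are illustrative choices, not something the slides specify.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample inside each group, sized in proportion to the group."""
    rng = random.Random(seed)
    pop_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Group's share of the sample equals its share of the population.
        # Rounding means the pieces may not add up to exactly total_n in general.
        n_group = round(total_n * len(members) / pop_size)
        sample[name] = rng.sample(list(members), n_group)
    return sample

# Hypothetical student IDs: college A has 6000 students, B has 3000, C has 1000
colleges = {"A": range(0, 6000), "B": range(6000, 9000), "C": range(9000, 10000)}
picked = stratified_sample(colleges, total_n=100, seed=1)
print({name: len(ids) for name, ids in picked.items()})  # {'A': 60, 'B': 30, 'C': 10}
```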

11 Central Limit Theorem ● It turns out that if our samples are independent and identically distributed, we can predict the behavior of large samples. ● The law of large numbers and the central limit theorem are two of the basic ways of doing this

12 Central Limit Theorem – Assumptions ● We need two assumptions for the Law of Large Numbers and the Central Limit Theorem to work: 1. The sample is i.i.d. 2. The variable that the sample is from has a finite population mean and variance ● Infinite means and variances can create problems in physics; not so much in business and economics. The i.i.d. assumption is more important.

13 Law of Large Numbers ● The law of large numbers says that if these two assumptions are satisfied, then the sample mean approaches the population mean with probability one as the sample becomes infinitely large. ● This is of limited practical use because it doesn't tell us how close the sample mean gets and how big the sample has to be.
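A small simulation can make the Law of Large Numbers concrete. The sketch below assumes a fair six-sided die (population mean 3.5) as the random variable; the die, the sample sizes and the seed are illustrative choices, not taken from the slides.

```python
import random
import statistics

def die_sample_mean(n, rng):
    """Mean of n i.i.d. rolls of a fair six-sided die (population mean 3.5)."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return statistics.fmean(rolls)

rng = random.Random(42)                # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    # The printed means should drift toward 3.5 as n grows
    print(n, round(die_sample_mean(n, rng), 3))
```

The output shows the sample mean settling near 3.5, but, as the slide notes, the law itself does not say how close it will be for any given n.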

14 Central Limit Theorem Background ● The Central Limit Theorem assumes that we're looking at a variable with population mean μ and population variance σ². ● If a sample is a sample of draws from a random variable, then the sample mean, X̄, is an arithmetic function of that variable. ● So it's essentially a draw from a slightly different random variable

15 Central Limit Theorem Background ● If we think of the sample mean as a variable, then we call its mean the expected value and its standard deviation the standard error. ● The Central Limit Theorem has the same two requirements as the Law of Large Numbers (random sample; finite mean and variance). Additionally, as a rule of thumb, the sample size should be at least 30 for the Normal approximation to be reliable.

16 Central Limit Theorem – Result The Central Limit Theorem states that if these assumptions are satisfied, then: 1. The sample mean is approximately Normally distributed, regardless of the distribution of the original variable 2. The sample mean has the same expected value as the population mean, i.e., E(X̄) = μ 3. The standard error of the sample mean is σ/√n
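The expected value and standard error of the sample mean can be checked by simulation. The sketch below assumes an Exponential(1) population, chosen only because it is clearly non-Normal and has mean 1 and standard deviation 1; the sample size of 50 and the 5,000 repetitions are likewise illustrative.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from Exponential(1) and return their means."""
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # population mean 1, sd 1
        means.append(statistics.fmean(sample))
    return means

n = 50
means = simulate_sample_means(n, reps=5_000, rng=random.Random(7))
print(round(statistics.fmean(means), 3))   # should be close to the population mean, 1
print(round(statistics.stdev(means), 3))   # should be close to sigma / sqrt(n)
print(round(1 / math.sqrt(n), 3))          # theoretical standard error, about 0.141
```

A histogram of `means` would look roughly bell-shaped even though the underlying exponential distribution is strongly skewed.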

17 Example of Central Limit Theorem ● According to the New York Post, there were 0.44 pedestrian deaths per day in New York City in 2006 Source: http://www.nypost.com/seven/04122007/news/regionalnews/pedestrian_deaths_drop_regionalnews_frankie_edozien.htm ● Pedestrian deaths are likely to follow a Poisson distribution. In this case, the standard deviation would be....

18 Example of Central Limit Theorem ● There were 0.44 pedestrian deaths per day in New York City in 2006, with a standard deviation of 0.66 ● Suppose we choose a sample of 49 random days in 2006 ● What is the probability that the average death rate on those 49 days is 0.5 or lower? – Remember: X̄ ~ N(μ, σ/√n)
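One way to work this question through, using Python's standard-library `statistics.NormalDist`. The variable names are illustrative; the numbers are the ones on the slide. For a Poisson variable the standard deviation is the square root of the mean, which is where 0.66 comes from.

```python
from math import sqrt
from statistics import NormalDist

mu, n = 0.44, 49
sigma = round(sqrt(mu), 2)             # Poisson: sd = sqrt(mean) = 0.66 to two places
standard_error = sigma / sqrt(n)       # 0.66 / 7

# Sampling distribution of the mean under the CLT: N(mu, sigma / sqrt(n))
sample_mean_dist = NormalDist(mu, standard_error)
p = sample_mean_dist.cdf(0.5)          # P(sample mean <= 0.5)

print(round(standard_error, 4))        # 0.0943
print(round(p, 3))                     # roughly 0.74
```

Equivalently, the z-score is (0.5 - 0.44)/0.0943, about 0.64, and the probability is the standard normal table value for that z.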

19 Example of Central Limit Theorem ● The mean annual rainfall in Williamsburg, Virginia is 4.19 inches, and the standard deviation is 2.49. Source: http://ams.confex.com/ams/pdfpapers/28807.pdf ● Suppose that we survey Williamsburg on 36 random days ● What is the probability that the average rainfall on those 36 days is at least 5 inches? – Remember: X̄ ~ N(μ, σ/√n)
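A sketch of the calculation, taking the slide's figures at face value (μ = 4.19, σ = 2.49, n = 36) and converting to a z-score first. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4.19, 2.49, 36
standard_error = sigma / sqrt(n)       # 2.49 / 6 = 0.415

z = (5 - mu) / standard_error          # how many standard errors 5 lies above mu
p = 1 - NormalDist().cdf(z)            # upper tail: P(sample mean >= 5)

print(round(z, 2))                     # about 1.95
print(round(p, 3))                     # roughly 0.025
```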

20 Example of Central Limit Theorem ● Suppose we produce soda. Our quality control engineer claims that our bottles of soda have a mean contents of 2000 ml and a standard deviation of 2 ml. ● We take a sample of 100 bottles. How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less?

22 Example of Central Limit Theorem ● We take a sample of 100 bottles. How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less? – The sample mean will be normally distributed. It will have an expected value of 2000, and a standard error of 2/√100 = 0.2

23 Example of Central Limit Theorem ● We take a sample of 100 bottles. How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less? – The sample mean will be normally distributed. It will have an expected value of 2000, and a standard error of 2/√100 = 0.2 – So we want to know the probability that a Normally distributed variable with mean 2000 and standard deviation 0.2 is less than 1999.5

24 Example of Central Limit Theorem ● We take a sample of 100 bottles. How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less? – So we want to know the probability that a Normally distributed variable with mean 2000 and standard deviation 0.2 is less than 1999.5 – This is the same as the probability that a standard normal variable is less than (1999.5 – 2000)/0.2 = -2.5.
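The last step, looking up the standard normal probability for z = -2.5, can be done with `statistics.NormalDist` instead of a printed table. This is a sketch with illustrative variable names; the inputs are the slide's.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2000, 2, 100
standard_error = sigma / sqrt(n)            # 2 / 10 = 0.2
z = (1999.5 - mu) / standard_error          # -2.5

p = NormalDist().cdf(z)                     # P(Z < -2.5) for a standard normal Z
print(z, round(p, 4))                       # -2.5 0.0062
```

So if the engineer's claim is right, a sample mean of 1999.5 ml or less would turn up in well under 1% of samples of this size.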

25 Another Example of CLT ● Suppose we know that the mean marital age of men in the U.S. is 24.8 years and the standard deviation is 2.5 years. ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more?

26 Another Example of CLT ● Suppose we know that the mean marital age of men in the U.S. is 24.8 years and the standard deviation is 2.5 years. ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – Sample mean will be a Normal variable with mean 24.8 and standard deviation 2.5/√60 = 2.5/7.75 = 0.32

27 Another Example of CLT ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – Sample mean will be a Normal variable with mean 24.8 and standard deviation 2.5/√60 = 2.5/7.75 = 0.32 – What is the probability that a Normal variable with mean 24.8 and standard deviation 0.32 is at least 25.1?

28 Another Example of CLT ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – What is the probability that a Normal variable with mean 24.8 and standard deviation 0.32 is at least 25.1? – Same as the probability that a standard normal is at least (25.1 – 24.8)/0.32 = 0.3/0.32 = 0.9375.

29 Another Example of CLT ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – Same as the probability that a standard normal is at least (25.1 – 24.8)/0.32 = 0.3/0.32 = 0.9375. – From the table, P(z < 0.94) is 0.826.

30 Another Example of CLT ● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more? – Same as the probability that a standard normal is at least (25.1 – 24.8)/0.32 = 0.3/0.32 = 0.9375. – From the table, P(z < 0.94) is 0.826. – So P(z > 0.94) = 1 – 0.826 = 0.174.
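The table-based answer can be cross-checked numerically. The sketch below computes the probability twice: once following the slide's rounding (standard error 0.32, z rounded to 0.94) and once without intermediate rounding. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 24.8, 2.5, 60
std_normal = NormalDist()                    # mean 0, standard deviation 1

# Following the slide's rounded steps: z = 0.3 / 0.32 = 0.9375, looked up as 0.94
p_table = 1 - std_normal.cdf(0.94)

# Without intermediate rounding
standard_error = sigma / sqrt(n)             # about 0.3227
z_exact = (25.1 - mu) / standard_error       # about 0.93
p_exact = 1 - std_normal.cdf(z_exact)

print(round(p_table, 3))                     # 0.174, matching the slide
print(round(p_exact, 3))                     # about 0.176
```

The two answers differ only in the third decimal place, which is the usual cost of rounding z to fit a printed table.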

31 What if we don't know σ? ● Sometimes we know the population mean, but not the population standard deviation ● In this case, we can substitute the sample standard deviation, s, for the population standard deviation. ● Then, the result is that the sample mean is normally distributed with expected value μ and standard error s/√n

32 Example with unknown σ ● According to the United States Statistical Abstract, the average American consumed 0.307 pounds of red meat per day in 2004. Source: http://www.census.gov/compendia/statab/tables/07s0202.xls ● A random sample of 300 Americans finds that the average person consumed 0.4 pounds per day, with a standard deviation of 0.20. ● How probable is a sample mean this large or larger? (Remember: X̄ ~ N(μ, σ/√n), with s substituted for σ)
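A sketch of this calculation with s standing in for σ. Because the z-score here is very large, the upper-tail probability is computed with `math.erfc` rather than as 1 minus a CDF, which would lose most of its precision; that numerical choice is mine, not the slides'.

```python
from math import erfc, sqrt

mu, n = 0.307, 300
sample_mean, s = 0.4, 0.20                   # s replaces the unknown sigma

standard_error = s / sqrt(n)                 # about 0.0115
z = (sample_mean - mu) / standard_error      # about 8.05

# Upper tail of the standard normal: P(Z >= z) = 0.5 * erfc(z / sqrt(2))
p = 0.5 * erfc(z / sqrt(2))

print(round(z, 2))
print(p)                                     # a vanishingly small probability
```

A sample mean eight standard errors above μ is essentially impossible under the stated population mean.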

33 Example with unknown σ ● According to the BJS, the average length of a prison sentence in 2004 was 57 months. Source: http://www.ojp.usdoj.gov/bjs/pub/pdf/fssc04.pdf ● In a random sample of 200 prisoners, the average sentence is 60 months and the standard deviation is 25 months. ● What is the probability of obtaining a sample mean of between 54 and 60 months? (Remember: X̄ ~ N(μ, σ/√n), with s substituted for σ)
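For an interval such as this one, the probability is the difference of two CDF values. A minimal sketch, again substituting s for σ; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, n, s = 57, 200, 25
standard_error = s / sqrt(n)                     # about 1.77

sample_mean_dist = NormalDist(mu, standard_error)
# P(54 <= sample mean <= 60) = F(60) - F(54)
p = sample_mean_dist.cdf(60) - sample_mean_dist.cdf(54)

print(round(standard_error, 2))                  # 1.77
print(round(p, 2))                               # about 0.91
```

Both endpoints lie 3 months from μ, about 1.7 standard errors on either side, so the interval is symmetric around the population mean.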

34 Example with unknown σ ● Suppose a company claims that its light bulbs last an average of a thousand hours. ● We take a sample of 500 light bulbs. The average bulb in the sample lasts 950 hours, and the sample standard deviation is 100 hours. ● What is the probability of observing a sample mean this small?

35 Example with unknown σ ● Suppose a company claims that its light bulbs last an average of a thousand hours. ● We take a sample of 500 light bulbs. The average bulb in the sample lasts 950 hours, and the sample standard deviation is 100 hours. ● What is the probability of observing a sample mean this small? – Here μ = 1000, σ = ?, n = 500, X̄ = 950, s = 100

36 Example with unknown σ ● Recap: – Population mean (μ) of 1000, population standard deviation (σ) unknown – Sample size (n) 500, sample mean (X̄) 950, sample standard deviation (s) 100 ● What is the probability of X̄ this small or smaller? – X̄ is Normal with mean 1000, std error 100/√500 = 100/22.36 = 4.47 – P(X̄ < 950) is the same as P(z < [950 – 1000]/4.47), i.e., P(z < -11.18)
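Printed z-tables stop long before -11.18, so the final probability has to be computed. The sketch below uses `math.erfc` for the lower tail; the helper name `lower_tail` is a hypothetical convenience, not something from the slides.

```python
from math import erfc, sqrt

def lower_tail(z):
    """P(Z <= z) for a standard normal Z, accurate even for very negative z."""
    return 0.5 * erfc(-z / sqrt(2))

mu, n = 1000, 500
sample_mean, s = 950, 100

standard_error = s / sqrt(n)                 # 100 / 22.36 = 4.47
z = (sample_mean - mu) / standard_error      # -11.18

print(round(standard_error, 2), round(z, 2)) # 4.47 -11.18
print(lower_tail(z))                         # effectively zero
```

A result this far out in the tail says that a sample mean of 950 hours is not plausible if the true mean really is 1000 hours.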

