Econ 488 Lecture 1 Cameron Kaplan
What is Econometrics? Applying quantitative and statistical methods to study economic principles. Econometrics has evolved as a separate discipline from statistics because it mainly focuses on non-experimental data. Multiple regression is used in both econometrics and statistics, but the interpretation is different.
What is Econometrics? Economists have devised new techniques to deal with the complexities of economic data and to test predictions of economic theories.
Uses of Econometrics Description of economic reality. Testing hypotheses about economic theory. Forecasting future economic activity
Probability Imagine two dice - a red die and a green die. We define a random variable X to be the sum of the two dice. e.g. if we roll a 5 on the red die, and a 2 on the green die, X=7.
Probability Distribution What is the probability the red die = 2? 1/6. What is the probability the green die = 5? 1/6. What is the probability red = 2 and green = 5? The two dice are independent, so 1/6 * 1/6 = 1/36.
Probability Distribution What is the probability X = 2? Only one roll gives a sum of 2 (red = 1 and green = 1), so 1/6 * 1/6 = 1/36. What is the probability X = 5?
Probability Distribution
Value of X for each roll (rows: red die, columns: green die)
Red \ Green    1    2    3    4    5    6
1              2    3    4    5    6    7
2              3    4    5    6    7    8
3              4    5    6    7    8    9
4              5    6    7    8    9   10
5              6    7    8    9   10   11
6              7    8    9   10   11   12
Probability Distribution
Count how many of the 36 equally likely rolls give each value of X, then divide by 36.
X     Freq   Prob
2     1      1/36
3     2      2/36
4     3      3/36
5     4      4/36
6     5      5/36
7     6      6/36
8     5      5/36
9     4      4/36
10    3      3/36
11    2      2/36
12    1      1/36
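The frequency table can be checked by brute force. The following is a minimal sketch in Python (not part of the original slides) that lists all 36 rolls of the red and green dice and tallies the sum; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (red, green) rolls and tally the sum X.
faces = range(1, 7)
freq = Counter(red + green for red, green in product(faces, faces))

# Each roll has probability 1/36, so Prob(X = x) = freq / 36.
# Fraction keeps the arithmetic exact (it prints 2/36 in lowest terms as 1/18).
prob = {x: Fraction(count, 36) for x, count in sorted(freq.items())}

for x, p in prob.items():
    print(f"X = {x:2d}   freq = {freq[x]}   prob = {freq[x]}/36")
```

The probabilities add up to one, as they must for any probability distribution.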
Expected Value of a Random Variable $E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots + x_n p_n = \sum_{i=1}^{n} x_i p_i$. The expected value is also called the population mean, or $\mu_X$.
Expected Value
xi    pi      xi*pi
2     1/36    2/36
3     2/36    6/36
4     3/36    12/36
5     4/36    20/36
6     5/36    30/36
7     6/36    42/36
8     5/36    40/36
9     4/36    36/36
10    3/36    30/36
11    2/36    22/36
12    1/36    12/36
E(X) = 2/36 + 6/36 + 12/36 + 20/36 + 30/36 + 42/36 + 40/36 + 36/36 + 30/36 + 22/36 + 12/36
E(X) = 252/36
E(X) = 7
Population Variance ($\sigma^2$) $\sigma^2 = E[(X-\mu)^2]$, so for a discrete random variable $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p_i$
Population Variance
xi    pi      xi - μ   (xi - μ)^2   (xi - μ)^2 * pi
2     1/36    -5       25           25/36
3     2/36    -4       16           32/36
4     3/36    -3       9            27/36
5     4/36    -2       4            16/36
6     5/36    -1       1            5/36
7     6/36    0        0            0
8     5/36    1        1            5/36
9     4/36    2        4            16/36
10    3/36    3        9            27/36
11    2/36    4        16           32/36
12    1/36    5        25           25/36
$\sigma^2 = 210/36 \approx 5.83$
Standard Deviation ($\sigma$) = $\sqrt{\sigma^2} = \sqrt{5.83} \approx 2.41$
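The mean and variance of the two-dice sum can be reproduced in a few lines. This is a sketch in Python (not from the original slides) that applies the two definitions, E(X) as the probability-weighted sum of the values and the variance as the probability-weighted sum of squared deviations; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Probability distribution of X = red + green, built from the 36 rolls.
prob = {}
for red, green in product(range(1, 7), repeat=2):
    prob[red + green] = prob.get(red + green, 0) + Fraction(1, 36)

# E(X): each value weighted by its probability.
mean = sum(x * p for x, p in prob.items())

# Variance: squared deviation from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in prob.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(float(variance))

print("E(X) =", mean)
print("variance =", variance, "=", float(variance))
print("standard deviation =", std_dev)
```

The exact variance is 210/36, which reduces to 35/6. Taking the square root before rounding the variance gives a standard deviation of roughly 2.415, which is why a hand calculation that starts from the rounded 5.83 can differ in the last digit.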
Continuous Random Variables Imagine a variable that is equally likely to take on any value between 55 and 75.
Continuous Random Variables What is the probability X = 65 (exactly)? Zero! We need to think about probability in a range.
Continuous Random Variables $f(x) = 0.05$ for $55 \le X \le 75$, $f(x) = 0$ otherwise. What is the probability X is between 55 and 56? Width times height: $(56 - 55) \times 0.05 = 0.05$
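For a flat density, the probability of a range is just the width of the range times the height of the density. A small Python sketch of that idea follows; the function name `uniform_prob` and its arguments are made up for illustration and are not from the slides.

```python
def uniform_prob(a, b, low=55.0, high=75.0):
    """P(a <= X <= b) when X is equally likely to be anywhere in [low, high]."""
    density = 1.0 / (high - low)                     # 1/20 = 0.05, the height of f(x)
    overlap = max(0.0, min(b, high) - max(a, low))   # width of [a, b] inside [low, high]
    return density * overlap

print(uniform_prob(55, 56))   # a range one unit wide
print(uniform_prob(65, 65))   # a single point has zero width
print(uniform_prob(55, 75))   # the whole range
```

The second call shows why the probability that X is exactly 65 is zero: a single point has no width, so it has no area under the curve.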
Continuous Probability Density Functions Probability distributions can take on many shapes. The area under the curve must equal one.
Continuous Probability Density Functions What is f(x)? $f(x) = 1.5 - 0.02X$ for $65 \le X \le 75$, $f(x) = 0$ otherwise.
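As a check that this f(x) is a valid density, the area under it between 65 and 75 can be worked out directly. The derivation below is a worked example added for illustration and is not part of the original slides.

```latex
\int_{65}^{75} (1.5 - 0.02x)\,dx
  = \Big[\, 1.5x - 0.01x^{2} \,\Big]_{65}^{75}
  = (112.5 - 56.25) - (97.5 - 42.25)
  = 56.25 - 55.25
  = 1
```

The density starts at f(65) = 0.2 and falls to f(75) = 0, so the same answer comes from the area of a triangle: (1/2)(10)(0.2) = 1.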
The Normal Distribution (AKA Gaussian Distribution)
Central Limit Theorem The sum (or mean) of a large number of independent and identically distributed random variables will be approximately normally distributed.
Standard Normal Distribution
Standardized Normal Variable $z = (x - \mu)/\sigma$ Pr[-1 < z < 1] = 0.6826 Pr[-2 < z < 2] = 0.9544 Pr[-3 < z < 3] = 0.9974
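These three probabilities can be recomputed without a z table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `prob_within` is made up for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_within(k):
    """Pr[-k < z < k] for a standard normal variable."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"Pr[-{k} < z < {k}] = {prob_within(k):.4f}")
```

The computed values can differ from the slide in the fourth decimal place, because the slide's numbers come from a printed table whose entries are already rounded to four digits.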
Height Analyzer Go to http://www.shortsupport.org Click on the “Research” Tab, and select height analyzer
Height Analyzer Men: Mean height = 5’8.5” Standard Dev = 2.75” Women: Mean height = 5’3.5” Standard Dev = 2.5” What is the probability that a random woman is between 5’1” and 5’3”?
Height Analyzer Convert to inches: 5’1” = 61”, 5’3” = 63”, 5’3.5” = 63.5”. Standardize: z1 = (61 - 63.5)/2.5 = -1, z2 = (63 - 63.5)/2.5 = -0.2. Look up both values on the z table (pg. 621)
Area to the left of z1 = -1: 0.1587
Area to the left of z2 = -0.2: 0.4207
Shaded area = 0.4207 - 0.1587 = 0.262, so about 26.2% of women are between 5’1” and 5’3”.
Height Analyzer What percentage of men are taller than 6’4”? X = 6’4” = 76”, $\mu$ = 5’8.5” = 68.5”, z = (76 - 68.5)/2.75 = 2.727. The area to the right of 2.727 on the standard normal curve is only 0.0032. Only 0.32% of men are taller than 6’4” (about one in 300)
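Both height questions can be answered without the z table. The sketch below uses `NormalDist` from Python's standard `statistics` module with the means and standard deviations given on the slides, and assumes, as the slides do, that heights are normally distributed. The variable names are illustrative.

```python
from statistics import NormalDist

# Heights in inches, using the means and standard deviations from the slides.
women = NormalDist(mu=63.5, sigma=2.5)
men = NormalDist(mu=68.5, sigma=2.75)

# Women between 5'1" (61") and 5'3" (63"): area to the left of 63
# minus area to the left of 61.
p_women = women.cdf(63) - women.cdf(61)

# Men taller than 6'4" (76"): area to the right of 76.
p_men = 1 - men.cdf(76)

print(f"women between 61 and 63 inches: {p_women:.3f}")
print(f"men taller than 76 inches:      {p_men:.4f}")
```

`NormalDist` standardizes internally, so there is no need to convert to z first; the z step only matters when reading a printed table.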
Sampling This is the most important thing you could have learned from prob/stats. Population - the entire group (e.g. height for the entire US population). Mean of population = $\mu$, Variance = $\sigma^2$
Sampling Sample - the part of the population you observe (e.g. the subjects in the NHANES). Sample mean = $\bar{x}$, Variance = $s^2$. We use the sample to draw conclusions about the population.
Sampling Distributions Suppose we want to estimate $\mu$. Sample average = $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$. Suppose we want to know how good an estimate x-bar is of $\mu$. We create the sampling distribution.
Sampling Distributions Sampling Distribution - the probability distribution of all of the possible values of a statistic, in this case x-bar. Due to the central limit theorem, the sampling distribution of x-bar is approximately normal.
Estimators X-bar is an estimator of $\mu$. Unbiased Estimator - An estimator is unbiased if its mean is equal to the population parameter. $E(\bar{x}) = \mu$, so x-bar is an unbiased estimator!
Standard Deviation The standard deviation of x-bar is $\sigma_{\bar{x}} = \sigma/\sqrt{N}$. As N increases, the standard deviation shrinks. Also notice that we can’t calculate this unless we know the population parameter $\sigma$, which we almost never do.
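A simulation makes the sampling distribution concrete. The sketch below (not from the original slides) treats the two-dice sum as the population, for which the mean is 7 and the variance is 35/6, draws many samples of size N, and looks at how the resulting x-bars behave. The seed, the number of repetitions, and the names `simulate` and `SIGMA` are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(488)  # arbitrary seed, chosen only so the run is repeatable

SIGMA = sqrt(35 / 6)  # population standard deviation of the two-dice sum

def simulate(n, reps=2000):
    """Draw `reps` samples of size n; return the mean and std. dev. of the x-bars."""
    xbars = []
    for _ in range(reps):
        rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(n)]
        xbars.append(sum(rolls) / n)
    return mean(xbars), stdev(xbars)

for n in (5, 30, 120):
    centre, spread = simulate(n)
    print(f"N = {n:3d}  mean of x-bar = {centre:.2f}  "
          f"sd of x-bar = {spread:.3f}  sigma/sqrt(N) = {SIGMA / sqrt(n):.3f}")
```

In each row the x-bars should centre near 7 (unbiasedness), and their spread should sit close to sigma divided by the square root of N, shrinking as N grows. Here sigma is known only because the population was constructed by hand; with real data it would have to be estimated.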
Sampling Variance $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$. Notice that this is divided by N-1. If we divide by N, the estimator is too low on average.
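The effect of dividing by N-1 rather than N can be seen exactly in a tiny example. The sketch below (not from the original slides) takes a single fair die as the population, lists every possible sample of size 3, and averages the two candidate variance estimates over all of them. The function name `avg_variance_estimate` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Population mean and variance of a single fair die.
pop_mean = Fraction(sum(faces), 6)
pop_var = sum((x - pop_mean) ** 2 for x in faces) / 6

def avg_variance_estimate(n, divisor):
    """Average of sum((x - xbar)^2) / divisor over every equally likely sample of size n."""
    samples = list(product(faces, repeat=n))
    total = Fraction(0)
    for sample in samples:
        xbar = Fraction(sum(sample), n)
        total += sum((x - xbar) ** 2 for x in sample) / divisor
    return total / len(samples)

n = 3
print("population variance:  ", pop_var)
print("average, divide by N-1:", avg_variance_estimate(n, n - 1))
print("average, divide by N:  ", avg_variance_estimate(n, n))
```

Dividing by N-1 reproduces the population variance on average, while dividing by N falls short by a factor of (N-1)/N. The shortfall comes from measuring deviations from x-bar, which is by construction closer to the sample than the true mean is.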
Standard Error When the standard deviation of an estimator is estimated from the data, it is called the standard error. The standard error of