Econ 488 Lecture 3 Cameron Kaplan
Announcements
Midterm date change: now October 22. The syllabus will be updated soon.
Library session: October 8.
How Regression Works
Estimate the slope of the line that passes through the origin given the following data point: (3, 1).

How Regression Works
Try this one… Estimate the slope of the line that passes through the origin given the following data point: (4, 2).
Answers
1. (3, 1): slope = 1/3
2. (4, 2): slope = 1/2
How did you get that? For a line through the origin, slope = Y/X.
How Regression Works
Now suppose we have two points: (3, 1) and (4, 2). Estimate the slope of the line that passes through the origin given these data.
Possible Estimators
1. Average of the two slopes: $\hat{\beta} = \frac{1}{2}\left(\frac{Y_1}{X_1} + \frac{Y_2}{X_2}\right)$.
Possible Estimators
2. Midpoint estimator: the slope of the line through the origin and the midpoint of the two points, which is (3.5, 1.5):
$$\hat{\beta} = \frac{1.5}{3.5} = \frac{1 + 2}{3 + 4} = \frac{3}{7}$$
Possible Estimators
3. Ordinary Least Squares (OLS): we want a line that is as close as possible to all of the points.
Ordinary Least Squares
We want to find a line that makes the residuals, $e_1$ and $e_2$, as small as possible.
Ordinary Least Squares
(1) Equation of the fitted line: $\hat{y}_i = \hat{\beta} x_i$ (pronounced: "y-i-hat is equal to beta-hat x-i").
(2) The underlying data generating process: $y_i = \beta x_i + \epsilon_i$ (notice there are no hats).
(3) Finally, the observed values of X and Y can be described by: $y_i = \hat{\beta} x_i + e_i$.
$e_i$ is the "residual", which is actually observed. $\epsilon_i$ is the "stochastic error term", which is never observed.
Ordinary Least Squares
By equations (1) and (3), we can see that $e_i = y_i - \hat{y}_i = y_i - \hat{\beta} x_i$. So we want to choose a line that makes each $e_i$ as small as possible. But $e_i$ can be negative or positive, so we can't just minimize $\sum_i e_i$.
Ordinary Least Squares
We could choose a $\hat{\beta}$ that minimizes the sum of the absolute values of the residuals, $\sum_i |e_i|$. This is what is called the "Least Absolute Deviations" (LAD) method. However, this is mathematically difficult, and there is another way that is better: minimize $\sum_i e_i^2$! Since $e_i^2$ is never negative, we can minimize its sum.
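To make the contrast concrete, here is a small sketch (plain NumPy, brute-force grid search rather than any closed-form rule; variable names are mine) that minimizes both criteria for the running two-point example:

```python
import numpy as np

# The two example points: (3, 1) and (4, 2).
x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

# Grid of candidate slopes for a line through the origin.
betas = np.linspace(0.0, 1.0, 100001)

# Sum of absolute residuals (LAD) vs. sum of squared residuals (OLS) at each slope.
abs_loss = np.array([np.abs(y - b * x).sum() for b in betas])
sq_loss = np.array([((y - b * x) ** 2).sum() for b in betas])

print("LAD slope:", betas[abs_loss.argmin()])  # 0.5 -- passes exactly through (4, 2)
print("OLS slope:", betas[sq_loss.argmin()])   # 0.44 = 11/25
```

Here LAD snaps to a line through one of the points exactly, while the squared-error criterion compromises between the two points, which is why the two methods generally give different answers.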
Ordinary Least Squares
Choose a $\hat{\beta}$ that minimizes $\sum_i e_i^2$. Remember, $e_i = y_i - \hat{\beta} x_i$, and we want to minimize the sum of its square. This is equivalent to choosing $\hat{\beta}$ to minimize $\sum_i (y_i - \hat{\beta} x_i)^2$.
Ordinary Least Squares
Using calculus, the first order condition (FOC) for a minimum is that the first derivative is equal to zero. Take the derivative with respect to $\hat{\beta}$:
$$\frac{d}{d\hat{\beta}} \sum_i (y_i - \hat{\beta} x_i)^2 = -2 \sum_i x_i (y_i - \hat{\beta} x_i) = 0$$
Solve for $\hat{\beta}$:
$$\hat{\beta} = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$$
Our Example
What is the OLS slope estimate for our example? $Y_1 = 1$, $X_1 = 3$, $Y_2 = 2$, $X_2 = 4$. So,
$$\hat{\beta} = \frac{X_1 Y_1 + X_2 Y_2}{X_1^2 + X_2^2} = \frac{(3)(1) + (4)(2)}{3^2 + 4^2} = \frac{11}{25} = 0.44$$
Possible Estimators
Now we have 3 estimators:
1. Average of the slopes to each point: $\hat{\beta} = \frac{1}{2}\left(\frac{1}{3} + \frac{1}{2}\right) = \frac{5}{12} \approx 0.4167$
Possible Estimators
2. Midpoint: $\hat{\beta} = \frac{3}{7} \approx 0.4286$
3. Ordinary Least Squares: $\hat{\beta} = \frac{11}{25} = 0.44$
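All three estimators reduce to one-line formulas, so they are easy to check numerically. A minimal NumPy sketch on the example points (variable names are mine, not from the slides):

```python
import numpy as np

# The example data: points (3, 1) and (4, 2).
x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

avg_of_slopes = (y / x).mean()        # (1/3 + 1/2) / 2 = 5/12 ~ 0.4167
midpoint = y.sum() / x.sum()          # (1 + 2) / (3 + 4) = 3/7 ~ 0.4286
ols = (x * y).sum() / (x ** 2).sum()  # 11 / 25 = 0.44

print(avg_of_slopes, midpoint, ols)
```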
Possible Estimators
Which estimator is best? Let's try an exercise.
OLS with an intercept term
With an intercept, the equation of the fitted line is $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. Minimizing $\sum_i e_i^2$ gives the standard estimators:
$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
Example: Height and Shoe Size
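The slide's actual data set is not reproduced in this transcript, so the sketch below applies the intercept formulas from the previous slide to made-up height and shoe-size numbers, purely to show the mechanics:

```python
import numpy as np

# Hypothetical data, invented for illustration only -- NOT the slide's actual sample.
height = np.array([62.0, 65.0, 67.0, 70.0, 72.0, 74.0])  # inches
shoe = np.array([7.0, 8.0, 9.0, 10.5, 11.0, 12.0])       # US shoe size

# OLS with an intercept, using the formulas from the previous slide.
xbar, ybar = height.mean(), shoe.mean()
b1 = ((height - xbar) * (shoe - ybar)).sum() / ((height - xbar) ** 2).sum()
b0 = ybar - b1 * xbar

print(f"predicted shoe size = {b0:.2f} + {b1:.2f} * height")
```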
Sum of Squares
How much of the variation in the dependent variable is explained by the estimated regression equation?
Total Sum of Squares (TSS): how spread out are the y values in the sample? $TSS = \sum_i (y_i - \bar{y})^2$
Explained Sum of Squares (ESS): the sample variation in $\hat{y}_i$: $ESS = \sum_i (\hat{y}_i - \bar{y})^2$
Sum of Squares
Residual Sum of Squares (RSS): the sample variation in $e_i$: $RSS = \sum_i e_i^2$
$TSS = ESS + RSS$: some of the variation in y can be explained by the regression, and some cannot. If the RSS is small relative to the TSS, the equation is a good fit.
R-squared
R-squared (or $R^2$) is the proportion of the variation in Y that is explained by the regression:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}, \qquad 0 \le R^2 \le 1$$
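A short NumPy sketch (made-up data, names mine) that verifies the $TSS = ESS + RSS$ decomposition and computes $R^2$ both ways:

```python
import numpy as np

# Hypothetical (x, y) sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit OLS with an intercept.
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
e = y - y_hat

tss = ((y - y.mean()) ** 2).sum()      # total sum of squares
ess = ((y_hat - y.mean()) ** 2).sum()  # explained sum of squares
rss = (e ** 2).sum()                   # residual sum of squares

print(np.isclose(tss, ess + rss))      # True: TSS = ESS + RSS
print("R-squared:", ess / tss)         # same as 1 - rss / tss
```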
Multiple Regression
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \dots + \epsilon_i$$
Each coefficient is a partial regression coefficient. $\beta_2$ is the change in Y associated with a one-unit increase in $X_2$, holding the other X's (i.e., $X_1$, $X_3$, $X_4$, etc.) constant.
Multiple Regression Example
Suppose we regress hourly wages on years of education, experience, and tenure, and the estimated coefficient on education is 0.599. This means that, on average, a one-year increase in education is associated with a $0.599 per hour increase in wages, holding experience and tenure constant.
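A simulation sketch of such a regression (NumPy least squares on fabricated data; the variable names, coefficients, and distributions are illustrative assumptions, not the slide's results, so the estimates will not match the 0.599 figure above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data for illustration only.
educ = rng.uniform(8, 20, n)    # years of education
exper = rng.uniform(0, 30, n)   # years of experience
tenure = rng.uniform(0, 15, n)  # years with current employer
wage = 1.0 + 0.6 * educ + 0.02 * exper + 0.15 * tenure + rng.normal(0, 2, n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), educ, exper, tenure])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)

# Each slope is a partial effect, holding the other regressors constant.
print(beta_hat)  # roughly [1.0, 0.6, 0.02, 0.15]
```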
Degrees of Freedom
How many more observations do you have than the number of coefficients you are trying to estimate? Can you estimate the slope and intercept given just one point? You always need at least as many observations as the number of coefficients you are estimating, but having more is better: extra observations are extra degrees of freedom.
Degrees of freedom $= n - k - 1$, where $n$ is the number of observations and $k$ is the number of slope coefficients (the extra 1 is the intercept).
R-squared vs. Adjusted R-squared
Whenever you add an extra variable, $R^2$ goes up (or at least does not fall). Why? The extra variable adds at least some explanatory power to the regression. However, adding another variable means one more coefficient to estimate, so degrees of freedom go down. So there is a benefit of adding an extra variable ($R^2$ goes up) and a cost (d.f. go down). Adjusted $R^2$ adjusts $R^2$ to account for the loss in degrees of freedom.
Adjusted R-squared
$$\bar{R}^2 = 1 - \frac{RSS/(n - k - 1)}{TSS/(n - 1)} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$
Note that it is possible to get a negative adjusted R-squared.
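A one-function sketch (the function name is mine) showing how a low $R^2$ combined with few degrees of freedom pushes the adjusted value below zero:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# A weak fit with several regressors can make adjusted R-squared negative:
print(adjusted_r2(r2=0.05, n=20, k=3))  # about -0.128
```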