Econ 488 Lecture 3 Cameron Kaplan
Announcements
Midterm date change: now October 22. The syllabus will be updated soon.
Library session: October 8.
How Regression Works
Estimate the slope of the line that passes through the origin given the following data point: (3, 1).

How Regression Works
Try this one… Estimate the slope of the line that passes through the origin given the following data point: (4, 2).
Answers
1. (3, 1): slope = 1/3
2. (4, 2): slope = 1/2
How did you get that? For a line through the origin, slope = Y/X.
How Regression Works
Now suppose we have two points: (3, 1) and (4, 2). Estimate the slope of the line that passes through the origin given these data.
Possible Estimators
1. Average of the two slopes: $\hat{\beta} = \frac{1}{2}\left(\frac{Y_1}{X_1} + \frac{Y_2}{X_2}\right)$.
Possible Estimators
2. Midpoint estimator: the slope of the line through the origin and the midpoint of the two points, which is (3.5, 1.5):
$$\hat{\beta} = \frac{1.5}{3.5} = \frac{1 + 2}{3 + 4} = \frac{3}{7}$$
Possible Estimators
3. Ordinary Least Squares (OLS): we want a line that is as close as possible to all of the points.
Ordinary Least Squares
We want to find a line that makes the residuals, $e_1$ and $e_2$, as small as possible.
Ordinary Least Squares
(1) Equation of the fitted line: $\hat{y}_i = \hat{\beta} x_i$ (pronounced: "y-i-hat is equal to beta-hat x-i").
(2) The underlying data generating process: $y_i = \beta x_i + \epsilon_i$ (notice there are no hats).
(3) Finally, the observed values of X and Y can be described by: $y_i = \hat{\beta} x_i + e_i$.
$e_i$ is the "residual", which is actually observed. $\epsilon_i$ is the "stochastic error term", which is never observed.
Ordinary Least Squares
By equations (1) and (3), we can see that $e_i = y_i - \hat{y}_i = y_i - \hat{\beta} x_i$. So we want to choose a line that makes each $e_i$ as small as possible. But $e_i$ can be negative or positive, so we can't just minimize $\sum_i e_i$.
Ordinary Least Squares
We could choose a $\hat{\beta}$ that minimizes the sum of the absolute values of the residuals, $\sum_i |e_i|$. This is what is called the "Least Absolute Deviations" (LAD) method. However, this is mathematically difficult, and there is another way that is better: minimize $\sum_i e_i^2$! Since $e_i^2$ is never negative, we can minimize its sum.
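To make the contrast concrete, here is a small sketch (plain NumPy, brute-force grid search rather than any closed-form rule; variable names are mine) that minimizes both criteria for the running two-point example:

```python
import numpy as np

# The two example points: (3, 1) and (4, 2).
x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

# Grid of candidate slopes for a line through the origin.
betas = np.linspace(0.0, 1.0, 100001)

# Sum of absolute residuals (LAD) vs. sum of squared residuals (OLS) at each slope.
abs_loss = np.array([np.abs(y - b * x).sum() for b in betas])
sq_loss = np.array([((y - b * x) ** 2).sum() for b in betas])

print("LAD slope:", betas[abs_loss.argmin()])  # 0.5 -- passes exactly through (4, 2)
print("OLS slope:", betas[sq_loss.argmin()])   # 0.44 = 11/25
```

Here LAD snaps to a line through one of the points exactly, while the squared-error criterion compromises between the two points, which is why the two methods generally give different answers.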
Ordinary Least Squares
Choose a $\hat{\beta}$ that minimizes $\sum_i e_i^2$. Remember, $e_i = y_i - \hat{\beta} x_i$, and we want to minimize the sum of its square. This is equivalent to choosing $\hat{\beta}$ to minimize $\sum_i (y_i - \hat{\beta} x_i)^2$.
Ordinary Least Squares
Using calculus, the first order condition (FOC) for a minimum is that the first derivative is equal to zero. Take the derivative with respect to $\hat{\beta}$:
$$\frac{d}{d\hat{\beta}} \sum_i (y_i - \hat{\beta} x_i)^2 = -2 \sum_i x_i (y_i - \hat{\beta} x_i) = 0$$
Solve for $\hat{\beta}$:
$$\hat{\beta} = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$$
Our Example
What is the OLS slope estimate for our example? $Y_1 = 1$, $X_1 = 3$, $Y_2 = 2$, $X_2 = 4$. So,
$$\hat{\beta} = \frac{X_1 Y_1 + X_2 Y_2}{X_1^2 + X_2^2} = \frac{(3)(1) + (4)(2)}{3^2 + 4^2} = \frac{11}{25} = 0.44$$
Possible Estimators
Now we have 3 estimators:
1. Average of the slopes to each point: $\hat{\beta} = \frac{1}{2}\left(\frac{1}{3} + \frac{1}{2}\right) = \frac{5}{12} \approx 0.4167$
Possible Estimators
2. Midpoint: $\hat{\beta} = \frac{3}{7} \approx 0.4286$
3. Ordinary Least Squares: $\hat{\beta} = \frac{11}{25} = 0.44$
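All three estimators reduce to one-line formulas, so they are easy to check numerically. A minimal NumPy sketch on the example points (variable names are mine, not from the slides):

```python
import numpy as np

# The example data: points (3, 1) and (4, 2).
x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

avg_of_slopes = (y / x).mean()        # (1/3 + 1/2) / 2 = 5/12 ~ 0.4167
midpoint = y.sum() / x.sum()          # (1 + 2) / (3 + 4) = 3/7 ~ 0.4286
ols = (x * y).sum() / (x ** 2).sum()  # 11 / 25 = 0.44

print(avg_of_slopes, midpoint, ols)
```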
Possible Estimators
Which estimator is best? Let's try an exercise.
OLS with an intercept term
With an intercept, the equation of the fitted line is $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. Minimizing $\sum_i e_i^2$ gives the standard estimators:
$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
Example: Height and Shoe Size
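The slide's actual data set is not reproduced in this transcript, so the sketch below applies the intercept formulas from the previous slide to made-up height and shoe-size numbers, purely to show the mechanics:

```python
import numpy as np

# Hypothetical data, invented for illustration only -- NOT the slide's actual sample.
height = np.array([62.0, 65.0, 67.0, 70.0, 72.0, 74.0])  # inches
shoe = np.array([7.0, 8.0, 9.0, 10.5, 11.0, 12.0])       # US shoe size

# OLS with an intercept, using the formulas from the previous slide.
xbar, ybar = height.mean(), shoe.mean()
b1 = ((height - xbar) * (shoe - ybar)).sum() / ((height - xbar) ** 2).sum()
b0 = ybar - b1 * xbar

print(f"predicted shoe size = {b0:.2f} + {b1:.2f} * height")
```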
Sum of Squares
How much of the variation in the dependent variable is explained by the estimated regression equation?
Total Sum of Squares (TSS): how spread out are the y values in the sample? $TSS = \sum_i (y_i - \bar{y})^2$
Explained Sum of Squares (ESS): the sample variation in $\hat{y}_i$: $ESS = \sum_i (\hat{y}_i - \bar{y})^2$
Sum of Squares
Residual Sum of Squares (RSS): the sample variation in $e_i$: $RSS = \sum_i e_i^2$
$TSS = ESS + RSS$: some of the variation in y can be explained by the regression, and some cannot. If the RSS is small relative to the TSS, the equation is a good fit.
R-squared
R-squared (or $R^2$) is the proportion of the variation in Y that is explained by the regression:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}, \qquad 0 \le R^2 \le 1$$
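A short NumPy sketch (made-up data, names mine) that verifies the $TSS = ESS + RSS$ decomposition and computes $R^2$ both ways:

```python
import numpy as np

# Hypothetical (x, y) sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit OLS with an intercept.
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
e = y - y_hat

tss = ((y - y.mean()) ** 2).sum()      # total sum of squares
ess = ((y_hat - y.mean()) ** 2).sum()  # explained sum of squares
rss = (e ** 2).sum()                   # residual sum of squares

print(np.isclose(tss, ess + rss))      # True: TSS = ESS + RSS
print("R-squared:", ess / tss)         # same as 1 - rss / tss
```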
Multiple Regression
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \dots + \epsilon_i$$
Each coefficient is a partial regression coefficient. $\beta_2$ is the change in Y associated with a one-unit increase in $X_2$, holding the other X's (i.e., $X_1$, $X_3$, $X_4$, etc.) constant.
Multiple Regression Example
Suppose we regress hourly wages on years of education, experience, and tenure, and the estimated coefficient on education is 0.599. This means that, on average, a one-year increase in education is associated with a $0.599 per hour increase in wages, holding experience and tenure constant.
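A simulation sketch of such a regression (NumPy least squares on fabricated data; the variable names, coefficients, and distributions are illustrative assumptions, not the slide's results, so the estimates will not match the 0.599 figure above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data for illustration only.
educ = rng.uniform(8, 20, n)    # years of education
exper = rng.uniform(0, 30, n)   # years of experience
tenure = rng.uniform(0, 15, n)  # years with current employer
wage = 1.0 + 0.6 * educ + 0.02 * exper + 0.15 * tenure + rng.normal(0, 2, n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), educ, exper, tenure])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)

# Each slope is a partial effect, holding the other regressors constant.
print(beta_hat)  # roughly [1.0, 0.6, 0.02, 0.15]
```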
Degrees of Freedom
How many more observations do you have than the number of coefficients you are trying to estimate? Can you estimate the slope and intercept given just one point? You always need at least as many observations as the number of coefficients you are estimating, but having more is better: extra observations are extra degrees of freedom.
Degrees of freedom $= n - k - 1$, where $n$ is the number of observations and $k$ is the number of slope coefficients (the extra 1 is the intercept).
R-squared vs. Adjusted R-squared
Whenever you add an extra variable, $R^2$ goes up (or at least does not fall). Why? The extra variable adds at least some explanatory power to the regression. However, adding another variable means one more coefficient to estimate, so degrees of freedom go down. So there is a benefit of adding an extra variable ($R^2$ goes up) and a cost (d.f. go down). Adjusted $R^2$ adjusts $R^2$ to account for the loss in degrees of freedom.
Adjusted R-squared
$$\bar{R}^2 = 1 - \frac{RSS/(n - k - 1)}{TSS/(n - 1)} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$
Note that it is possible to get a negative adjusted R-squared.
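A one-function sketch (the function name is mine) showing how a low $R^2$ combined with few degrees of freedom pushes the adjusted value below zero:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# A weak fit with several regressors can make adjusted R-squared negative:
print(adjusted_r2(r2=0.05, n=20, k=3))  # about -0.128
```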