Welcome to Econ 420 Applied Regression Analysis Study Guide Week Two Ending Sunday, September 9 (Note: You must go over these slides and complete every.

2 Welcome to Econ 420 Applied Regression Analysis Study Guide Week Two Ending Sunday, September 9 (Note: You must go over these slides and complete every task outlined here by the end of the day on September 8)

3 Last week I asked you to report your heights and weights before Sunday, September 2. –That meant by the end of the day on Saturday, September 1. –I did not hear from 4 of the students who are registered in this class. Remember that this affects your grade.

4 Here is our sample data on height and weight.

Observation      Height (H or X)    Weight (W or Y)
1. Jackie        64                 130
2. Philip D.     75                 210
3. Bryan         76                 230
4. Rita          67                 190
5. Shane         68                 175
6. Keith         75                 190
7. Kelsie        65                 145
8. Di            72                 185

5 Assignment 1 (Carries 30 points and is due before noon on Thursday, September 6) 1. Use the data set on the previous slide and the formulas on Page 8 (1-5 and 1-6) to estimate the coefficients $\hat{\beta}_0$ and $\hat{\beta}_1$ in the equation below: $W = \hat{\beta}_0 + \hat{\beta}_1 H$ –Make sure to show your work. –Do the estimated coefficients make sense to you? –What is the meaning of the estimated coefficients?
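One way to check a hand calculation is a short script. The sketch below assumes that formulas 1-5 and 1-6 are the usual least-squares slope and intercept formulas (the textbook page is not reproduced in these notes), and the helper name `ols_simple` is made up for illustration. You still need to show your own work.

```python
# Heights and weights copied from the sample data slide.
heights = [64, 75, 76, 67, 68, 75, 65, 72]
weights = [130, 210, 230, 190, 175, 190, 145, 185]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x.

    Assumes the standard formulas:
        slope     = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)
        intercept = y_bar - slope * x_bar
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0_hat, b1_hat = ols_simple(heights, weights)
print(f"W_hat = {b0_hat:.2f} + {b1_hat:.2f} * H")
```

If the printed slope and intercept differ from your hand-computed values, recheck the sums of deviations first, since that is where arithmetic slips usually occur.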

6 Assignment 1 continued 2. Answer Question 5 on Page 15. 3. Answer Question 8 on Page 15. Type your answers and send them to me as an email attachment. Remember that I have an old version of Word (2003). If you are using a newer version of Word, you will need to save your work in the old format.

7 Note: The following notes do not take the place of the discussions covered in your textbook. First read the book. Then look at the notes.

8 Total, Explained and Residual Sum of Squares (pp. 11-13) Remember our height/weight example. What is the average weight of the class? Duplicate the graph on Page 12, where Y is the weight and X is the height. –The fitted line will be upward sloping. –The average line (average weight) will be horizontal.

9 Suppose that instead of using the fitted line to predict someone's weight, we use the average line. $Y$ is the actual weight of a person. $\hat{Y}$ is the predicted weight according to the fitted line. $\bar{Y}$ is the average weight in the sample. $(Y - \bar{Y})$ is how much the weight of a given individual differs from the average. $(\hat{Y} - \bar{Y})$ is the part of that difference that our fitted line explains, that is, how far the fitted line's prediction is from the average. $(Y - \hat{Y})$ is our residual –The portion of the weight that was not predicted (explained) by our fitted line.

10 Remember that we have 8 observations in our sample. Some of our weights are below average and some are above average. Look at Equation 1-8, Page 12. –We square $(Y - \bar{Y})$, $(\hat{Y} - \bar{Y})$ and $(Y - \hat{Y})$ because we do not want the positive differences to cancel the negative differences. Note: the best fitted line is the one with the lowest $\sum (Y - \hat{Y})^2$.
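The three sums of squares can be computed directly from the class data. This is a minimal sketch, assuming Equation 1-8 is the usual decomposition TSS = ESS + RSS and that the fitted line comes from the standard least-squares formulas. The variable names are illustrative only.

```python
# Heights and weights copied from the sample data slide.
heights = [64, 75, 76, 67, 68, 75, 65, 72]
weights = [130, 210, 230, 190, 175, 190, 145, 185]

n = len(heights)
h_bar = sum(heights) / n
w_bar = sum(weights) / n  # the "average line"

# Least-squares fitted line (standard formulas, assumed to match the text).
slope = sum((h - h_bar) * (w - w_bar) for h, w in zip(heights, weights)) / sum(
    (h - h_bar) ** 2 for h in heights
)
intercept = w_bar - slope * h_bar
fitted = [intercept + slope * h for h in heights]

# Total, explained and residual sums of squares.
tss = sum((w - w_bar) ** 2 for w in weights)
ess = sum((f - w_bar) ** 2 for f in fitted)
rss = sum((w - f) ** 2 for w, f in zip(weights, fitted))

print(f"average weight = {w_bar}")
print(f"TSS = {tss:.2f}, ESS = {ess:.2f}, RSS = {rss:.2f}")
print(f"ESS + RSS = {ess + rss:.2f}")
```

Any other straight line through the data would give a larger RSS than the least-squares line, which is the sense in which that line is the "best" fit.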

11 Multiple Regression Model (Chapter 2, pp. 20-29) Is height the only factor affecting weight? –Of course not. –What are some other factors affecting an individual's weight? Age, calorie intake per day, ……

12 So a better model will be $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + e$ –Where $Y$ is weight and $X_1$ through $X_3$ are height, age, and calorie intake. We will use EViews to estimate the coefficients of a multiple regression model.

13 The meaning of the estimated coefficients. Our estimated equation will be $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3$ –Bonus: Can someone tell me why I didn't put an "e" at the end of the above equation? $\hat{\beta}_1$ measures the effect of one more inch of height on weight, holding age and calorie intake constant and ignoring the effect of all other variables on weight. Similarly, $\hat{\beta}_2$ measures the effect of one more year of age on weight, holding height and calorie intake constant and ignoring the effect of all other variables on weight.
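We will use EViews in class, but the idea of "holding the other variables constant" can also be seen in a small simulation. The sketch below uses made-up (simulated) data, not the class sample: the "true" coefficients, the distributions, and all variable names are assumptions chosen for illustration, and NumPy's `lstsq` stands in for whatever the regression software does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated (not real) explanatory variables.
height = rng.normal(68, 4, n)        # inches
age = rng.uniform(18, 60, n)         # years
calories = rng.normal(2200, 300, n)  # calories per day

# Hypothetical true coefficients: intercept, height, age, calories.
true_beta = np.array([-200.0, 5.0, 0.5, 0.02])
error = rng.normal(0, 5, n)

# Column of ones for the intercept, then X1, X2, X3.
X = np.column_stack([np.ones(n), height, age, calories])
weight = X @ true_beta + error

# Ordinary least squares estimates of all four coefficients.
beta_hat, *_ = np.linalg.lstsq(X, weight, rcond=None)

for name, b in zip(["intercept", "height", "age", "calories"], beta_hat):
    print(f"{name:>9}: {b:.3f}")
```

Each estimated slope should land close to the corresponding hypothetical true value, because the regression separates the effect of height from the effects of age and calorie intake.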

14 How big should the sample be? The bigger the sample, the closer $\hat{\beta}$ will be to $\beta$. Rule of thumb: Degrees of Freedom > 30. Degrees of Freedom = n - k - 1 –Where n is the sample size and k is the number of independent variables.
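As a quick arithmetic check of the rule of thumb, here is a small sketch. The function names are made up for illustration, and `min_sample_size` simply solves n - k - 1 > 30 for the smallest whole n.

```python
def degrees_of_freedom(n, k):
    """Degrees of freedom with n observations and k independent variables."""
    return n - k - 1


def min_sample_size(k, min_df=30):
    """Smallest n for which n - k - 1 is strictly greater than min_df."""
    return min_df + k + 2


# Our height/weight sample: 8 observations, 1 independent variable.
print(degrees_of_freedom(8, 1))

# Three independent variables (height, age, calorie intake).
print(min_sample_size(3), degrees_of_freedom(min_sample_size(3), 3))
```

By this rule our class sample of 8 is far too small, which is worth keeping in mind when you interpret the estimated coefficients.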

15 The Classical Assumptions: assumptions that have to be met in order for OLS to give us the best estimators.

16 Assumption 1 The regression equation –Is linear in the coefficients (it does not have to be linear in the variables) –Is correctly specified (right functional form, no omitted variables, no irrelevant variables) –Has an additive error term

17 Assumption 2 No two or more independent variables are perfectly correlated with each other. If violated → Perfect Multicollinearity. Example: Consumption = f (inflation, real interest rate, nominal interest rate, ….) Since real interest = nominal interest – inflation, the 3 independent variables are perfectly and linearly related to each other. When one independent variable changes, at least one of the others changes too. OLS cannot capture the effect of one variable in isolation.
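The interest rate example can be checked numerically. The sketch below uses simulated (made-up) rates and looks at the rank of the data matrix: with an intercept and three regressors we would need rank 4 for OLS to have a unique solution. The variable names and distributions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Simulated (not real) inflation and nominal interest rates, in percent.
inflation = rng.uniform(1, 6, n)
nominal_rate = rng.uniform(2, 10, n)
real_rate = nominal_rate - inflation  # exact linear relationship

# Intercept plus all three regressors: 4 columns.
X = np.column_stack([np.ones(n), inflation, real_rate, nominal_rate])
rank = np.linalg.matrix_rank(X)
print(f"columns = {X.shape[1]}, rank = {rank}")

# Dropping one of the three related variables removes the problem.
X_ok = X[:, :3]
rank_ok = np.linalg.matrix_rank(X_ok)
print(f"columns = {X_ok.shape[1]}, rank = {rank_ok}")
```

A rank below the number of columns means one column carries no independent information, so the software cannot assign it its own coefficient.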

18 Assumption 3 No correlation between the explanatory (independent) variables and the error term. What if it is violated? Example: Salary = f (Education,….,GPA) What if people with low GPAs lie about their GPAs? Then when GPA is low, the error is systematically positive, so the error term is correlated with GPA. Problem: OLS attributes the variation in salary to the variation in GPA, while it is in part caused by the variation in the error.

19 Assumption 4 The error terms are uncorrelated with each other. What if it is violated? Then we have an autocorrelation (serial correlation) problem. Example: Consumption = f (…., income) –Suppose we use time series data on the US economy to estimate the above model. Suppose that in 5 years of our study there was a war and consumption dropped significantly even though income didn't. So, we will get negative errors during those years, and they will all be correlated with each other.
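The war example can be mimicked with made-up numbers. In the sketch below, the errors are simulated: 30 "years" of random noise, with a large negative shock added in 5 consecutive years. The size of the shock, the years chosen, and the variable names are all assumptions for illustration. The lag-1 correlation compares each year's error with the previous year's.

```python
import numpy as np

rng = np.random.default_rng(2)
years = 30

# Simulated (not real) error terms: pure noise in normal years.
errors = rng.normal(0, 1, years)

# Hypothetical war in years 10-14: consumption falls well below
# what income alone would predict, so the errors are all negative.
errors[10:15] -= 10

# Correlation between each error and the error one year earlier.
lag1_corr = np.corrcoef(errors[:-1], errors[1:])[0, 1]
print(f"lag-1 correlation of errors = {lag1_corr:.2f}")
```

Without the shock, the lag-1 correlation would hover around zero. The run of negative errors is what pushes it well above zero.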

20 Assumption 5 The error term must have a zero mean. What if this assumption is violated? This is not a big deal: the intercept will pick up the mean of the error term.

21 Assumption 6 The error term has a constant variance. What if it is violated? Problem of Heteroskedasticity. Example: Consumption = f (…., income) –Suppose we use cross section data on various individuals to estimate the above model. People with low levels of income will probably spend most of their income. (The variance of the error is small.) People with high levels of income may spend anywhere between 10% and 99% of their income. (The variance of the error is high.) (Figure 2-1)
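A simulation makes the pattern visible. The sketch below uses made-up (simulated) incomes and a hypothetical consumption equation in which the spread of the error grows in proportion to income. It then compares the variance of the OLS residuals for the lower-income half of the sample with the higher-income half. All numbers and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Simulated (not real) annual incomes.
income = rng.uniform(20_000, 200_000, n)

# Error standard deviation is 5% of income, so it is NOT constant.
error = rng.normal(0, 0.05 * income)
consumption = 5_000 + 0.6 * income + error  # hypothetical true model

# Fit Consumption = b0 + b1 * income by ordinary least squares.
X = np.column_stack([np.ones(n), income])
beta_hat, *_ = np.linalg.lstsq(X, consumption, rcond=None)
residuals = consumption - X @ beta_hat

# Compare the residual variance in the two halves of the income range.
low = income < np.median(income)
var_low = residuals[low].var()
var_high = residuals[~low].var()
print(f"residual variance, lower incomes:  {var_low:,.0f}")
print(f"residual variance, higher incomes: {var_high:,.0f}")
```

If the error variance were constant, the two numbers would be roughly equal. A much larger variance at higher incomes is the fan shape that a plot of the residuals would show.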

22 Assumption 7 (Not Necessary) The error term is normally distributed. What is a normal distribution? Symmetric, continuous, bell shaped. Can be characterized by its mean and variance. We must know whether it is violated: if it is, some statistical tests are not applicable. As the size of the sample goes up → the distribution becomes more normal.

