EC220 - Introduction to econometrics (chapter 2)

Slides:



Advertisements
Similar presentations
t TEST OF A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Advertisements

EC220 - Introduction to econometrics (chapter 2)
EC220 - Introduction to econometrics (chapter 1)
1 Although they are biased in finite samples if Part (2) of Assumption C.7 is violated, OLS estimators are consistent if Part (1) is valid. We will demonstrate.
ADAPTIVE EXPECTATIONS 1 The dynamics in the partial adjustment model are attributable to inertia, the drag of the past. Another, completely opposite, source.
EXPECTED VALUE RULES 1. This sequence states the rules for manipulating expected values. First, the additive rule. The expected value of the sum of two.
ADAPTIVE EXPECTATIONS: FRIEDMAN'S PERMANENT INCOME HYPOTHESIS
EC220 - Introduction to econometrics (chapter 14)
EC220 - Introduction to econometrics (chapter 11)
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: the central limit theorem Original citation: Dougherty, C. (2012)
EC220 - Introduction to econometrics (chapter 2)
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: probability distribution example: x is the sum of two dice Original.
EC220 - Introduction to econometrics (chapter 14)
EC220 - Introduction to econometrics (review chapter)
EC220 - Introduction to econometrics (chapter 11)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 9) Slideshow: two-stage least squares Original citation: Dougherty, C. (2012) EC220.
EC220 - Introduction to econometrics (review chapter)
EC220 - Introduction to econometrics (chapter 8)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 12) Slideshow: consequences of autocorrelation Original citation: Dougherty, C. (2012)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 4) Slideshow: Ramsey’s reset test of functional misspecification Original citation:
Christopher Dougherty EC220 - Introduction to econometrics (chapter 11) Slideshow: model c assumptions Original citation: Dougherty, C. (2012) EC220 -
EC220 - Introduction to econometrics (chapter 8)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 8) Slideshow: model b: properties of the regression coefficients Original citation:
Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: one-sided t tests Original citation: Dougherty, C. (2012) EC220 - Introduction.
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: one-sided t tests Original citation: Dougherty, C. (2012) EC220.
EC220 - Introduction to econometrics (chapter 1)
THE ERROR CORRECTION MODEL 1 The error correction model is a variant of the partial adjustment model. As with the partial adjustment model, we assume a.
1 MAXIMUM LIKELIHOOD ESTIMATION OF REGRESSION COEFFICIENTS X Y XiXi 11  1  +  2 X i Y =  1  +  2 X We will now apply the maximum likelihood principle.
EC220 - Introduction to econometrics (chapter 6)
EC220 - Introduction to econometrics (chapter 3)
EC220 - Introduction to econometrics (chapter 4)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 7) Slideshow: White test for heteroscedasticity Original citation: Dougherty, C. (2012)
1 This very short sequence presents an important definition, that of the independence of two random variables. Two random variables X and Y are said to.
EC220 - Introduction to econometrics (review chapter)
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: asymptotic properties of estimators: the use of simulation Original.
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: expected value of a random variable Original citation: Dougherty,
Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: exercise 2.16 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: population variance of a discreet random variable Original citation:
EC220 - Introduction to econometrics (chapter 5)
CHOW TEST AND DUMMY VARIABLE GROUP TEST
EC220 - Introduction to econometrics (chapter 5)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 1) Slideshow: exercise 1.7 Original citation: Dougherty, C. (2012) EC220 - Introduction.
EC220 - Introduction to econometrics (chapter 10)
EC220 - Introduction to econometrics (chapter 4)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: slope dummy variables Original citation: Dougherty, C. (2012) EC220 -
Christopher Dougherty EC220 - Introduction to econometrics (chapter 4) Slideshow: interactive explanatory variables Original citation: Dougherty, C. (2012)
EC220 - Introduction to econometrics (chapter 7)
EC220 - Introduction to econometrics (chapter 2)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 6) Slideshow: variable misspecification iii: consequences for diagnostics Original.
EC220 - Introduction to econometrics (chapter 1)
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT This sequence describes the testing of a hypotheses relating to regression coefficients. It is.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 3) Slideshow: prediction Original citation: Dougherty, C. (2012) EC220 - Introduction.
SLOPE DUMMY VARIABLES 1 The scatter diagram shows the data for the 74 schools in Shanghai and the cost functions derived from a regression of COST on N.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 3) Slideshow: precision of the multiple regression coefficients Original citation:
Christopher Dougherty EC220 - Introduction to econometrics (chapter 4) Slideshow: semilogarithmic models Original citation: Dougherty, C. (2012) EC220.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: Chow test Original citation: Dougherty, C. (2012) EC220 - Introduction.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: dummy variable classification with two categories Original citation:
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: two sets of dummy variables Original citation: Dougherty, C. (2012) EC220.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: the effects of changing the reference category Original citation: Dougherty,
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: dummy classification with more than two categories Original citation:
DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES This sequence explains how to extend the dummy variable technique to handle a qualitative explanatory.
Confidence intervals were treated at length in the Review chapter and their application to regression analysis presents no problems. We will not repeat.
F TEST OF GOODNESS OF FIT FOR THE WHOLE EQUATION 1 This sequence describes two F tests of goodness of fit in a multiple regression model. The first relates.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 9) Slideshow: instrumental variable estimation: variation Original citation: Dougherty,
Christopher Dougherty EC220 - Introduction to econometrics (chapter 6) Slideshow: multiple restrictions and zero restrictions Original citation: Dougherty,
1 CHANGES IN THE UNITS OF MEASUREMENT Suppose that the units of measurement of Y or X are changed. How will this affect the regression results? Intuitively,
1 REPARAMETERIZATION OF A MODEL AND t TEST OF A LINEAR RESTRICTION Linear restrictions can also be tested using a t test. This involves the reparameterization.
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES 1 We now come to more general F tests of goodness of fit. This is a test of the joint explanatory power.
VARIABLE MISSPECIFICATION I: OMISSION OF A RELEVANT VARIABLE In this sequence and the next we will investigate the consequences of misspecifying the regression.
Introduction to Econometrics, 5th edition
Presentation transcript:

EC220 - Introduction to econometrics (chapter 2) Christopher Dougherty EC220 - Introduction to econometrics (chapter 2) Slideshow: f test of goodness of fit   Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 2). [Teaching Resource] © 2012 The Author This version available at: http://learningresources.lse.ac.uk/128/ Available in LSE Learning Resources Online: May 2012 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/   http://learningresources.lse.ac.uk/

F TEST OF GOODNESS OF FIT In an earlier sequence it was demonstrated that the sum of the squares of the actual values of Y (TSS: total sum of squares) could be decomposed into the sum of the squares of the fitted values (ESS: explained sum of squares) and the sum of the squares of the residuals. 1

F TEST OF GOODNESS OF FIT R2, the usual measure of goodness of fit, was then defined to be the ratio of the explained sum of squares to the total sum of squares. 2

F TEST OF GOODNESS OF FIT The null hypothesis that we are going to test is that the model has no explanatory power. 3

F TEST OF GOODNESS OF FIT Since X is the only explanatory variable at the moment, the null hypothesis is that Y is not determined by X. Mathematically, we have H0: b2 = 0 4

F TEST OF GOODNESS OF FIT Hypotheses concerning goodness of fit are tested via the F statistic, defined as shown. k is the number of parameters in the regression equation, which at present is just 2. 5

F TEST OF GOODNESS OF FIT n – k is, as with the t statistic, the number of degrees of freedom (number of observations less the number of parameters estimated). For simple regression analysis, it is n – 2. 6

F TEST OF GOODNESS OF FIT The F statistic may alternatively be written in terms of R2. First divide the numerator and denominator by TSS. 7

F TEST OF GOODNESS OF FIT We can now rewrite the F statistic as shown. The R2 in the numerator comes straight from the definition of R2. 8

F TEST OF GOODNESS OF FIT It is easily demonstrated that RSS/TSS is equal to 1 – R2. 9

F TEST OF GOODNESS OF FIT F is a monotonically increasing function of R2. As R2 increases, the numerator increases and the denominator decreases, so for both of these reasons F increases. 10

F TEST OF GOODNESS OF FIT R2 Here is F plotted as a function of R2 for the case where there is 1 explanatory variable and 20 observations. 11

F TEST OF GOODNESS OF FIT R2 If the null hypothesis is true, F will have a random distribution. 12

F TEST OF GOODNESS OF FIT R2 There will be some critical value which it will exceed only 5 percent of the time. If we are performing a 5 percent significance test, we will reject H0 if the F statistic is greater than this critical value. 13

F TEST OF GOODNESS OF FIT R2 In the case of an F test, the critical value depends on the number of explanatory variables as well as the number of degrees of freedom. 14

F TEST OF GOODNESS OF FIT R2 For the present example, the critical value for a 5 percent significance test is 4.41, when R2 is 0.20. 15

F TEST OF GOODNESS OF FIT R2 If we wished to be more cautious, we could perform a 1 percent test. For this the critical value of F is 8.29, where R2 is 0.32. 16

F TEST OF GOODNESS OF FIT R2 If R2 is higher than 0.32, F will be higher than 8.29, and we will reject the null hypothesis at the 1 percent level. 17

F TEST OF GOODNESS OF FIT R2 Why do we perform the test indirectly, through F, instead of directly through R2? After all, it would be easy to compute the critical values of R2 from those for F. 18

F TEST OF GOODNESS OF FIT R2 The reason is that an F test can be used for several tests of analysis of variance. Rather than have a specialized table for each test, it is more convenient to have just one. 19

F TEST OF GOODNESS OF FIT Note that, for simple regression analysis, the null and alternative hypotheses are mathematically exactly the same as for a two-tailed t test. Could the F test come to a different conclusion from the t test? 20

F TEST OF GOODNESS OF FIT The answer, of course, is no. We will demonstrate that, for simple regression analysis, the F statistic is the square of the t statistic. 21

F TEST OF GOODNESS OF FIT We start by replacing ESS and RSS by their mathematical expressions. 22

F TEST OF GOODNESS OF FIT The denominator is the expression for su2, the estimator of su2, for the simple regression model. We expand the numerator using the expression for the fitted relationship. 23

F TEST OF GOODNESS OF FIT The b1 terms in the numerator cancel. The rest of the numerator can be grouped as shown. 24

F TEST OF GOODNESS OF FIT We take the b22 term out of the summation as a factor. 25

F TEST OF GOODNESS OF FIT We move the term involving X to the denominator. 26

F TEST OF GOODNESS OF FIT The denominator is the square of the standard error of b2. 27

F TEST OF GOODNESS OF FIT Hence we obtain b22 divided by the square of the standard error of b2. This is the t statistic, squared. 28

F TEST OF GOODNESS OF FIT It can also be shown that the critical value of F, at any significance level, is equal to the square of the critical value of t. We will not attempt to prove this. 29

F TEST OF GOODNESS OF FIT Since the F test is equivalent to a two-tailed t test in the simple regression model, there is no point in performing both tests. In fact, if justified, a one-tailed t test would be better than either because it is more powerful (lower risk of Type II error if H0 is false). 30

F TEST OF GOODNESS OF FIT The F test will have its own role to play when we come to multiple regression analysis. 31

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 Here is the output for the regression of hourly earnings on years of schooling for the sample of 540 respondents from the National Longitudinal Survey of Youth. 32

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 We shall check that the F statistic has been calculated correctly. The explained sum of squares (described in Stata as the model sum of squares) is 19322. 33

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 The residual sum of squares is 92689. 34

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 The number of degrees of freedom is 540 – 2 = 538. 35

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 The denominator of the expression for F is therefore 172.28. Note that this is an estimate of su2. Its square root, denoted in Stata by Root MSE, is an estimate of the standard deviation of u. 36

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 Our calculation of F agrees with that in the Stata output. 37

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 We will also check the F statistic using theexpression for it in terms of R2. We see again that it agrees. 38

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 We will also check the relationship between the F statistic and the t statistic for the slope coefficient. 39

F TEST OF GOODNESS OF FIT . reg EARNINGS S Source | SS df MS Number of obs = 540 -------------+------------------------------ F( 1, 538) = 112.15 Model | 19321.5589 1 19321.5589 Prob > F = 0.0000 Residual | 92688.6722 538 172.283777 R-squared = 0.1725 -------------+------------------------------ Adj R-squared = 0.1710 Total | 112010.231 539 207.811189 Root MSE = 13.126 ------------------------------------------------------------------------------ EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765 _cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444 Obviously, this is correct as well. 40

Copyright Christopher Dougherty 2011. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 2.7 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/. Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse. 11.07.25