Econometrics Analysis

Econometrics Analysis. Chengyuan Yin, School of Mathematics, SHUFE

5. Regression Algebra and a Fit Measure

The Sum of Squared Residuals. b minimizes e'e = (y - Xb)'(y - Xb). Algebraic equivalences, at the solution b = (X'X)-1X'y: e'e = y'e (why? e' = y' - b'X'), and e'e = y'y - y'Xb = y'y - b'X'y = e'y, as e'X = 0. (This is the F.O.C. for least squares.)
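A minimal numerical sketch of these identities in Python (the simulated data, sample size, and coefficient values are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # X includes a constant
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # b = (X'X)-1 X'y
e = y - X @ b                           # least squares residuals

print(np.allclose(X.T @ e, 0))                    # F.O.C.: X'e = 0
print(np.allclose(e @ e, y @ e))                  # e'e = y'e
print(np.allclose(e @ e, y @ y - b @ (X.T @ y)))  # e'e = y'y - b'X'y
```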

Minimizing e'e. Any other coefficient vector has a larger sum of squares. A quick proof: let d be any coefficient vector other than b, and let u = y - Xd. Then u'u = (y - Xd)'(y - Xd) = [y - Xb - X(d - b)]'[y - Xb - X(d - b)] = [e - X(d - b)]'[e - X(d - b)]. Expand, using X'e = 0, to find u'u = e'e + (d - b)'X'X(d - b) > e'e.
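A quick numerical check of this expansion, using simulated data and an arbitrary alternative coefficient vector d (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

d = b + rng.normal(size=b.shape)       # any coefficient vector other than b
u = y - X @ d
lhs = u @ u
rhs = e @ e + (d - b) @ (X.T @ X) @ (d - b)
print(np.allclose(lhs, rhs))           # u'u = e'e + (d - b)'X'X(d - b)
print(lhs > e @ e)                     # so any d other than b gives a larger sum of squares
```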

Dropping a Variable. An important special case. Suppose [b, c] = the regression coefficients in a regression of y on [X, z], and d is the same, but computed to force the coefficient on z to be 0. This removes z from the regression. (We’ll discuss how this is done shortly.) So, we are comparing the results that we get with and without the variable z in the equation. Results which we can show: Dropping a variable(s) cannot improve the fit, that is, cannot reduce the sum of squares. Adding a variable(s) cannot degrade the fit, that is, cannot increase the sum of squares. The algebraic result is on text page 31. Where u = the residual in the regression of y on [X, z] and e = the residual in the regression of y on X alone, u'u = e'e - c2(z*'z*) ≤ e'e, where z* = MXz. This result forms the basis of the Neyman-Pearson class of tests of the regression model.
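A sketch verifying u'u = e'e - c2(z*'z*) on simulated data, where the long regression adds a single variable z to X (data and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
z = rng.normal(size=n)
y = X @ np.array([1.0, 0.5, -1.0]) + 0.8 * z + rng.normal(size=n)

W = np.column_stack([X, z])                   # long regression: y on [X, z]
bc = np.linalg.solve(W.T @ W, W.T @ y)
u = y - W @ bc
c = bc[-1]                                    # coefficient on z

b = np.linalg.solve(X.T @ X, X.T @ y)         # short regression: y on X alone
e = y - X @ b

z_star = z - X @ np.linalg.solve(X.T @ X, X.T @ z)           # z* = MX z
print(np.allclose(u @ u, e @ e - c**2 * (z_star @ z_star)))  # u'u = e'e - c2(z*'z*)
print(u @ u <= e @ e)                         # adding z cannot increase the sum of squares
```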

The Fit of the Regression. “Variation:” In the context of the “model” we speak of variation of a variable as movement of the variable, usually associated with (not necessarily caused by) movement of another variable. Total variation = the sum of squared deviations of y from its mean = y'M0y, where M0 = I - i(i'i)-1i' = the M matrix for X = a column of ones, i.

Decomposing the Variation of y. Decomposition: y = Xb + e, so M0y = M0Xb + M0e = M0Xb + e. (Deviations from means. Why is M0e = e?) Then y'M0y = b'(X'M0)(M0X)b + e'e = b'X'M0Xb + e'e. (M0 is idempotent and e'M0X = e'X = 0.) Note that the results above using M0 assume that one of the columns in X is i (the constant term). Total sum of squares = Regression Sum of Squares (SSR) + Residual Sum of Squares (SSE).
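A short check that y'M0y = b'X'M0Xb + e'e (SST = SSR + SSE) when X contains a constant column, again on simulated data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # first column is i
y = X @ np.array([2.0, 1.0, -0.7]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

M0y = y - y.mean()             # M0 y: deviations of y from its mean
M0Xb = X @ b - (X @ b).mean()  # M0 X b: deviations of the fitted values from their mean

SST = M0y @ M0y                # y'M0y
SSR = M0Xb @ M0Xb              # b'X'M0Xb
SSE = e @ e                    # e'e

print(np.allclose(e.mean(), 0))     # M0 e = e, because i'e = 0 when X contains a constant
print(np.allclose(SST, SSR + SSE))  # total = regression + residual sums of squares
```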

Decomposing the Variation. Recall the decomposition: Var[y] = Var[E[y|x]] + E[Var[y|x]] = variation of the conditional mean around the overall mean + variation around the conditional mean function.

A Fit Measure. R2 = b'X'M0Xb / y'M0y = 1 - e'e / y'M0y. (Very important result.) R2 is bounded by zero and one only if: (a) there is a constant term in X, and (b) the line is computed by linear least squares.
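The two expressions for R2 computed side by side (simulated data, continuing the style of the sketches above):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([2.0, 1.0, -0.7]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
M0y = y - y.mean()
M0Xb = X @ b - (X @ b).mean()

R2_regression = (M0Xb @ M0Xb) / (M0y @ M0y)   # b'X'M0Xb / y'M0y
R2_residual = 1 - (e @ e) / (M0y @ M0y)       # 1 - e'e / y'M0y
print(np.allclose(R2_regression, R2_residual))
print(0 <= R2_residual <= 1)                  # bounded because X contains a constant
```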

Adding Variables. R2 never falls when a variable z is added to the regression. A useful general result: the R2 from the regression of y on [X, z] = the R2 from the regression of y on X alone + (1 - that R2) times the squared partial correlation between y and z.

Adding Variables to a Model

A Useful Result. The squared partial correlation of an x in X with y is t2 / (t2 + degrees of freedom). We will define the 't-ratio' and 'degrees of freedom' later. Note how the t-ratio enters this expression.

Partial Correlation. The squared partial correlation is a difference in R2s, scaled by the unexplained share. In the example above, .1880 = (.9867 - .9836) / (1 - .9836) (with approximation error). Alternatively, F/(F + 32) = t2 / (t2 + d.f.) = .1880.
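A sketch confirming that the difference-in-R2s calculation and t2/(t2 + d.f.) give the same squared partial correlation for a single added variable z (simulated data; the standard error on z comes from the usual OLS variance estimate):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
z = rng.normal(size=n)
y = X @ np.array([1.0, 0.5, -1.0]) + 0.4 * z + rng.normal(size=n)

def r2(W, y):
    """Return R2, coefficients, and residuals for a regression of y on W."""
    b = np.linalg.solve(W.T @ W, W.T @ y)
    e = y - W @ b
    sst = (y - y.mean()) @ (y - y.mean())
    return 1 - (e @ e) / sst, b, e

R2_short, _, _ = r2(X, y)                 # y on X alone
W = np.column_stack([X, z])
R2_long, bc, u = r2(W, y)                 # y on [X, z]

# Squared partial correlation from the difference in R2s
r2_partial = (R2_long - R2_short) / (1 - R2_short)

# The same quantity from the t-ratio on z in the long regression
dof = n - W.shape[1]
s2 = (u @ u) / dof
t = bc[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])
print(np.allclose(r2_partial, t**2 / (t**2 + dof)))
```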

Comparing Fits of Regressions. Make sure the denominator in R2 is the same, i.e., the same left-hand-side variable. Example: linear vs. loglinear. The loglinear model will almost always appear to fit better because taking logs reduces variation.

(Linearly) Transformed Data. How does a linear transformation of the data affect the results of least squares? Suppose Z = XP, where P is a nonsingular matrix, so the columns of Z are linear combinations of the columns of X. Based on X, b = (X'X)-1X'y. You can show (just multiply it out) that the coefficients when y is regressed on Z are c = P-1b. The “fitted value” is Zc = XPP-1b = Xb. The same!! The residuals from using Z are y - Zc = y - Xb (we just proved this). The same!! The sum of squared residuals must be identical, as y - Xb = e = y - Zc. R2 must also be identical, as R2 = 1 - e'e/y'M0y (!!).
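A numerical check of these invariance results for Z = XP with an arbitrary nonsingular P (simulated data; a randomly drawn P is nonsingular with probability one):

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

P = rng.normal(size=(K, K))         # an arbitrary nonsingular K x K transformation
Z = X @ P                           # transformed regressors

b = np.linalg.solve(X.T @ X, X.T @ y)
c = np.linalg.solve(Z.T @ Z, Z.T @ y)

print(np.allclose(c, np.linalg.solve(P, b)))  # c = P-1 b
print(np.allclose(Z @ c, X @ b))              # identical fitted values
print(np.allclose(y - Z @ c, y - X @ b))      # identical residuals, hence identical e'e and R2
```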

Linear Transformation. Xb is the projection of y into the column space of X. Zc is the projection of y into the column space of Z. But, since the columns of Z are just linear combinations of those of X, the column space of Z must be identical to that of X. Therefore, the projection of y into the former must be the same as the projection into the latter, which produces the other results. What are the practical implications of this result? Transformation does not affect the fit of a model to a body of data. Transformation does affect the “estimates.” If b is an estimate of something (β), then c cannot be an estimate of β; it must be an estimate of P-1β, which might have no meaning at all.

Adjusted R Squared. Adjusted R2 (adjusted for degrees of freedom) = 1 - [(n-1)/(n-K)](1 - R2). The “degrees of freedom” adjustment assumes something about “unbiasedness.” The ratio is not unbiased. Adjusted R2 includes a penalty for variables that don’t add much fit, and can fall when a variable is added to the equation.

Adjusted R2. What is being adjusted? The penalty for using up degrees of freedom. Adjusted R2 = 1 - [e'e/(n - K)]/[y'M0y/(n - 1)] uses the ratio of two ‘unbiased’ estimators. Is the ratio unbiased? Equivalently, adjusted R2 = 1 - [(n-1)/(n-K)](1 - R2). Will it rise when a variable is added to the regression? Adjusted R2 is higher with z than without z if and only if the t-ratio on z is larger than one in absolute value. (Proof? Any takers?)
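A sketch computing adjusted R2 both ways and checking the |t| > 1 condition for a single added variable z (simulated data, illustrative names):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 80
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
z = rng.normal(size=n)                         # a candidate additional regressor
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

def fit(W, y):
    """OLS coefficients, residuals, R2, and adjusted R2 for a regression of y on W."""
    n, K = W.shape
    b = np.linalg.solve(W.T @ W, W.T @ y)
    e = y - W @ b
    sst = (y - y.mean()) @ (y - y.mean())
    R2 = 1 - (e @ e) / sst
    adjR2 = 1 - (e @ e / (n - K)) / (sst / (n - 1))
    return b, e, R2, adjR2

_, _, R2, adjR2 = fit(X, y)
print(np.allclose(adjR2, 1 - (n - 1) / (n - X.shape[1]) * (1 - R2)))  # the two formulas agree

W = np.column_stack([X, z])
bc, u, _, adjR2_z = fit(W, y)
s2 = (u @ u) / (n - W.shape[1])
t = bc[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])
print((adjR2_z > adjR2) == (abs(t) > 1))       # adjusted R2 rises with z iff |t| on z exceeds 1
```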

Fit Measures. Other fit measures that incorporate degrees of freedom penalties. Information criteria: Amemiya: [e'e/(n - K)] × (1 + K/n). Akaike: log(e'e/n) + 2K/n (based on the likelihood, so no degrees of freedom correction).
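A small helper computing the two measures as defined above on simulated data (the function name and data are illustrative):

```python
import numpy as np

def fit_measures(X, y):
    """Amemiya prediction criterion and Akaike information criterion for an OLS fit."""
    n, K = X.shape
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    amemiya = (e @ e / (n - K)) * (1 + K / n)   # [e'e/(n - K)] x (1 + K/n)
    akaike = np.log(e @ e / n) + 2 * K / n      # log(e'e/n) + 2K/n
    return amemiya, akaike

rng = np.random.default_rng(8)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.5, 0.0, -0.3]) + rng.normal(size=n)
print(fit_measures(X, y))   # smaller values indicate a better penalized fit
```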

Linear Least Squares Subject to Restrictions. Restrictions: Theory imposes certain restrictions on parameters. Some common applications: Dropping variables from the equation = certain coefficients in b forced to equal 0. (Probably the most common testing situation. “Is a certain variable significant?”) Adding up conditions: sums of certain coefficients must equal fixed values. Adding up conditions in demand systems. Constant returns to scale in production functions. Equality restrictions: certain coefficients must equal other coefficients. Using real vs. nominal variables in equations. Common formulation: minimize the sum of squares, e'e, subject to the linear constraint Rb = q.

Restricted Least Squares

Restricted Least Squares

Restricted LS Solution

Aspects of Restricted LS. 1. b* = b - Cm, where m = the “discrepancy vector” Rb - q. Note what happens if m = 0. What does m = 0 mean? 2. λ = [R(X'X)-1R']-1(Rb - q) = [R(X'X)-1R']-1m. When does λ = 0? What does this mean? 3. Combining results: b* = b - (X'X)-1R'λ. How could b* = b?
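A numerical sketch of the restricted estimator built from these pieces; the data and the particular restriction Rb = q imposed here (the two slope coefficients sum to one) are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(9)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.6, 0.4]) + rng.normal(size=n)

# Illustrative restriction R b = q: the two slope coefficients sum to 1
R = np.array([[0.0, 1.0, 1.0]])
q = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                          # unrestricted least squares
m = R @ b - q                                  # discrepancy vector m = Rb - q
lam = np.linalg.solve(R @ XtX_inv @ R.T, m)    # lambda = [R(X'X)-1R']-1 m
b_star = b - XtX_inv @ R.T @ lam               # b* = b - (X'X)-1 R' lambda

print(np.allclose(R @ b_star, q))              # b* satisfies the restriction exactly
e = y - X @ b
e_star = y - X @ b_star
print(e @ e <= e_star @ e_star)                # imposing the restriction cannot reduce e'e
```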