1
Class 5: Multiple Regression
CERAM, February-March-April 2008
Lionel Nesta
Observatoire Français des Conjonctures Economiques
Lionel.nesta@ofce.sciences-po.fr
2
Introduction to Regression. Typically, the social scientist deals with multiple and complex webs of interactions between variables. An immediate and appealing extension of simple linear regression is therefore to enlarge the set of explanatory variables: multiple regression includes several explanatory variables in the empirical model.
4
The multiple regression model is written $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + u_i$. OLS chooses the parameters so as to minimize the sum of squared errors: $\min_{\hat{\beta}_0, \dots, \hat{\beta}_k} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \dots - \hat{\beta}_k x_{ki} \right)^2$
5
Multivariate Least Squares Estimator. Usually, the multivariate model is described in matrix notation: $y = X\beta + u$, with the following least squares solution: $\hat{\beta} = (X'X)^{-1} X'y$
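As a minimal sketch (not part of the original slides), the matrix solution can be computed directly with NumPy; the design matrix X is assumed to include a column of ones for the intercept, and the data below are simulated:

```python
import numpy as np

# Hypothetical data: n observations, k regressors plus an intercept column
rng = np.random.default_rng(0)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 0.5, -0.3])
y = X @ beta_true + rng.normal(size=n)

# Least squares solution beta_hat = (X'X)^(-1) X'y;
# solving the normal equations is numerically safer than explicit inversion
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to beta_true
```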
6
Assumption OLS 1: Linearity. The model is linear in its parameters. It is possible to apply nonlinear transformations to the variables (e.g. the log of x), but not to the parameters: OLS cannot estimate a model that is nonlinear in its parameters.
7
Assumption OLS 2: Random Sampling. The n observations are a random sample of the whole population. There is no selection bias in the sample, so the results pertain to the whole population, and all observations are independent from one another (no serial or cross-sectional correlation).
8
Assumption OLS 3: No Perfect Collinearity. No independent variable is constant: each variable has a variance, which, combined with the variance of the dependent variable, is used to compute the parameters. There are no exact linear relationships among the independent variables, i.e. no perfect collinearity between them.
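A quick way to detect perfect collinearity, shown here as a hedged NumPy sketch rather than anything from the slides, is to check whether the design matrix has full column rank:

```python
import numpy as np

def has_perfect_collinearity(X: np.ndarray) -> bool:
    """True if some column of X is an exact linear combination of the others."""
    return np.linalg.matrix_rank(X) < X.shape[1]

# Hypothetical example: the third column is exactly twice the second
X = np.column_stack([np.ones(5), np.arange(5.0), 2.0 * np.arange(5.0)])
print(has_perfect_collinearity(X))  # True: X'X is singular, OLS is not identified
```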
9
Assumption OLS 4: Zero Conditional Mean. Given any values of the independent variables (IVs), the error term must have an expected value of zero: $E(u \mid x_1, \dots, x_k) = 0$. In this case, all independent variables are exogenous; otherwise, at least one IV suffers from an endogeneity problem.
10
Sources of endogeneity:
Wrong specification of the model
Omitted variable correlated with one RHS variable
Measurement error in the RHS variables
Mutual causation between LHS and RHS
Simultaneity
11
Assumption OLS 5: Homoskedasticity. The variance of the error term u, conditional on the RHS variables, is the same for all values of the RHS: $\text{Var}(u \mid x_1, \dots, x_k) = \sigma^2$. Otherwise we speak of heteroskedasticity.
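One common diagnostic, not covered in the slides, is the Breusch-Pagan test; the sketch below builds hypothetical data whose error spread grows with x and tests the OLS residuals with statsmodels:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Hypothetical data where the noise scale rises with x (heteroskedastic)
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=200)
y = 1.0 + 0.5 * x + rng.normal(scale=x)
X = sm.add_constant(x)

res = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(f"Breusch-Pagan LM p-value: {lm_pvalue:.4f}")  # small p-value: reject homoskedasticity
```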
12
Assumption OLS 6: Normality of the error term. The error term is independent of all RHS variables and follows a normal distribution with zero mean and variance $\sigma^2$: $u \sim N(0, \sigma^2)$.
13
Assumptions OLS
OLS1 Linearity
OLS2 Random Sampling
OLS3 No Perfect Collinearity
OLS4 Zero Conditional Mean
OLS5 Homoskedasticity
OLS6 Normality of error term
14
Theorem 1 (OLS1–OLS4): Unbiasedness of OLS. The expected value of the estimated parameters equals the true unknown values: $E(\hat{\beta}_j) = \beta_j$ for all $j$.
15
Theorem 2 (OLS1–OLS5): Variance of the OLS estimate. The variance of the OLS estimator is $\text{Var}(\hat{\beta}_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$, where $SST_j$ is the total sample variation in $x_j$ and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables. But how can we measure $\sigma^2$?
16
Theorem 3 (OLS1–OLS5): The standard error of the regression is defined as $\hat{\sigma} = \sqrt{\dfrac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1}}$. This is also called the standard error of the estimate or the root mean squared error (RMSE).
17
Standard Error of Each Parameter. Combining theorems 2 and 3 yields: $se(\hat{\beta}_j) = \dfrac{\hat{\sigma}}{\sqrt{SST_j (1 - R_j^2)}}$
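Continuing the hypothetical NumPy sketch from above, the same formulas can be written in matrix form: the diagonal of $\hat{\sigma}^2 (X'X)^{-1}$ reproduces $\hat{\sigma}^2 / (SST_j (1 - R_j^2))$ variable by variable:

```python
import numpy as np

def ols_with_se(X: np.ndarray, y: np.ndarray):
    """OLS estimates and standard errors from sigma2_hat * (X'X)^(-1)."""
    n, p = X.shape                                # p = k + 1 (intercept included)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)          # squared RMSE, df = n - k - 1
    cov = sigma2_hat * np.linalg.inv(X.T @ X)     # variance-covariance matrix
    se = np.sqrt(np.diag(cov))                    # se(beta_hat_j) for each j
    return beta_hat, se

# beta_hat, se = ols_with_se(X, y)  # X, y as in the earlier sketch
```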
18
Theorem 4: Under assumptions OLS1–OLS5, the OLS estimators are the best linear unbiased estimators (BLUE) of the $\beta_j$. Assumptions OLS1–OLS5 are known as the Gauss-Markov assumptions, and the Gauss-Markov theorem stipulates that under OLS1–OLS5, OLS is the best linear unbiased estimation method: the estimates are unbiased (OLS1–OLS4) and have the smallest variance (OLS5).
19
Theorem 5: Under assumptions OLS1–OLS6, the standardized OLS estimates follow a t distribution: $\dfrac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-k-1}$
20
Extension of Theorem 5: Inference. We can define the 95% confidence interval of $\beta_j$: $\left[ \hat{\beta}_j - t_{0.025} \cdot se(\hat{\beta}_j),\; \hat{\beta}_j + t_{0.025} \cdot se(\hat{\beta}_j) \right]$. If the 95% CI does not include 0, then $\beta_j$ is significantly different from 0.
21
Student t Test for $H_0: \beta_j = 0$. We are also in a position to make inferences about $\beta_j$:
$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$
The test statistic is $t = \hat{\beta}_j / se(\hat{\beta}_j)$. Decision rule:
Accept $H_0$ if $|t| < t_{\alpha/2}$
Reject $H_0$ if $|t| \geq t_{\alpha/2}$
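A sketch of the decision rule and the confidence interval in Python; scipy supplies the critical values of the t distribution, and beta_hat and se are assumed to come from an earlier estimation step such as the ols_with_se helper above:

```python
import numpy as np
from scipy import stats

def t_inference(beta_hat, se, n, p, alpha=0.05):
    """t statistics, decisions for H0: beta_j = 0, and (1 - alpha) confidence intervals."""
    t_stat = beta_hat / se                          # t = beta_hat_j / se(beta_hat_j)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)   # two-sided critical value t_{alpha/2}
    reject = np.abs(t_stat) >= t_crit               # reject H0 if |t| >= t_{alpha/2}
    ci = (beta_hat - t_crit * se, beta_hat + t_crit * se)
    return t_stat, reject, ci
```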
22
Summary
OLS1 Linearity
OLS2 Random Sampling
OLS3 No Perfect Collinearity
OLS4 Zero Conditional Mean
OLS5 Homoskedasticity
OLS6 Normality of error term
T1 Unbiasedness
T2–T4 BLUE
T5 $\hat{\beta} \sim t$
23
The knowledge production function
Application 1: Seminal model
24
The knowledge production function
Application 2: Changing specification
25
The knowledge production function
Application 3: Adding variables
26
The knowledge production function
Application 4: Dummy variables
28
[Figure: patents (lnpatent) plotted against firm size (lnasset)]
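The slide's regression output is not reproduced here; purely as an illustration, a dummy variable can be added to a log-log knowledge production function with statsmodels. The data and the pharma dummy below are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical firm-level data: patents, assets, and a 0/1 sector dummy
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "lnpatent": rng.normal(size=200),
    "lnasset": rng.normal(size=200),
    "pharma": rng.integers(0, 2, size=200),  # hypothetical dummy variable
})

# The dummy shifts the intercept of the knowledge production function
res = smf.ols("lnpatent ~ lnasset + pharma", data=df).fit()
print(res.params)
```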
29
The knowledge production function
Application 5: Interacting variables
31
Application 5: Interacting variables
[Figure: patents (lnpatent) plotted against firm size (lnasset)]
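In the same hypothetical setting as the dummy-variable sketch, an interaction term lets the effect of size differ across groups; the formula syntax below is statsmodels':

```python
import statsmodels.formula.api as smf

# Reusing df from the dummy-variable sketch above:
# "lnasset * pharma" expands to lnasset + pharma + lnasset:pharma,
# so both the intercept and the slope of size may differ across groups
res = smf.ols("lnpatent ~ lnasset * pharma", data=df).fit()
print(res.params)
```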