3. Multiple Regression Analysis: Estimation
- Although bivariate linear regressions are sometimes useful, they are often unrealistic
- SLR.4, which requires that all unobserved factors affecting y be unrelated to x, is often violated
- MULTIPLE REGRESSION ANALYSIS allows us to explicitly control for other factors to obtain a ceteris paribus interpretation
- this allows us to infer causality better than a bivariate regression
3. Multiple Regression Analysis: Estimation
- multiple regression analysis includes more variables, therefore explaining more of the variation in y
- multiple regression analysis can also "incorporate fairly general functional form relationships"
- it's more flexible
3. Multiple Regression Analysis: Estimation
3.1 Motivation for Multiple Regression
3.2 Mechanics and Interpretation of Ordinary Least Squares
3.3 The Expected Value of the OLS Estimators
3.4 The Variance of the OLS Estimators
3.5 Efficiency of OLS: The Gauss-Markov Theorem
3.1 Motivation for Multiple Regression
Take the bivariate regression:

$$quality = \beta_0 + \beta_1 Plot + u$$

- where u captures other factors affecting movie quality, such as the characters
- for this regression to be valid, we have to assume that characters are uncorrelated with the plot – a poor assumption
- since u is correlated with Plot, the estimate of $\beta_1$ is biased and we can't isolate the ceteris paribus effect of Plot on movie quality
3.1 Motivation for Multiple Regression
Take the multiple variable regression:

$$quality = \beta_0 + \beta_1 Plot + \beta_2 Character + u$$

- we still need to be concerned about u's relation to Character and Plot, BUT…
- by including Character in the regression, we ensure we can examine Plot's effect with Character held constant ($\beta_1$)
- we can also analyze Character's effect on movie quality with Plot held constant ($\beta_2$)
- a simulation illustrating the difference is sketched below
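- as a concrete (entirely hypothetical) illustration, the sketch below simulates movie data in which Character is correlated with Plot; all coefficient values are made up for this example. The bivariate regression of quality on Plot alone absorbs part of Character's effect, while the multiple regression recovers the assumed true coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Made-up "true" model: quality = 2.0*plot + 1.0*character + u,
# with character correlated with plot
plot = rng.normal(size=n)
character = 0.8 * plot + rng.normal(size=n)
quality = 2.0 * plot + 1.0 * character + rng.normal(size=n)

# Bivariate regression: character is left in the error term,
# so the slope on plot is biased upward (roughly 2.8, not 2.0)
X_biv = np.column_stack([np.ones(n), plot])
b_biv, *_ = np.linalg.lstsq(X_biv, quality, rcond=None)

# Multiple regression: controlling for character recovers
# approximately 2.0 and 1.0
X_mult = np.column_stack([np.ones(n), plot, character])
b_mult, *_ = np.linalg.lstsq(X_mult, quality, rcond=None)

print(b_biv[1])              # ~2.8 (biased)
print(b_mult[1], b_mult[2])  # ~2.0, ~1.0
```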
3.1 Motivation for Multiple Regression
- "Multiple regression analysis is also useful for generalizing functional relationships between variables":

$$exammark = \beta_0 + \beta_1 study + \beta_2 study^2 + u$$

- here study time can impact exam mark in a direct and/or quadratic fashion
- this quadratic form affects how the parameters are interpreted
- you cannot examine study's effect on exammark by holding $study^2$ constant
3.1 Motivation for Multiple Regression
- the change in exammark due to an extra hour of studying therefore becomes:

$$\frac{\Delta exammark}{\Delta study} \approx \beta_1 + 2\beta_2\, study$$

- the impact is no longer a constant ($\beta_1$)
- while including a variable and its square in multiple regression analysis allows it to have a more dynamic impact, it requires a more in-depth analysis of the estimated coefficients
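- a quick numeric sketch of this marginal effect, using assumed coefficient values (4.0 and -0.25 are invented purely for illustration):

```python
# Assumed coefficients for exammark = b0 + b1*study + b2*study^2
b1, b2 = 4.0, -0.25  # hypothetical values, for illustration only

def marginal_effect(study: float) -> float:
    """Approximate change in exammark from one more hour of study,
    evaluated at the current number of hours: b1 + 2*b2*study."""
    return b1 + 2 * b2 * study

for hours in (0, 4, 8):
    print(hours, marginal_effect(hours))
# 0 -> 4.0, 4 -> 2.0, 8 -> 0.0: with b2 < 0 the payoff to an extra
# hour shrinks as total study time rises, rather than staying constant
```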
3.1 Motivation for Multiple Regression
- A simple model with two independent variables ($x_1$ and $x_2$) can be written as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

- where $\beta_1$ measures $x_1$'s impact on y and $\beta_2$ measures $x_2$'s impact on y
- a key assumption on how u is related to $x_1$ and $x_2$ is:

$$E(u \mid x_1, x_2) = 0$$

- that is, the expected value of the unobserved factors affecting y is zero given any values of $x_1$ and $x_2$
- as in the bivariate case, $\beta_0$ can absorb any nonzero mean of u to make this hold true
3.1 Motivation for Multiple Regression
- in our movie example, this becomes:

$$E(u \mid Plot, Character) = 0$$

- in other words, other factors affecting movie quality (such as filming skill) are not related to plot or character
- in the quadratic case, this assumption is simplified, since $study^2$ is determined once study is known:

$$E(u \mid study) = 0$$
3.1 Model with k Independent Variables
- in a regression with k independent variables, the MULTIPLE LINEAR REGRESSION MODEL or MULTIPLE REGRESSION MODEL of the population is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

- $\beta_0$ is the intercept, $\beta_1$ relates to $x_1$, $\beta_2$ relates to $x_2$, and so on
- k variables and an intercept give k+1 unknown parameters
- parameters other than the intercept are sometimes called SLOPE PARAMETERS
3.1 Model with k Independent Variables
- in the multiple regression model:
- u is the error term or disturbance that captures all effects on y not included in the x's
  - some effects can't be measured
  - some effects aren't anticipated
- y is the DEPENDENT, EXPLAINED, or PREDICTED variable
- the x's are the INDEPENDENT, EXPLANATORY, or PREDICTOR variables
3.1 Model with k Independent Variables
- parameter interpretation is key in multiple regressions:

$$\log(mark) = \beta_0 + \beta_1 \log(ability) + \beta_2 study + \beta_3 study^2 + u$$

- here $\beta_1$ is the ceteris paribus elasticity of mark with respect to ability
- if $\beta_3 = 0$, then $100\beta_2$ is approximately the ceteris paribus percentage increase in mark when you study an extra hour
- if $\beta_3 \neq 0$, this is more complicated
- note that this equation is linear in the parameters even though mark and study have a non-linear relationship
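- as a worked check of the log-level interpretation (using an assumed $\beta_2 = 0.06$ with $\beta_3 = 0$; the 0.06 is invented for illustration):

```latex
\Delta \log(mark) = \beta_2 \, \Delta study
\;\Rightarrow\;
\%\Delta mark \approx 100\,\beta_2\,\Delta study
% With an assumed beta_2 = 0.06 and Delta study = 1:
% approximate change: 100(0.06)(1) = 6\%
% exact change:       100\,(e^{0.06} - 1) \approx 6.18\%
```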
3.1 Model with k Independent Variables
- the key assumption with k independent variables becomes:

$$E(u \mid x_1, x_2, \dots, x_k) = 0 \qquad (3.8)$$

- that is, ALL unobserved factors are uncorrelated with ALL explanatory variables
- anything that causes correlation between u and any explanatory variable causes (3.8) to fail
3.2 Mechanics and Interpretation of Ordinary Least Squares
- in a simple model with two independent variables, the OLS estimation is written as:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$

- where $\hat{\beta}_0$ estimates $\beta_0$, $\hat{\beta}_1$ estimates $\beta_1$, and $\hat{\beta}_2$ estimates $\beta_2$
- we obtain these estimates through the method of ORDINARY LEAST SQUARES, which minimizes the sum of squared residuals:

$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2})^2$$
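- a minimal sketch of this minimization on made-up data, solving the normal equations directly:

```python
import numpy as np

def ols_two_regressors(y, x1, x2):
    """Minimize the sum of squared residuals for
    y = b0 + b1*x1 + b2*x2 by solving the normal
    equations (X'X) b = X'y."""
    X = np.column_stack([np.ones(len(y)), x1, x2])
    return np.linalg.solve(X.T @ X, X.T @ y)

# Tiny made-up dataset, for illustration only
y  = np.array([3.0, 5.0, 7.0, 9.0, 11.5])
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
b0_hat, b1_hat, b2_hat = ols_two_regressors(y, x1, x2)
print(b0_hat, b1_hat, b2_hat)
```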
3.2 Indexing Note
- when independent variables have two subscripts, the i refers to the observation number
- likewise, the number (1 or 2, etc.) distinguishes between different variables
- for example, $x_{54}$ indicates the 5th observation's data for variable 4
- in this course, variables will be written generally as $x_{ij}$, where i refers to the observation number and j refers to the variable number
- this is not universal; other papers will use different conventions
3.2 K Independent Variables
- in a model with k independent variables, the OLS estimation is written as:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$$

- where $\hat{\beta}_0$ estimates $\beta_0$, $\hat{\beta}_1$ estimates $\beta_1$, $\hat{\beta}_2$ estimates $\beta_2$, etc.
- this is called the OLS REGRESSION LINE or SAMPLE REGRESSION FUNCTION (SRF)
- we still obtain the k+1 OLS estimates by minimizing the sum of squared residuals:

$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik})^2$$
3.2 K Independent Variables
- using multivariable calculus (partial derivatives), this leads to k+1 equations in k+1 unknowns:

$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$
$$\sum_{i=1}^{n} x_{i1}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$
$$\vdots$$
$$\sum_{i=1}^{n} x_{ik}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0 \qquad (3.13)$$

- these are also OLS's FIRST ORDER CONDITIONS (FOCs)
3.2 K Independent Variables
- these equations are sample counterparts of the population moments from a method of moments estimation (we've omitted dividing by n), using the following assumptions:

$$E(u) = 0, \qquad E(x_j u) = 0, \quad j = 1, 2, \dots, k$$

- (3.13) is tedious to solve by hand, so we use statistical and econometric software
- the one requirement is that (3.13) can be solved uniquely for the $\hat{\beta}_j$ (a mild assumption)
- $\hat{\beta}_0$ is called the OLS INTERCEPT ESTIMATE and $\hat{\beta}_1$ to $\hat{\beta}_k$ the OLS SLOPE ESTIMATES
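- a quick numerical sketch of the first order conditions on random made-up data: after fitting by OLS, the residuals sum to zero and are orthogonal to every regressor, which is exactly the system (3.13):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3

# Made-up data with k regressors plus an intercept column
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -2.0, 3.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# The k+1 first order conditions: each regressor (including the
# constant) is orthogonal to the residuals, up to rounding error
print(X.T @ residuals)  # k+1 values, all approximately 0
```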
3.2 Interpreting the OLS Equation
- given a model with 2 independent variables ($x_1$ and $x_2$):

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$

- $\hat{\beta}_0$ is the predicted value of y when $x_1 = 0$ and $x_2 = 0$
- this is sometimes an interesting situation and other times impossible
- the intercept is still essential to the estimation, even if it is theoretically meaningless
3.2 Interpreting the OLS Equation
- $\hat{\beta}_1$ and $\hat{\beta}_2$ have PARTIAL EFFECT or CETERIS PARIBUS interpretations:

$$\Delta \hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2$$

- therefore, given a change in $x_1$ and $x_2$, we can predict a change in y
- in addition, when the other x variable is held constant, we have:

$$\Delta \hat{y} = \hat{\beta}_1 \Delta x_1 \quad (x_2 \text{ held fixed})$$
3.2 Interpreting Example
- consider the theoretical model:

$$\widehat{intellect} = 80 + 5\,x_1 + 0.5\,x_2$$

- where a person's innate intelligence is a function of how many years a parent stayed home during their childhood ($x_1$) and the average number of hours they were held as a child ($x_2$)
- the intercept (80) estimates that a child with no stay-at-home parent who is never held will have an innate intelligence of 80
3.2 Interpreting Example
- consider the theoretical model:

$$\widehat{intellect} = 80 + 5\,x_1 + 0.5\,x_2$$

- $\hat{\beta}_1$ estimates that a parent staying home for an extra year increases child intellect by 5
- $\hat{\beta}_2$ estimates that a parent holding a child for an extra hour on average increases child intellect by 0.5
- if a parent stays home for an extra year, and as a result holds the child an extra hour on average, we would estimate intellect to rise by 5.5 ($1(\hat{\beta}_1) + 1(\hat{\beta}_2) = 5 + 0.5$)
3.2 Interpreting the OLS Equation
- a model with k independent variables is written similarly to the 2 independent variable case:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$$

- written in terms of changes:

$$\Delta \hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2 + \dots + \hat{\beta}_k \Delta x_k$$

- if we hold all other variables fixed (every $x_j$ with $j \neq f$), or CONTROL FOR all other variables:

$$\Delta \hat{y} = \hat{\beta}_f \Delta x_f$$
3.2 Holding Other Factors Fixed
- we've already seen that $\hat{\beta}_j$ measures the effect of increasing $x_j$ by one unit, holding all other x's constant
- in simple regression analysis, isolating this effect would require two identical observations that differ only in $x_j$
- multiple regression analysis estimates this effect without needing such a pair of observations (see the partialling-out sketch below)
- multiple regression analysis mimics a controlled experiment using nonexperimental data
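- one way to see how this works is the partialling-out result: the multiple regression coefficient on $x_1$ equals the slope from a simple regression of y on the part of $x_1$ left over after removing $x_2$'s influence; a minimal sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Made-up data in which x1 and x2 are correlated
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Full multiple regression of y on x1 and x2
X = np.column_stack([np.ones(n), x1, x2])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Partialling out: regress x1 on x2, keep the residuals (the part
# of x1 unrelated to x2), then run a simple regression of y on them
Z = np.column_stack([np.ones(n), x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r1 = x1 - Z @ g
b1_partial = (r1 @ y) / (r1 @ r1)

print(b_full[1], b1_partial)  # the two estimates coincide
```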