1
Chapter 7 Qualitative Variables and Non-Linearities in Multiple Linear Regression Analysis
2
Learning Objectives
Construct and use qualitative independent variables
Construct and use interaction effects
Control for non-linear relationships
Estimate marginal effects as percent changes and elasticities
Estimate a more fully-specified model
4
Construct and Use Qualitative Independent Variables
A qualitative explanatory variable (dummy variable) has two or more levels: yes or no, on or off, male or female, coded as 0 or 1.
Regression intercepts are different if the variable is statistically significant.
Assumes equal slopes for the other variables.
The number of dummy variables needed is (number of levels - 1).
5
Dummy-Variable Model Example (with 2 Levels)
Let: y = pie sales; x1 = price; x2 = holiday (x2 = 1 if a holiday occurred during the week, x2 = 0 if there was no holiday that week).
The model is y = β0 + β1x1 + β2x2 + ε.
6
Dummy-Variable Model Example (with 2 Levels) Continued
[Figure: sales (y) plotted against price (x1), with one line for holiday weeks and one for no-holiday weeks. The two lines have different intercepts and the same slope.]
If H0: β2 = 0 is rejected, then “Holiday” has a significant effect on pie sales.
7
Interpretation of the Dummy Variable Coefficient (with 2 Levels)
Example:
Sales: number of pies sold per week
Price: pie price in $
Holiday: 1 if a holiday occurred during the week, 0 if no holiday occurred
b2 = 15 (the estimated coefficient on Holiday): on average, sales were 15 pies greater in weeks with a holiday than in weeks without a holiday, given the same price.
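As a rough illustration of this interpretation, the sketch below predicts weekly sales with and without a holiday at the same price. Only the holiday coefficient of 15 comes from the slide; the intercept and price coefficient (300 and -30) are hypothetical values chosen for illustration.

```python
# Hypothetical fitted model: sales = b0 + b1*price + b2*holiday.
# Only b2 = 15 comes from the slide; b0 and b1 are made-up values.
B0, B1, B2 = 300.0, -30.0, 15.0


def predicted_sales(price, holiday):
    """Predicted pies sold per week; holiday is coded 1 (yes) or 0 (no)."""
    return B0 + B1 * price + B2 * holiday


# At any given price, the holiday / no-holiday gap equals b2.
gaps = [predicted_sales(p, 1) - predicted_sales(p, 0) for p in (4.0, 5.5, 7.0)]
print(gaps)
```

Because the dummy variable only shifts the intercept, the gap does not depend on the price at which it is evaluated.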
8
Dummy-Variable Models (more than 2 Levels)
The number of dummy variables is one less than the number of levels.
Example: y = house price; x1 = square feet.
The style of the house is also thought to matter: Style = ranch, split level, condo.
Three levels, so two dummy variables are needed.
9
Dummy-Variable Models (more than 2 Levels) Continued
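Let the default category be “condo”, with x2 = 1 if ranch (0 otherwise) and x3 = 1 if split level (0 otherwise):
y = β0 + β1x1 + β2x2 + β3x3 + ε
β2 shows the impact on price if the house is a ranch style, compared to a condo.
β3 shows the impact on price if the house is a split level style, compared to a condo.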
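The coding rule (number of levels - 1 dummy variables, with the default category left out) can be sketched as follows. The function name `dummy_code` and the five sample houses are hypothetical; only the three style levels and the choice of condo as the default come from the slides.

```python
def dummy_code(values, levels, base):
    """Return one 0/1 column per non-base level (number of levels - 1 columns)."""
    kept = [lvl for lvl in levels if lvl != base]
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in kept}


# Hypothetical sample of five houses.
styles = ["ranch", "condo", "split level", "condo", "ranch"]

dummies = dummy_code(styles, levels=["ranch", "split level", "condo"], base="condo")
x2 = dummies["ranch"]        # 1 if ranch, 0 otherwise
x3 = dummies["split level"]  # 1 if split level, 0 otherwise
print(x2, x3)
```

A condo is identified by x2 = x3 = 0, which is why no third column is needed.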
10
Interpreting the Dummy Variable Coefficients (with 3 Levels)
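Suppose the estimated equation is ŷ = b0 + b1x1 + b2x2 + b3x3, with price measured in thousands of dollars.
For a condo: x2 = x3 = 0, so ŷ = b0 + b1x1.
For a ranch: x2 = 1 and x3 = 0, so ŷ = (b0 + b2) + b1x1. With the same square feet, a ranch will have an estimated average price b2 thousand dollars more than a condo, and the intercept for a ranch is b0 + b2 = 43.96.
For a split level: x2 = 0 and x3 = 1, so ŷ = (b0 + b3) + b1x1. With the same square feet, a split level will have an estimated average price b3 thousand dollars more than a condo, and the intercept for a split level is b0 + b3 = 40.27.
All three lines have the same slope, b1.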
11
Excel Example
What type of relationship exists between energy use per capita and GDP per capita? In the initial regression, on average, if GDP per capita increases by US$1,000, energy consumption per capita increases by 0.07 tons. This is statistically significant at the 1% level.
12
Scatter Plots of this Relationship for Europe, North America, and South America
Are the intercepts the same for these three locations?
13
Excel Example
Are the intercepts different between Europe and North America, with South America as the omitted group? On average, if GDP per capita increases by US$1,000, energy consumption per capita increases by 0.07 tons. This is statistically significant at the 10% level. The coefficients on the Europe and N. America dummy variables are not statistically different from zero at the 10% level, so neither region differs significantly from S. America.
14
Construct and Use Interaction Effects
Interaction effects are the product of two different independent variables. We are first going to consider interaction effects between a quantitative variable and a dummy variable. This type of interaction effect changes the slope of the quantitative variable for the various levels of the dummy variable.
15
Interaction Regression Model Worksheet
Multiply x1 by x2 to get x1x2, then run the regression of y on x1, x2, and x1x2.
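The worksheet steps can be sketched in plain Python as below. The data are synthetic and noise-free (generated from y = 10 + 2·x1 + 5·x2 + 3·x1·x2), and the small `ols` helper is a hypothetical stand-in for whatever regression tool is actually used, such as Excel in these slides.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic, noise-free data: x1 is quantitative, x2 is a 0/1 dummy.
x1 = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
x2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [10 + 2 * a + 5 * d + 3 * a * d for a, d in zip(x1, x2)]

# Worksheet step: multiply x1 by x2 to get the interaction column.
x1x2 = [a * d for a, d in zip(x1, x2)]

# Regress y on a constant, x1, x2, and x1x2.
X = [[1, a, d, ad] for a, d, ad in zip(x1, x2, x1x2)]
b0, b1, b2, b3 = ols(X, y)

slope_when_x2_is_0 = b1       # slope of x1 for the omitted group
slope_when_x2_is_1 = b1 + b3  # slope of x1 when the dummy equals 1
print(b0, b1, b2, b3)
```

The coefficient on x2 shifts the intercept, while the coefficient on x1x2 shifts the slope of x1 for the group with x2 = 1.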
16
Consider the price of the house with three levels of the dummy variable
Let the default category be “condo”, with x1 = square feet, x2 = 1 if ranch (0 if not), and x3 = 1 if split level (0 if not):
y = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + ε
β2 shows the change in the intercept if the house is a ranch style, compared to a condo.
β3 shows the change in the intercept if the house is a split level style, compared to a condo.
β4 shows the change in the slope of square feet if the house is a ranch style, compared to a condo.
β5 shows the change in the slope of square feet if the house is a split level style, compared to a condo.
17
Interaction Term Worksheet
Suppose the estimated equation is
18
Visual Depiction of Interaction Terms with Dummy Variables
19
Excel Example
Are the slopes and intercepts different between Europe and North America, with South America as the omitted group? On average, if GDP per capita increases by US$1,000, energy consumption per capita increases by 0.07 tons. This is statistically significant at the 10% level. The coefficients on the Europe and N. America dummy variables are not statistically different from zero at the 10% level, so neither region differs significantly from S. America.
20
Control for Nonlinear Relationships
The relationship between the dependent variable and an independent variable may not be linear.
Useful when the scatter diagram indicates a non-linear relationship.
Example: Quadratic model y = β0 + β1x1 + β2x1² + ε
The second independent variable is the square of the first variable.
21
Polynomial Regression Model
General form: y = β0 + β1xj + β2xj² + … + βpxj^p + εi
where:
β0 = Population regression constant
βi = Population regression coefficient for variable xj: j = 1, 2, …, k
p = Order of the polynomial
εi = Model error
If p = 2 the model is a quadratic model: y = β0 + β1xj + β2xj² + εi
22
Linear vs. Nonlinear Fit
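[Figure: two scatter plots of y against x, one with a linear fit and one with a nonlinear fit, each with its residual plot below.]
Linear fit does not give random residuals.
Nonlinear fit gives random residuals.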
23
Quadratic Regression Model
Quadratic models may be considered when the scatter diagram takes on one of the following shapes:
[Figure: four panels of y against x1, labeled β1 < 0 and β2 > 0; β1 > 0 and β2 > 0; β1 < 0 and β2 < 0; β1 > 0 and β2 < 0.]
β1 = the coefficient of the linear term
β2 = the coefficient of the squared term
24
Marginal Effect for the Quadratic Regression Model
How does a one-unit increase in xj affect the dependent variable y (the marginal effect)?
This is just the partial derivative of y with respect to xj: ∂y/∂xj = β1 + 2β2xj
Notice that the effect that xj has on y changes depending on the value of xj. For a one-unit increase that ends at xj, it should be evaluated at xj - 1.
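A minimal sketch of this calculation, using hypothetical coefficients (5, 4, and -0.5), compares the derivative evaluated at xj - 1 with the actual change in the fitted value when x moves from xj - 1 to xj. The function names are illustrative.

```python
def quadratic(x, b0, b1, b2):
    """Fitted value of the quadratic model b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def marginal_effect(x, b1, b2):
    """Derivative of b0 + b1*x + b2*x^2 with respect to x."""
    return b1 + 2 * b2 * x


# Hypothetical coefficients, for illustration only.
b0, b1, b2 = 5.0, 4.0, -0.5

x = 3.0
# Convention from the slide: evaluate the derivative at x - 1.
derivative_at_x_minus_1 = marginal_effect(x - 1, b1, b2)
# Actual change in the fitted value when x goes from x - 1 to x.
actual_change = quadratic(x, b0, b1, b2) - quadratic(x - 1, b0, b1, b2)
# For a quadratic, the actual change equals the derivative at the midpoint.
derivative_at_midpoint = marginal_effect(x - 0.5, b1, b2)
print(derivative_at_x_minus_1, actual_change, derivative_at_midpoint)
```

The derivative is an approximation to the one-unit change. For a quadratic the two differ by exactly β2, so the gap is small whenever the squared-term coefficient is small.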
25
Illustration of the Marginal Effect that xj has on y
[Figure: curve of y against xj with tangent lines drawn at the points x1j and x2j.]
The marginal effect is the slope of a line tangent to the curve.
At x1j the marginal effect is positive.
At x2j the marginal effect is negative.
26
Empirical Example of the Quadratic Effect: Utility Bill vs. Temperature
27
Utility Bill vs. Temperature – Simple Linear Regression
Even though the scatter plot shows a clear relationship between utility bill and temperature, there is no linear relationship between these two variables.
28
Utility Bill vs. Temperature – Quadratic Regression
When a quadratic relationship is fit between utility bill and monthly temperature, both the linear and the quadratic terms are statistically significant at the 1% level.
29
Utility Bill vs. Temperature – Quadratic Regression Interpretation
The marginal effect is b1 + 2b2(Temperature), where b1 and b2 are the estimated coefficients on temperature and temperature squared.
The marginal effect at a temperature of 40 (evaluated at 39) is b1 + 2b2(39) = -5.06, which means that if temperature increases from 39 to 40 degrees then the utility bill decreases by $5.06.
The marginal effect at a temperature of 80 (evaluated at 79) is b1 + 2b2(79) = 2.14, which means that if temperature increases from 79 to 80 degrees then the utility bill increases by $2.14.
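Because the marginal effect is linear in temperature, the two reported values pin down the two coefficients. The sketch below back-calculates them from the rounded figures on the slide (-5.06 at 39 degrees and 2.14 at 79 degrees). The resulting values are implied approximations, not the regression output itself.

```python
# Reported on the slide: marginal effect of -5.06 at 39 degrees, 2.14 at 79.
x_a, me_a = 39.0, -5.06
x_b, me_b = 79.0, 2.14

# ME(x) = b1 + 2*b2*x is linear in x, so two points determine b1 and b2.
b2 = (me_b - me_a) / (2 * (x_b - x_a))
b1 = me_a - 2 * b2 * x_a


def marginal_effect(temp):
    """Implied change in the utility bill per one-degree rise in temperature."""
    return b1 + 2 * b2 * temp


print(round(b1, 2), round(b2, 2))
print(round(marginal_effect(39), 2), round(marginal_effect(79), 2))
```

With the implied coefficients in hand, the same function gives the marginal effect at any other temperature.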
30
Finding Where the Quadratic Function Reaches a Maximum (or Minimum)
Method: Set the first derivative of the regression equal to 0 and solve for xj:
β1 + 2β2xj = 0, or xj = -β1 / (2β2)
Using the utility bill example, the function reaches a minimum at a temperature of -b1 / (2b2) degrees.
The function will reach a minimum if β2 is positive and a maximum if β2 is negative.
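The method can be sketched as a small helper. The coefficients used here (-12.08 and 0.09) are assumptions: they are the values implied by the two marginal effects reported for the utility bill example (-5.06 at 39 degrees and 2.14 at 79 degrees), not figures taken from the regression output. The function name `turning_point` is illustrative.

```python
def turning_point(b1, b2):
    """Return (x*, kind) for y = b0 + b1*x + b2*x^2, where b1 + 2*b2*x* = 0."""
    if b2 == 0:
        raise ValueError("b2 = 0: the model is linear and has no turning point")
    kind = "minimum" if b2 > 0 else "maximum"
    return -b1 / (2 * b2), kind


# Assumed coefficients, back-calculated from the reported marginal effects.
temp_star, kind = turning_point(-12.08, 0.09)
print(round(temp_star, 1), kind)
```

A positive squared-term coefficient means the curve opens upward, so the turning point is the temperature at which the predicted bill is lowest.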
31
Testing for Significance: Quadratic Model
Test for Overall Relationship between y and xj (test if the two parameters are jointly equal to 0)
Use an F-test with the hypotheses:
H0: β1 = β2 = 0 (xj does not affect y)
H1: not H0 (xj affects y)
Testing the Quadratic Effect
Compare the quadratic model with the linear model. Use a t-test with the hypotheses:
H0: β2 = 0 (no 2nd order polynomial term)
HA: β2 ≠ 0 (2nd order polynomial term is needed)
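As a rough sketch, both tests can be expressed through the standard F statistic that compares a restricted model with an unrestricted one. The sums of squared residuals and the sample size below are hypothetical. Finding a p-value would additionally require an F distribution table or a statistics package.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, num_restrictions, df_resid):
    """F = [(SSR_r - SSR_u) / q] / [SSR_u / df_resid]."""
    numerator = (ssr_restricted - ssr_unrestricted) / num_restrictions
    denominator = ssr_unrestricted / df_resid
    return numerator / denominator


# Hypothetical sums of squared residuals from n = 30 observations.
n = 30
ssr_intercept_only = 100.0  # restricted model under H0: b1 = b2 = 0
ssr_linear = 55.0           # y regressed on x only (H0: b2 = 0 imposed)
ssr_quadratic = 40.0        # y regressed on x and x^2 (unrestricted)
df_resid = n - 3            # n minus the three estimated parameters

# Joint test of H0: b1 = b2 = 0 (two restrictions).
f_joint = f_statistic(ssr_intercept_only, ssr_quadratic, 2, df_resid)

# Test of H0: b2 = 0 (one restriction).
f_quad = f_statistic(ssr_linear, ssr_quadratic, 1, df_resid)
print(f_joint, f_quad)
```

With a single restriction and conventional OLS standard errors, the F statistic equals the square of the t statistic on the squared term, so the two approaches to testing the quadratic effect agree.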
32
Higher Order Models
If p = 3 the model is a cubic form: y = β0 + β1xj + β2xj² + β3xj³ + εi
[Figure: cubic curve of y against x.]
33
Interaction Effects
Hypothesizes interaction between pairs of x variables.
Response to one x variable varies at different levels of another x variable.
Contains two-way cross product terms: the model has basic terms (the x variables themselves) and interactive terms (their cross products).
34
Effect of Interaction
Given: y = β0 + β1x1 + β2x2 + β3x1x2 + ε
Without the interaction term, the effect of x1 on y is measured by β1.
With the interaction term, the effect of x1 on y is measured by β1 + β3x2.
The effect changes as x2 increases.
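A short sketch of how the effect of x1 varies with x2 is given below. The coefficient values (2.0 and 0.5) are hypothetical and chosen only to make the pattern visible.

```python
def effect_of_x1(b1, b3, x2):
    """Change in y per unit of x1 when the model includes the term b3*x1*x2."""
    return b1 + b3 * x2


# Hypothetical coefficients, for illustration only.
b1, b3 = 2.0, 0.5

effects = {x2: effect_of_x1(b1, b3, x2) for x2 in (0, 1, 2, 3)}
print(effects)
```

When b3 is zero, every entry collapses to b1, which is the no-interaction case.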
35
Evaluating Presence of Interaction
Hypothesize interaction between pairs of independent variables.
Hypotheses:
H0: β3 = 0 (no interaction between x1 and x2)
HA: β3 ≠ 0 (x1 interacts with x2)
36
Estimate Marginal Effects as Percent Changes and Elasticities
The models are estimated by taking natural logarithms of the dependent variable, the independent variable, or both.
Log-Linear Model: ln(y) = β0 + β1x1 + ε
Log-Log Model: ln(y) = β0 + β1ln(x1) + ε
37
Log – Linear Model
The population regression function is specified as ln(y) = β0 + β1x1 + ε and is interpreted as, “on average, if x1 increases by 1 unit then y increases by 100·β1 percent.” Note that this is only an approximation because the natural log is a nonlinear function.
38
Empirical Example of the Log – Linear Model
The dependent variable is the natural log of energy per capita. The slope coefficient on gdppc is interpreted as, “on average, if GDP per capita increases by $1,000 then energy consumption per capita goes up by (0.026)(100)% = 2.6%.” This coefficient is statistically significant at the 1% level.
39
Empirical Example of the Log – Linear Model with Dummy Variables
The dependent variable is the natural log of energy per capita, with South America as the omitted group. The Europe dummy variable coefficient is interpreted as “on average, energy consumption per capita is 50.5% higher in Europe than in South America.” The North America dummy variable coefficient is interpreted as “on average, energy consumption per capita is 56.6% higher in North America than in South America.” Europe is statistically insignificant, while North America is marginally significant (significant at the 10% level).
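Since the 100·β reading of a log-linear coefficient is only an approximation, it can help to compare it with the exact percent change, 100·(exp(β) - 1). The sketch below assumes that the percentages quoted on the slides correspond to coefficients of 0.026, 0.505, and 0.566. The two larger values are inferred from the quoted percentages rather than read from the regression output.

```python
import math


def approx_percent(b):
    """Approximate percent change in y: 100 * b."""
    return 100 * b


def exact_percent(b):
    """Exact percent change in y implied by a change of b in ln(y)."""
    return 100 * (math.exp(b) - 1)


# Coefficients assumed to lie behind the percentages quoted on the slides.
coefficients = [("gdppc", 0.026), ("Europe", 0.505), ("North America", 0.566)]

for name, b in coefficients:
    print(f"{name}: approx {approx_percent(b):.1f}%, exact {exact_percent(b):.1f}%")
```

The approximation is close for small coefficients such as 0.026, but it understates the exact change noticeably once the coefficient is around 0.5.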
40
Log – Log Model
The population regression function is specified as ln(y) = β0 + β1ln(x1) + ε and is interpreted as “on average, if x1 increases by 1 percent then y increases by β1 percent.” In the log-log model, β1 is an elasticity.
41
Empirical Example of the Log – Log Model
The dependent variable is the natural log of energy per capita. The slope coefficient on lngdppc is interpreted as, “on average, if GDP per capita increases by 1% then energy consumption per capita goes up by 0.69%.” This coefficient is statistically significant at the 1% level.
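The elasticity reading is exact only for very small changes. As a hedged illustration, the sketch below compares the elasticity-based approximation with the exact change implied by a log-log model for a larger, 10% increase in GDP per capita. The 0.69 elasticity is taken from the slide; the 10% scenario and the function name are illustrative.

```python
def exact_percent_change_in_y(elasticity, pct_change_x):
    """Exact % change in y in a log-log model when x changes by pct_change_x %."""
    return 100 * ((1 + pct_change_x / 100) ** elasticity - 1)


elasticity = 0.69  # slope on lngdppc reported on the slide

# Illustrative scenario: GDP per capita rises by 10%.
approx_10 = elasticity * 10
exact_10 = exact_percent_change_in_y(elasticity, 10)
print(f"approx {approx_10:.2f}%, exact {exact_10:.2f}%")
```

For a 1% change the two figures are nearly identical, which is why the elasticity interpretation is usually stated per 1 percent.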
42
Empirical Example of the Log – Log Model with Dummy Variables
The dependent variable is the natural log of energy per capita, with South America as the omitted group. The Europe dummy variable coefficient is interpreted as “on average, energy consumption per capita is 9.3% lower in Europe than in South America.” The North America dummy variable coefficient is interpreted as “on average, energy consumption per capita is 41.5% higher in North America than in South America.” Neither of these is statistically significant at the 10% level.