Chapter 8: Multiple Regression for Time Series


Chapter 8: Multiple Regression for Time Series
8.1 Graphical analysis and preliminary model development
8.2 The multiple regression model
8.3 Testing the overall model
8.4 Testing individual coefficients
8.5 Checking the assumptions
8.6 Forecasting with multiple regression
8.7 Principles

8.1: Graphical Analysis and Preliminary Model Development
Suppose we are interested in the level of gas prices as a function of various explanatory variables. Observe gas prices (= Yt) over n time periods, t = 1, 2, …, n.
Step 1: DDD (draw the data): make a time plot of Y against time.
Step 2: Produce a scatter plot of Y against each explanatory variable Xj.
For Step 2, identify possible variables:
- Personal Disposable Income
- Unemployment
- S&P 500 Index
- Price of crude oil
Q: What other variables would be of potential interest?
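A minimal sketch of Steps 1 and 2 in Python (pandas and matplotlib). The column names Unleaded, PDI, Unemp, SP500, and Crude are assumptions made for illustration; the actual headers in Gas_prices_1.xlsx may differ.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel("Gas_prices_1.xlsx")  # column names below are assumed

# Step 1: time plot of Y (the unleaded gas price) against time
df["Unleaded"].plot(title="Time Series Plot of Unleaded")
plt.xlabel("Month")
plt.ylabel("Price")
plt.show()

# Step 2: scatter plot of Y against each candidate explanatory variable
candidates = ["PDI", "Unemp", "SP500", "Crude"]  # assumed names
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for ax, x in zip(axes.ravel(), candidates):
    ax.scatter(df[x], df["Unleaded"], s=10)
    ax.set_xlabel(x)
    ax.set_ylabel("Unleaded")
plt.tight_layout()
plt.show()
```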

8.1: Graphical Analysis and Preliminary Model Development
[Figure: Time Series Plot of Unleaded] ©Cengage Learning 2013.

Figure 8.1: Matrix Plot for Unleaded
Data shown is from file Gas_prices_1.xlsx; adapted from Minitab output.

Example 8.2: Correlation Analysis for Unleaded
First row: Pearson correlation
Second row: P-value
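A sketch of how such a correlation display could be computed, assuming a DataFrame df holding the unleaded series and the candidate predictors; the helper corr_with_p is our own illustrative function, not from the text.

```python
import pandas as pd
from scipy import stats

def corr_with_p(df: pd.DataFrame, y: str) -> pd.DataFrame:
    """Pearson correlation of each column with y, plus a two-sided p-value."""
    rows = []
    for col in df.columns:
        if col == y:
            continue
        r, p = stats.pearsonr(df[col], df[y])
        rows.append({"variable": col, "pearson_r": round(r, 3), "p_value": round(p, 3)})
    return pd.DataFrame(rows)

# Usage (hypothetical column name): corr_with_p(df, "Unleaded")
```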

8.2: The Multiple Regression Model
Assume a linear relation between Y and X1, …, XK:

Y = β0 + β1X1 + β2X2 + … + βKXK + ε

where:
β0 = intercept (value of Y when all Xj = 0)
βj = expected effect of Xj on Y, all other factors fixed
ε = random error

Expected value of Y given the {Xj}:

E(Y | X1, …, XK) = β0 + β1X1 + … + βKXK

So Y = E(Y | X1, …, XK) + ε.

8.2.1: The Method of Ordinary Least Squares (OLS)
Define error = Observed – Fitted: et = Yt – Ŷt.
Estimate the intercept and slope coefficients by minimizing the sum of squared errors (SSE). That is, choose the coefficients b0, b1, …, bK to minimize:

SSE = Σt (Yt – b0 – b1X1t – … – bKXKt)²
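The minimization has a closed-form solution; here is a short sketch using numpy's least-squares routine, with simulated data and illustrative names only.

```python
import numpy as np

def ols_fit(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return b = (b0, b1, ..., bK) minimizing SSE = sum of squared errors."""
    Xa = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    b, *_ = np.linalg.lstsq(Xa, y, rcond=None)   # least-squares solution
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.2, size=100)
print(ols_fit(X, y))  # approximately (1.0, 0.5, -0.3)
```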

8.3: Testing the Overall Model
Is the overall model of value? ©Cengage Learning 2013.

8.3: Testing the Overall Model
The test is based on the ANOVA decomposition SST = SSR + SSE, with test statistic F = MSR/MSE = (SSR/K) / (SSE/(n – K – 1)).

8.3: Testing the Overall Model
The decision rule for all the tests is: reject H0 if p < α, where p is the observed significance level and α is the significance level used for testing, typically 0.05. The rule implies that we do not reject H0 if p > α.
Degrees of Freedom (DF): with n = sample size and K = # explanatory variables, DF = n – K – 1.
Q: Why do we “lose” degrees of freedom?

8.3: Testing the Overall Model
Overall F test: is the overall model of value?
H0: all slopes are zero vs. HA: at least one slope is non-zero.
Reject H0 if Fobserved > Ftables, or if p < α.
Q: What is the conclusion?

8.3.2: ANOVA in Simple Regression: Summary Measures
Mean Square Error: MSE = SSE/(n – K – 1)
Root Mean Square Error (RMSE): S = √MSE
Coefficient of Determination (R²): R² = SSR/SST = 1 – SSE/SST
Q: Interpret S and R²

8.3.2: ANOVA in Simple Regression: Summary Measures
Adjusted Coefficient of Determination: R²(adj) = 1 – (1 – R²)(n – 1)/(n – K – 1)
Relationship between F and R²: F = [R²/K] / [(1 – R²)/(n – K – 1)]
Q: Would a test using R² lead to different conclusions than the F test? Why or why not?
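A sketch verifying these summary measures and the F–R² relationship on simulated data; rsquared, rsquared_adj, fvalue, and mse_resid are genuine statsmodels attributes, while the data are made up purely for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, K = 155, 4
X = rng.normal(size=(n, K))
y = 0.5 + X @ np.array([0.4, -0.2, 0.1, 0.3]) + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(X)).fit()
R2 = res.rsquared
print(res.fvalue, (R2 / K) / ((1 - R2) / (n - K - 1)))  # the two F values agree
print(res.rsquared_adj)        # adjusted R^2
print(np.sqrt(res.mse_resid))  # S = root mean square error
```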

8.4: Testing Individual Coefficients
t-tests: Is an individual variable worth retaining in the model, given that the other variables are already in the model?
H0: the slope for Xi is zero, given the other X’s in the model; HA: the slope for Xi is not zero, given the other X’s in the model.
In multiple regression (i.e., K > 1), F provides an overall test, while the t-test gives information on individual coefficients; the two tests therefore provide different information.

8.4: Testing Individual Coefficients ©Cengage Learning 2013.

Figure 8.3: Single-Variable Tests for the Unleaded Gas Prices Model
The regression equation is
Unleaded = 1.0137 + 0.022001 L1_crude – 0.16398 L1_Unemp – 0.0004063 L1_SP500 + 0.00012541 L1_PDI

Predictor   Coef         SE Coef      T       P
Constant    1.0137       0.2163       4.69    0.000
L1_crude    0.022001     0.001068     20.60   0.000
L1_Unemp    –0.16397     0.03752      –4.37   0.000
L1_SP500    –0.0004063   0.000136     –2.99   0.003
L1_PDI      0.00012541   0.00002324   5.40    0.000

Data shown is from file Gas_prices_1.xlsx.
Q: Interpret the results
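A hedged sketch of how this lagged regression could be set up; the column names are assumptions about Gas_prices_1.xlsx, and pandas' shift(1) creates the one-month lags denoted L1_ above.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_excel("Gas_prices_1.xlsx")   # column names below are assumptions
for col in ["Crude", "Unemp", "SP500", "PDI"]:
    df[f"L1_{col}"] = df[col].shift(1)    # one-month lag of each predictor
df = df.dropna()                          # drop the first row, which has no lag

X = sm.add_constant(df[["L1_Crude", "L1_Unemp", "L1_SP500", "L1_PDI"]])
res = sm.OLS(df["Unleaded"], X).fit()
print(res.summary())   # Coef, SE Coef, T, P as in Figure 8.3
```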

8.4.1: Case Study: Baseball Salaries
Examine the results for the three- and five-variable models.
What conclusions may be drawn? How could the model be improved?

8.4.1: Correlation Analysis for Baseball Salaries (A)
Pearson correlations (lower triangle):

                  Salary ($000s)  Years in Majors  Career ERA  Innings Pitched  Career Wins
Years in Majors        0.53
Career ERA            –0.34            –0.22
Innings Pitched        0.27             0.11          0.09
Career Wins            0.51             0.89         –0.21          0.33
Career Losses          0.49             0.91         –0.14          0.30           0.97

Data shown is from file Baseball.xlsx; adapted from Minitab output.

8.4.1: Five-Variable Model (B)
Data shown is from file Baseball.xlsx; adapted from Minitab output.

8.4.1: Three-Variable Model (C)
Data shown is from file Baseball.xlsx; adapted from Minitab output.

8.4.2: Testing a Group of Coefficients
Model M1: the full model with K variables X1, …, XK (error sum of squares SSE1).
Model M0: the reduced model with only X1, …, Xq (error sum of squares SSE0).
Test H0: βq+1 = βq+2 = … = βK = 0, given that X1, …, Xq are in the model, against the alternative hypothesis HA: at least one of the coefficients βq+1, …, βK is nonzero when X1, …, Xq are in the model.
The test statistic is:

F = [(SSE0 – SSE1)/(K – q)] / [SSE1/(n – K – 1)]

8.4.2: Testing a Group of Coefficients
Example 8.4: Testing a Group of Coefficients
For the baseball data, we test the three-variable model against the five-variable model.
Q: What is the conclusion? Hint: The expected value of F under H0 is close to 1.0.
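A sketch of this group test using statsmodels' compare_f_test (a real method on fitted results); the baseball column names used in the formulas are assumptions for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_excel("Baseball.xlsx")   # column names below are assumptions

full = smf.ols("Salary ~ Years + ERA + Innings + Wins + Losses", data=df).fit()
reduced = smf.ols("Salary ~ Years + ERA + Innings", data=df).fit()

F, p, df_diff = full.compare_f_test(reduced)  # H0: the dropped slopes are zero
print(F, p, df_diff)                          # under H0, E[F] is close to 1.0
```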

8.5: Checking the Assumptions, I
Assumption R1: For given values of the explanatory variables, X, the expected value of Y is written as E(Y|X) and has the form:

E(Y|X) = β0 + β1X1 + … + βKXK

Potential Violation: We may have omitted a key variable.

8.5: Checking the Assumptions, II
Assumption R2: The difference between an observed Y and its expectation is known as a random error, denoted by ε. Thus, the full model may be written as:

Y = β0 + β1X1 + … + βKXK + ε

Potential Violations: These relate to the particular assumptions about the error terms.

8.5: Checking the Assumptions, III
Assumption R3: The expected value of each error term is zero. That is, there is no bias in the measurement process.
Potential Violation: Observations may contain bias. This assumption is not directly testable.

8.5: Checking the Assumptions, IV
Assumption R4: The errors for different observations are uncorrelated with one another. When examining observations over time, this assumption corresponds to a lack of autocorrelation among the errors.
Potential Violation: The errors may be (auto)correlated.

8.5: Checking the Assumptions, V
Assumption R5: The variance of the errors is constant. That is, the error terms come from distributions with equal variances. When the assumption is satisfied, the error process is homoscedastic.
Potential Violation: The variances are unequal; the error process is heteroscedastic.

8.5: Checking the Assumptions, VI
Assumption R6: The errors are drawn from a normal distribution.
Potential Violation: The error distribution is non-normal.
Assumptions R3–R6 are typically combined into the statement that the errors are independent and normally distributed with zero means and equal variances.
We now develop diagnostics to check whether these assumptions are reasonable. That is, do they appear to be consistent with the data?

8.5: Checking the Assumptions, VII

Figure 8.6 (A): Residual Plots
Data: Gas_prices_1.xlsx; adapted from Minitab output.

8.5.1: Analysis of Residuals for Gas Price Data
- Residuals appear to be approximately normal (probability plot and histogram), but there are some outliers. Check the original data to identify the outliers and to determine possible explanations.
- The model does not capture time dependence: note the zig-zag pattern in Residuals vs. Order.
- The errors are not homoscedastic: see Residuals vs. Fitted Value.
- There is increased volatility in the later part of the series: see Residuals vs. Order.
- There is some evidence of a seasonal pattern: look for peaks every 12 months in Residuals vs. Order.
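As a sketch, the four panels of Figure 8.6 can be reproduced for any fitted model; here a model is fit on simulated data purely for illustration, so the patterns above will not appear.

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
X = rng.normal(size=(155, 4))
y = 1 + X @ np.array([0.5, -0.2, 0.1, 0.3]) + rng.normal(size=155)
res = sm.OLS(y, sm.add_constant(X)).fit()
resid, fitted = res.resid, res.fittedvalues

fig, ax = plt.subplots(2, 2, figsize=(9, 7))
stats.probplot(resid, dist="norm", plot=ax[0, 0])  # normal probability plot
ax[0, 1].hist(resid, bins=20)                      # histogram of residuals
ax[0, 1].set_title("Histogram")
ax[1, 0].scatter(fitted, resid, s=10)              # residuals vs fitted values
ax[1, 0].set_title("Versus Fits")
ax[1, 1].plot(resid, marker=".")                   # residuals in time order
ax[1, 1].set_title("Versus Order")
plt.tight_layout()
plt.show()
```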

Appendix 8A: The Durbin-Watson Statistic
Checking for Autocorrelation
The classical approach is to use the Durbin-Watson statistic, which tests for first-order autocorrelation:

D = Σ(et – et–1)² / Σet²

where the numerator is summed over t = 2, …, n and the denominator over t = 1, …, n. The value of D will always be between 0 and 4, inclusive.
D = 0: perfect positive autocorrelation (et = et–1 for all points)
D = 2: no autocorrelation
D = 4: perfect negative autocorrelation (et = –et–1 for all points)

Appendix 8A: The Durbin-Watson Statistic
Checking for Autocorrelation
Whether the statistic D indicates significant autocorrelation depends on the sample size, n, and the number and structure of the predictors in the regression model, K. We use an approximate test that avoids the use of tables. Since D ≈ 2(1 – r1) and r1 has approximate standard error 1/√n, reject H0 if:

|D – 2| > 2zα/2/√n (about 3.92/√n at the 5 percent level)

Further, if we reject H0 and D < 2, this implies positive autocorrelation [the usual case in business applications]. If we reject H0 and D > 2, this implies negative autocorrelation.

Appendix 8A: The Durbin-Watson Statistic
Durbin-Watson Test for Gas Prices
For the model in Example 8.1 we obtain DW = 0.576.
The lower critical value is 2 – 3.92/√n = 2 – 3.92/√155 ≈ 1.69.
Clearly there is significant positive autocorrelation, implying a carry-over effect from one month to the next.
Q: What can we do about it?

8.5.1: Analysis of Residuals for Gas Price Data
Durbin-Watson Test or ACF?
The Durbin-Watson test (see Appendix 8A) examines only first-order autocorrelation, whereas the autocorrelation function (ACF) allows us to check for dependence at a range of possible lags. Define the autocorrelation at lag k as:

rk = Σ et·et–k / Σ et²

where the numerator is summed over t = k + 1, …, n and the denominator over t = 1, …, n. The ACF is the plot of rk against k, k = 1, 2, …. The approximate DW test matches the graphical test on r1 given by the ACF.
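A sketch relating D to r1 numerically; durbin_watson and plot_acf are real statsmodels functions, and the AR(1) "residuals" are simulated to show what positive autocorrelation looks like.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.stats.stattools import durbin_watson
from statsmodels.graphics.tsaplots import plot_acf

# Simulated AR(1) "residuals" (phi = 0.7), illustrating positive autocorrelation
rng = np.random.default_rng(3)
e = np.zeros(155)
for t in range(1, 155):
    e[t] = 0.7 * e[t - 1] + rng.normal()

D = durbin_watson(e)
r1 = np.sum(e[1:] * e[:-1]) / np.sum(e**2)      # lag-1 autocorrelation r1
print(D, 2 * (1 - r1))                          # D is approximately 2(1 - r1)
print(abs(D - 2) > 2 * 1.96 / np.sqrt(len(e)))  # approximate 5% test rejects H0

plot_acf(e, lags=24)  # dependence at a range of lags, not just lag 1
plt.show()
```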

Figure 8.7: ACF and PACF for the Residuals of the Four-Variable Unleaded Gas Prices Model [Minitab]
Data: Gas_prices_1.xlsx; adapted from Minitab output.

8.6: Forecasting with Multiple Regression
Given values of the inputs X1,n+1, …, XK,n+1, the point forecast is given by:

Ŷn+1 = b0 + b1X1,n+1 + … + bKXK,n+1

The prediction interval is given by:

Ŷn+1 ± tα/2(DF) × SE(forecast)

where t denotes the appropriate percentage point from t-tables.
Example 8.6: K = 4 and n = 155, so DF = 150. The SE for the point forecast is found to be 0.1695. Using t0.025(150) = 1.976, we find that the 95 percent prediction interval is Ŷn+1 ± 1.976 × 0.1695 = Ŷn+1 ± 0.335.
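A sketch of the point forecast and prediction interval computation; get_prediction and conf_int(obs=True) are genuine statsmodels APIs, while the data and the new observation are simulated for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n, K = 155, 4
X = rng.normal(size=(n, K))
y = 1 + X @ np.array([0.5, -0.2, 0.1, 0.3]) + 0.17 * rng.normal(size=n)
res = sm.OLS(y, sm.add_constant(X)).fit()

# New input values for period n+1 (simulated); force the intercept column on
x_new = sm.add_constant(rng.normal(size=(1, K)), has_constant="add")
pred = res.get_prediction(x_new)
print(pred.predicted_mean)                  # point forecast
print(pred.conf_int(obs=True, alpha=0.05))  # 95% prediction interval
```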

8.7: Principles
- Aim for a relatively simple model specification.
- Tailor the forecasting model to the horizon.
- Identify important causal variables on the basis of the underlying theory and earlier empirical studies.
- Identify suitable proxy variables when the variables of interest are not available in a timely fashion.
- If the aim of the analysis is to provide pure forecasts, know the explanatory variables in advance or be able to forecast them sufficiently well to justify their inclusion in the model.
- Use the method of ordinary least squares to estimate the parameters.
- Update the estimates frequently.

Chapter 8: Take-Aways
- Start the modeling process by careful consideration of available theory and previous empirical studies.
- Carry out a full preliminary analysis of the data to look for associations and for unusual observations.
- Test both the overall model and the individual components.
- Examine the validity of the underlying assumptions.
- Make sure that the model is “sensible” with respect to the signs and magnitudes of the slope coefficients.
- Use a hold-out sample to evaluate forecasting performance.