Prediction, Goodness-of-Fit, and Modeling Issues


Prediction, Goodness-of-Fit, and Modeling Issues
Chapter 4
Prepared by Vera Tabakova, East Carolina University

4.1 Least Squares Prediction

Figure 4.1 A point prediction

Principles of Econometrics, 3rd Edition

4.2 Measuring Goodness-of-Fit

□ Consider how well the regression line describes the data: R² measures how well the OLS regression line fits the data.

□ The simple regression model is

yᵢ = β₁ + β₂xᵢ + eᵢ (4.7)

□ Assuming E(eᵢ) = 0, we separate equation (4.7) into explainable and unexplainable components:

yᵢ = E(yᵢ) + eᵢ, where E(yᵢ) = β₁ + β₂xᵢ (4.8)

4.2 Measuring Goodness-of-Fit

□ Since both parts of (4.8) are unobservable, we use the fitted values and least squares residuals instead:

yᵢ = ŷᵢ + êᵢ, where ŷᵢ = b₁ + b₂xᵢ (4.9)

□ Subtracting the sample mean ȳ from both sides, we have

yᵢ − ȳ = (ŷᵢ − ȳ) + êᵢ (4.10)

4.2 Measuring Goodness-of-Fit

Figure 4.3 Explained and unexplained components of yᵢ

4.2 Measuring Goodness-of-Fit

□ Squaring and summing both sides of (4.10), we have

Σ(yᵢ − ȳ)² = Σ(ŷᵢ − ȳ)² + Σêᵢ² (4.11)

□ The cross-product term, 2Σ(ŷᵢ − ȳ)êᵢ, equals zero by the normal equations.

□ Equation (4.11) decomposes the total sample variation in y into explained and unexplained parts.

4.2 Measuring Goodness-of-Fit

Σ(yᵢ − ȳ)² = total sum of squares = SST: a measure of total variation in y about the sample mean.

Σ(ŷᵢ − ȳ)² = sum of squares due to the regression = SSR: the part of total variation in y about the sample mean that is explained by, or due to, the regression. Also known as the "explained sum of squares."

Σêᵢ² = sum of squares due to error = SSE: the part of total variation in y about its mean that is not explained by the regression. Also known as the unexplained sum of squares, the residual sum of squares, or the sum of squared errors.

SST = SSR + SSE

4.2 Measuring Goodness-of-Fit

The coefficient of determination is

R² = SSR/SST = 1 − SSE/SST (4.12)

The closer R² is to one, the closer the sample values yᵢ are to the fitted regression equation ŷᵢ = b₁ + b₂xᵢ. If R² = 1, all the sample data fall exactly on the fitted least squares line, so SSE = 0 and the model fits the data "perfectly." If the sample data for y and x are uncorrelated and show no linear association, then the fitted least squares line is horizontal, so SSR = 0 and R² = 0.
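The decomposition and R² can be checked numerically. Below is a minimal sketch in plain Python (the function name and data are illustrative, not from the text) that fits a simple least squares line and returns the sums of squares:

```python
def r_squared(x, y):
    """Fit y = b1 + b2*x by least squares; return (R^2, SST, SSR, SSE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least squares slope and intercept for the simple regression
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b1 = y_bar - b2 * x_bar
    y_hat = [b1 + b2 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained part
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained part
    return ssr / sst, sst, ssr, sse
```

For any sample, SST = SSR + SSE holds up to rounding because the cross-product term vanishes by the normal equations; when the data lie exactly on a line, SSE = 0 and R² = 1.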

4.2 Properties of the R²

R² measures the linear association, or goodness-of-fit, between the sample data and their predicted values; consequently it is sometimes called a measure of "goodness-of-fit."

Notes:
- The test of β₂ addresses only whether there is enough evidence to infer that a linear relationship exists.
- R² measures the strength of that linear relationship, which is particularly useful when we compare several different models.

4.2 Properties of the R²

If R² = 0.385, then 38.5% of the variation in y can be explained by the regression line.

Microeconomic (cross-section) data (e.g., household behavior) typically yield R² values from 0.1 to 0.4, whereas macroeconomic (time-series) data yield R² values from 0.4 to 0.9, or even higher.

Adjusted R²:

R̄² = 1 − (SSE/(N − K)) / (SST/(N − 1))

where K is the number of estimated parameters in the model and N is the sample size. ⇒ Details will be explained in Ch. 6.
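The adjusted R² is a one-line computation; this sketch (a hypothetical helper, assuming K counts the estimated parameters including the intercept) makes the degrees-of-freedom penalty explicit:

```python
def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2 = 1 - (SSE/(N-K)) / (SST/(N-1)).

    n is the sample size, k the number of estimated parameters.
    Unlike the plain R^2, this can fall when a useless regressor is added,
    because the SSE term is divided by fewer degrees of freedom.
    """
    return 1 - (sse / (n - k)) / (sst / (n - 1))
```

Because (N − 1)/(N − K) ≥ 1 whenever K ≥ 1, the adjusted R² never exceeds the plain R².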

4.3 Modeling Issues: Jarque-Bera Test for Normality

Hypothesis tests and interval estimates for the coefficients rely on the assumption that the errors, and hence the dependent variable y, are normally distributed.

Are they normally distributed? We can check the distribution of the residuals using:
- A histogram
- A formal statistical test: the Jarque–Bera test for normality

4.3 Modeling Issues

Figure 4.10 Residual histogram and summary statistics for the food expenditure example

4.3 Modeling Issues: Jarque-Bera Test for Normality

The Jarque–Bera test for normality is based on two measures, skewness and kurtosis.

Skewness refers to how symmetric the residuals are around zero. Perfectly symmetric residuals have a skewness of zero. The skewness value for the food expenditure residuals is −0.097, so the residuals are slightly negatively skewed (a longer left tail).

Kurtosis refers to the "peakedness" of the distribution. For a normal distribution, the kurtosis value is 3.

4.3 Modeling Issues: Jarque-Bera Test for Normality

The Jarque–Bera statistic is given by:

JB = (N/6) × (S² + (K − 3)²/4)

where N = sample size, S = skewness, and K = kurtosis.

4.3 Modeling Issues: Jarque-Bera Test for Normality

When the residuals are normally distributed, the Jarque–Bera statistic has a chi-squared distribution with two degrees of freedom.

We reject the hypothesis of normally distributed errors if the calculated value of the statistic exceeds a critical value selected from the chi-squared distribution with two degrees of freedom.

The 5% critical value from a χ² distribution with two degrees of freedom is 5.99, and the 1% critical value is 9.21.
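The critical values 5.99 and 9.21 can be recovered in closed form: for a chi-squared distribution with two degrees of freedom, the survival function is exp(−x/2), so the level-α critical value is −2 ln α. A quick sketch (the function name is illustrative):

```python
import math

def chi2_crit_2df(alpha):
    # P(X > x) = exp(-x/2) for chi-squared with 2 df, so solving
    # exp(-x/2) = alpha for x gives the critical value -2*ln(alpha).
    return -2 * math.log(alpha)
```

Here chi2_crit_2df(0.05) ≈ 5.99 and chi2_crit_2df(0.01) ≈ 9.21, matching the values quoted above.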

4.3 Modeling Issues: Jarque-Bera Test for Normality

For the food expenditure example (N = 40, skewness −0.097, kurtosis 2.99), the Jarque–Bera statistic is:

JB = (40/6) × ((−0.097)² + (2.99 − 3)²/4) = 0.063

Because 0.063 < 5.99, there is insufficient evidence from the residuals to conclude that the normal distribution assumption is unreasonable at the 5% level of significance.

4.3 Modeling Issues: Jarque-Bera Test for Normality

We could reach the same conclusion by examining the p-value, which appears in Figure 4.10 as the "Probability" of the J-B test. We thus fail to reject the null hypothesis of normality on the grounds that 0.9688 > 0.05.
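The whole test can be sketched end to end in plain Python (an illustrative function, not from the text, using moment-based skewness and kurtosis; the p-value uses the closed-form chi-squared(2) survival function exp(−x/2)):

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, p-value) for the Jarque-Bera normality test."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    s = m3 / m2 ** 1.5   # skewness (0 for symmetric residuals)
    k = m4 / m2 ** 2     # kurtosis (3 for a normal distribution)
    jb = (n / 6) * (s ** 2 + (k - 3) ** 2 / 4)
    # Chi-squared with 2 df has survival function exp(-x/2)
    p_value = math.exp(-jb / 2)
    return jb, p_value
```

With the food expenditure values reported above (N = 40, S = −0.097, and a kurtosis of roughly 2.99), the same formula reproduces JB ≈ 0.063 and a p-value far above 0.05.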