Chapter 5 Transformations and Weighting to Correct Model Inadequacies


Ray-Bing Chen, Institute of Statistics, National University of Kaohsiung

5.1 Introduction

Recall several implicit assumptions of the regression model:
- The model errors have mean zero and constant variance and are uncorrelated.
- The model errors have a normal distribution.
- The form of the model, including the specification of the regressors, is correct.

Plots of residuals are very powerful methods for detecting violations of these basic regression assumptions.

In this chapter, we focus on methods and procedures for building regression models when some of the above assumptions are violated. We place considerable emphasis on data transformation. The method of weighted least squares is also useful for building regression models in situations where some of the underlying assumptions are violated.

5.2 Variance-Stabilizing Transformations

The assumption of constant variance is a basic requirement of regression analysis. A common reason for the violation of this assumption is that the response variable y follows a probability distribution in which the variance is functionally related to the mean. For example, if y is a Poisson random variable, then Var(y) = E(y), and the square-root transformation y' = √y approximately stabilizes the variance.

Several commonly used variance-stabilizing transformations are given in Table 5.1.
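As an illustration (a minimal Python sketch, not from the text), simulated Poisson data show how the square-root transformation stabilizes a variance that grows with the mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a Poisson response whose variance equals its mean,
# so the raw variance grows with the regressor.
x = np.linspace(1, 10, 200)
y = rng.poisson(lam=3.0 * x)

# Square-root transformation: for Poisson data, Var(sqrt(y)) is
# approximately constant (about 1/4), stabilizing the variance.
y_sqrt = np.sqrt(y)

# Compare the spread of y and sqrt(y) at low vs. high x.
lo, hi = x < 3, x > 8
print("Var(y):      low x %.2f, high x %.2f" % (y[lo].var(), y[hi].var()))
print("Var(sqrt y): low x %.2f, high x %.2f" % (y_sqrt[lo].var(), y_sqrt[hi].var()))
```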

The strength of a transformation depends on the amount of curvature that it induces. Sometimes we can use prior experience or theoretical considerations to guide us in selecting an appropriate transformation; in other cases the appropriate transformation must be selected empirically. If we do not correct the nonconstant error variance problem, the least-squares estimators will still be unbiased, but they will no longer have the minimum-variance property. That is, the regression coefficients will have larger standard errors than necessary.

When the response variable has been reexpressed, the predicted values are in the transformed scale and often must be converted back to the original units; confidence or prediction intervals likewise apply to the transformed scale. Example 5.1 The Electric Utility Data: Develop a model relating peak-hour demand (y) to total energy usage during the month (x). The data (Table 5.2) cover 53 residential customers for the month of August 1979; Figure 5.1 shows the scatter plot of the data.

A simple linear regression model y = β₀ + β₁x + ε is assumed; the ANOVA is shown in Table 5.3. A plot of the R-student residuals versus the fitted values is shown in Figure 5.2. The residuals form an outward-opening funnel, indicating that the error variance increases as energy consumption increases.

This suggests the square-root transformation y* = √y. The R-student values from the least-squares fit of y* on x are plotted against the new fitted values in Figure 5.3; this plot shows no funnel pattern, so the transformation appears to have stabilized the variance.
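A minimal sketch of this refit in Python (the x and y values below are placeholders, not the Table 5.2 data):

```python
import numpy as np

def fit_simple(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# x: monthly energy usage, y: peak-hour demand (placeholders for Table 5.2)
x = np.array([500., 900., 1300., 1800., 2400., 3100.])
y = np.array([1.2, 2.0, 2.9, 4.1, 5.6, 7.4])

# Refit on the square-root scale to stabilize the variance.
beta, resid = fit_simple(x, np.sqrt(y))
print("fit on sqrt scale:", beta)

# Back-transform predictions to the original units: y_hat = (b0 + b1*x)^2.
y_hat = (beta[0] + beta[1] * x) ** 2
print("predictions in original units:", np.round(y_hat, 2))
```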

5.3 Transformations to Linearize the Model

The assumption of a linear relationship between y and the regressors may be violated. Nonlinearity may be detected via a lack-of-fit test, a scatter diagram, the matrix of scatterplots, or residual plots such as the partial regression plot. Some nonlinear models are called intrinsically or transformably linear if the corresponding nonlinear function can be linearized by a suitable transformation.

Several linearizable functions and their corresponding linear forms are shown in Figure 5.4 and Table 5.4.

Example 5.2 The Windmill Data: A research engineer is investigating the use of a windmill to generate electricity. He collects data on the DC output (y) and the corresponding wind velocity (x). The data are listed in Table 5.5, and Figure 5.5 is the scatter diagram.

From Figure 5.5, the relationship between y and x may be nonlinear. Assume the simple linear regression model y = β₀ + β₁x + ε; the summary statistics for this model are R² = 0.8745, MS_Res = 0.0557, and F₀ = 160.26. A plot of the residuals versus the fitted values is shown in Figure 5.6. From this plot, clearly some other model form should be considered.

Because the DC output appears to approach an upper asymptote as wind velocity increases, a reciprocal model y = β₀ + β₁(1/x) + ε is assumed. Figure 5.7 is a scatter diagram with the transformed variable x' = 1/x, and the new regression model is fitted by least squares. The summary statistics are R² = 0.9800, MS_Res = 0.0089, and F₀ = 1128.43. Figure 5.8 plots the R-student residuals from the transformed model versus the fitted values, and Figure 5.9 is the normal probability plot (which shows heavy tails).
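A sketch of the reciprocal-transform fit (illustrative values, not the Table 5.5 data):

```python
import numpy as np

# Illustrative wind velocities (mph) and DC outputs; the real data are in Table 5.5.
x = np.array([2.45, 3.05, 4.10, 5.00, 6.20, 7.40, 8.80, 10.00])
y = np.array([0.12, 0.56, 1.19, 1.54, 1.87, 2.09, 2.26, 2.32])

# Transform the regressor: x' = 1/x, then fit y = b0 + b1*x' by least squares.
x_prime = 1.0 / x
X = np.column_stack([np.ones_like(x_prime), x_prime])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("b0, b1:", beta)

# As x grows, 1/x -> 0, so b0 estimates the asymptotic maximum DC output.
print("estimated asymptote:", beta[0])
```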

5.4 Analytical Methods for Selecting a Transformation

While in many instances transformations are selected empirically, more formal, objective techniques can be applied to help specify an appropriate transformation.

5.4.1 Transformations on y: The Box-Cox Method

We want to transform y to correct nonnormality and/or nonconstant variance, using the power transformation y^λ.

Box and Cox (1964) show how the transformation parameter λ and the parameters of the regression model can be estimated simultaneously using the method of maximum likelihood. Use

y(λ) = (y^λ − 1) / (λ ẏ^(λ−1)) for λ ≠ 0, and y(λ) = ẏ ln y for λ = 0,

where ẏ = exp[(1/n) Σ ln yᵢ] is the geometric mean of the observations, and fit the model y(λ) = Xβ + ε. The divisor ẏ^(λ−1) is related to the Jacobian of the transformation converting the response variable y into y(λ); it puts the residual sums of squares for different values of λ on a comparable scale.

Computational procedure: choose λ to minimize SS_Res(λ). Use 10 to 20 values of λ to compute SS_Res(λ), plot SS_Res(λ) versus λ, and read from the graph the value of λ that minimizes SS_Res(λ). A second iteration can be performed using a finer mesh of values if desired. We cannot select λ by directly comparing residual sums of squares from the regressions of y^λ on x, because each such regression is on a different scale. Once λ is selected, the analyst is free to fit the model using y^λ (λ ≠ 0) or ln y (λ = 0).
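A minimal sketch of this grid search for a simple linear regression (assuming a predictor x and a positive response y are already loaded):

```python
import numpy as np

def boxcox_ssres(x, y, lambdas):
    """SS_Res(lambda) for the scaled Box-Cox transform, comparable across lambda."""
    gm = np.exp(np.mean(np.log(y)))          # geometric mean of y
    X = np.column_stack([np.ones_like(x), x])
    out = []
    for lam in lambdas:
        if abs(lam) < 1e-8:
            z = gm * np.log(y)               # limiting case lambda = 0
        else:
            z = (y**lam - 1.0) / (lam * gm**(lam - 1.0))
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        out.append(np.sum((z - X @ beta) ** 2))
    return np.array(out)

# Grid search: pick the lambda minimizing SS_Res, then refine on a finer mesh.
lambdas = np.linspace(-1.0, 2.0, 16)
# ss = boxcox_ssres(x, y, lambdas); lam_hat = lambdas[np.argmin(ss)]
```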

An approximate confidence interval for λ can be useful in selecting the final value of λ. For example, if 0.596 is the value that minimizes SS_Res(λ) but 0.5 is inside the C.I., we would prefer λ = 0.5; if 1 is inside the C.I., no transformation may be necessary. The interval comes from maximizing the log-likelihood L(λ): an approximate 100(1 − α)% C.I. for λ consists of those values satisfying L(λ̂) − L(λ) ≤ (1/2) χ²(α, 1).

In terms of residual sums of squares, let SS* = SS_Res(λ̂) exp[χ²(α, 1)/ν]; the C.I. consists of all λ with SS_Res(λ) ≤ SS*. SS* can be approximated by SS_Res(λ̂)[1 + χ²(α, 1)/ν] or SS_Res(λ̂)[1 + t²(α/2, ν)/ν], where ν is the number of residual degrees of freedom. This is based on exp(x) = 1 + x + x²/2! + … ≈ 1 + x.
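Continuing the sketch above, the approximate interval can be read off the same grid (here 3.84, the upper 5% point of χ² with 1 degree of freedom, is hard-coded for a 95% interval):

```python
import numpy as np

def boxcox_ci(lambdas, ss_res, nu, chi2_crit=3.84):
    """Approximate C.I. for lambda: all grid values with SS_Res <= SS*.

    nu is the residual degrees of freedom of the fit; chi2_crit = 3.84
    corresponds to a 95% interval.
    """
    ss_star = ss_res.min() * np.exp(chi2_crit / nu)
    inside = lambdas[ss_res <= ss_star]
    return inside.min(), inside.max(), ss_star
```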

Example 5.3 The Electric Utility Data: Use the Box-Cox procedure to select a variance-stabilizing transformation. The values of SS_Res(λ) for various values of λ are shown in Table 5.7, and a graph of the residual sum of squares versus λ is shown in Figure 5.10. The optimal value is λ = 0.5. For an approximate 95% C.I., the critical sum of squares SS* is 104.23, giving the interval [0.26, 0.80].

5.4.2 Transformations on the Regressor Variables

Suppose that the relationship between y and one or more of the regressor variables is nonlinear, but that the usual assumptions of normally and independently distributed responses with constant variance are at least approximately satisfied. We want to select an appropriate transformation on the regressor variables so that the relationship between y and the transformed regressor is as simple as possible. Box and Tidwell (1962) describe an analytical procedure for determining the form of the transformation on x.

Assume that y is related to the transformed regressor ξ = x^α (with ξ = ln x when α = 0), i.e., E(y) = β₀ + β₁ξ. The Box-Tidwell procedure estimates α iteratively: fit y on x to obtain β̂₁; then fit y on x and w = x ln x, and compute α₁ = (γ̂/β̂₁) + 1, where γ̂ is the coefficient of w; replace x by x^α₁ and repeat until convergence.

Box and Tidwell (1962) note that this procedure usually converges quite rapidly, and often the first-stage result α₁ is a satisfactory estimate of α. Convergence problems may be encountered when the error standard deviation σ is large or when the range of the regressor is very small compared to its mean. Example 5.4 The Windmill Data: Figure 5.5 suggests that the relationship between y and x is not a straight line, so the Box-Tidwell procedure can be used to choose a power of x.
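A minimal sketch of the Box-Tidwell iteration for a single positive regressor (my own illustration of the procedure described above, not code from the text):

```python
import numpy as np

def box_tidwell_alpha(x, y, n_iter=5):
    """Box-Tidwell estimate of the power alpha in E(y) = b0 + b1 * x**alpha.

    Requires x > 0. Each stage regresses y on the current transformed
    regressor xt and on w = xt * ln(xt), then composes the power update.
    """
    alpha = 1.0
    for _ in range(n_iter):
        xt = x ** alpha
        # Stage 1: slope of y on the current transformed regressor alone.
        X1 = np.column_stack([np.ones_like(xt), xt])
        b = np.linalg.lstsq(X1, y, rcond=None)[0]
        # Stage 2: add w = xt*ln(xt); its coefficient gives the adjustment.
        w = xt * np.log(xt)
        X2 = np.column_stack([np.ones_like(xt), xt, w])
        g = np.linalg.lstsq(X2, y, rcond=None)[0]
        alpha *= g[2] / b[1] + 1.0       # compose powers: x**(alpha_old*update)
    return alpha
```

Applied to windmill-style data, the estimate should settle near α = −1, matching the reciprocal transform used in Example 5.2.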

5.5 Generalized and Weighted Least Squares

Linear regression models with nonconstant error variance can also be fitted by the method of weighted least squares, choosing weights wᵢ ∝ 1/Var(εᵢ). For simple linear regression, the criterion is S(β₀, β₁) = Σ wᵢ(yᵢ − β₀ − β₁xᵢ)², and the normal equations are

β̂₀ Σ wᵢ + β̂₁ Σ wᵢxᵢ = Σ wᵢyᵢ
β̂₀ Σ wᵢxᵢ + β̂₁ Σ wᵢxᵢ² = Σ wᵢxᵢyᵢ

5.5.1 Generalized Least Squares

Model: y = Xβ + ε, where we assume E(ε) = 0 and Var(ε) = σ²V. Since σ²V is a covariance matrix, V must be nonsingular and positive definite, so there exists a nonsingular symmetric matrix K such that V = K'K = KK; K is called the square root of V. Multiplying the model by K⁻¹ gives the new model z = Bβ + g, where z = K⁻¹y, B = K⁻¹X, and g = K⁻¹ε.

S()=(y - X )’ V-1 (y - X ) E(g) = 0, Var(g) = σ2 I The least-squares functions: S()=(y - X )’ V-1 (y - X )

5.5.2 Weighted Least Squares

Assume that the errors are uncorrelated but have unequal variances, so that V is diagonal. The estimation procedure is then usually called weighted least squares. W = V⁻¹ is also a diagonal matrix, with diagonal elements (weights) w₁, …, wₙ.

The normal equations are (X'WX)β̂ = X'Wy, so the weighted least-squares estimator is β̂ = (X'WX)⁻¹X'Wy. Equivalently, multiplying each observation (including the intercept column) by √wᵢ yields a transformed set of data to which ordinary least squares can be applied.
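A matching sketch for the diagonal-W case:

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares beta_hat = (X'WX)^-1 X'W y for diagonal W.

    Implemented by scaling each row of X and entry of y by sqrt(w_i)
    and running ordinary least squares, which is numerically equivalent.
    """
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta
```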

5.5.3 Some Practical Issues

To use weighted least squares, the weights wᵢ must be known. Sometimes prior knowledge, experience, or information from a theoretical model can be used to determine the weights. Alternatively, residual analysis may indicate that the variance of the errors is a function of one of the regressors, say Var(εᵢ) = σ²xᵢⱼ, so that wᵢ = 1/xᵢⱼ. In some cases yᵢ is actually an average of nᵢ observations at xᵢ, and if all original observations have constant variance σ², then Var(yᵢ) = Var(εᵢ) = σ²/nᵢ, so that wᵢ = nᵢ.

Another possibility is weights inversely proportional to the variances of the measurement errors. Several iterations may be needed: guess at the weights, perform the analysis, and re-estimate the weights from the results. When Var(ε) = σ²V and V ≠ I, the ordinary least-squares estimator β̂ = (X'X)⁻¹X'y is still unbiased, with covariance matrix Var(β̂) = σ²(X'X)⁻¹X'VX(X'X)⁻¹. This estimator is no longer a minimum-variance estimator, because the covariance matrix of the generalized least-squares estimator gives smaller variances for the regression coefficients.

Example 5.5 Weighted Least Squares: For 30 restaurants, model average monthly food sales (y) versus annual advertising expenses (x) (Table 5.9). Using ordinary least squares, Figure 5.11 plots the residuals versus the fitted values; this figure indicates a violation of the constant-variance assumption. Treat observations with similar x values (near-neighbors) as repeat points in order to estimate the error variance locally.

Regress these local variance estimates on x, and let the fitted values from this equation be the inverses of the weights. With the resulting weighted least-squares fit, plot the weighted residuals √wᵢeᵢ versus the weighted fitted values √wᵢŷᵢ (Figure 5.12). With several regressors it is not easy to identify near-neighbors, so one should always check whether the weighting procedure is reasonable.
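A rough sketch of this weighting strategy; for simplicity it estimates the variance function by regressing squared OLS residuals on x instead of forming near-neighbor groups (a simplification of the text's method), with simulated data standing in for Table 5.9:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(3, 20, 30)
y = 50 + 8 * x + rng.normal(scale=0.5 * x)   # error sd grows with x

# Step 1: ordinary least squares and its squared residuals.
X = np.column_stack([np.ones_like(x), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e2 = (y - X @ b_ols) ** 2

# Step 2: estimated variance function s2_hat(x), by regressing e^2 on x.
c = np.linalg.lstsq(X, e2, rcond=None)[0]
s2_hat = np.clip(X @ c, 1e-6, None)          # guard against nonpositive fits

# Step 3: weights are the inverses of the fitted variances; refit by WLS.
w = 1.0 / s2_hat
sw = np.sqrt(w)
b_wls = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
print("OLS:", b_ols, " WLS:", b_wls)
```

After the refit, the weighted residuals √wᵢeᵢ plotted against √wᵢŷᵢ should no longer show a funnel if the variance function was modeled adequately.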