Remedial measures … or “how to fix problems with the model”
Transforming the data so that the simple linear regression model is okay for the transformed data.

Options for fixing problems with the model
Abandon the simple linear regression model and find a more appropriate (but typically more complex) model.
Transform the data so that the simple linear regression model works for the transformed (new) data.

Abandoning the model
If the relationship is not linear: try a different function, like a quadratic (Ch. 7) or an exponential function (Ch. 13).
If the error variances are unequal: use weighted least squares (Ch. 10).
If the error terms are not independent: try fitting a time series model (Ch. 12).
If important predictor variables are omitted: try fitting a multiple regression model (Ch. 6).
If there are outliers: use a robust estimation procedure (Ch. 10).
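As an illustration of one of these alternatives, here is a minimal sketch of weighted least squares for a single predictor. The data values and the variance model (error variance proportional to x squared) are assumptions made up for this example; they do not come from the slides. The function name `weighted_least_squares` is also hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit y = b0 + b1*x by minimizing sum(w_i * (y_i - b0 - b1*x_i)**2)."""
    sw = sum(w)
    # Weighted means of x and y
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted cross-product and sum of squares
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical data whose scatter grows with x (not from the slides)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.3, 7.6, 10.9, 11.2]

# Assumed variance model: Var(error) proportional to x**2, so weight = 1/x**2
w = [1.0 / xi ** 2 for xi in x]

b0_wls, b1_wls = weighted_least_squares(x, y, w)
b0_ols, b1_ols = weighted_least_squares(x, y, [1.0] * len(x))  # equal weights = ordinary least squares

print("WLS:", b0_wls, b1_wls)
print("OLS:", b0_ols, b1_ols)
```

With equal weights the function reduces to ordinary least squares, which makes it easy to compare the two fits side by side.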

Choices for transforming the data
Transform X values only.
Transform Y values only.
Transform both X and Y values simultaneously.

If the only thing wrong with your model is that a linear function doesn’t fit …
Try transforming only the X values.
You wouldn’t want to transform the Y values here, because you might change well-behaved error terms (normal, equal variances) into badly behaved error terms (not normal, unequal variances).

Example 1: Memory retention
Subjects were asked to memorize a list of disconnected items, then asked to recall them at various times up to a week later.
Predictor time = time, in minutes, since the list was initially memorized.
Response prop = proportion of items recalled correctly.

Example 1: Fitted line plot

Example 1: Residual vs. fits plot

Example 1: Normal probability plot

Example 1: Transform the X data
Columns: time, prop, log10_time.
Change (“transform”) the predictor time to log10(time).
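The transformation step can be sketched in a few lines of Python. The `time` and `prop` values below are hypothetical stand-ins, constructed to be roughly linear in log10(time); they are not the data from the slides. The helper `fit_line` is likewise an illustrative name.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1, sxy ** 2 / (sxx * syy)


# Hypothetical stand-in data (minutes since memorizing, proportion recalled)
time = [1, 5, 15, 30, 60, 120, 240, 480, 720, 1440, 2880, 5760, 10080]
prop = [0.9 - 0.2 * math.log10(t) + 0.01 * (-1) ** i for i, t in enumerate(time)]

# Transform the predictor only; the response is left untouched
log10_time = [math.log10(t) for t in time]

_, _, r2_raw = fit_line(time, prop)
b0, b1, r2_log = fit_line(log10_time, prop)

print("R-squared, prop vs time:       ", round(r2_raw, 3))
print("R-squared, prop vs log10(time):", round(r2_log, 3))
```

Because only X changes, the error terms attached to Y keep whatever distribution they had before the transformation.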

Example 1: New fitted line plot

Example 1: Predicting new proportion
Estimated regression function:
Therefore, we predict that the proportion of words recalled after 1000 days is:

Example 1: New residuals vs. fits plot

Example 1: Normal probability plot

Some possible transformations of X
These are guidelines only, not a complete list. It usually takes some trial and error to find the best transformation.

Example 1: Time* = 1/Time

Example 1: Time* = exp(-Time)
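The trial-and-error search over transformations of X can be sketched as a loop that scores each candidate. The data are hypothetical (built to be roughly linear in log10(time)), and the square-root candidate is an extra assumption not shown in the slides. R-squared is used here only as a rough screening number; residual and normal probability plots are still needed to judge a fit.

```python
import math


def r_squared(x, y):
    """Squared correlation between x and y (R-squared of a simple linear fit)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical stand-in data, not the values from the slides
time = [1, 5, 15, 30, 60, 120, 240, 480, 720, 1440, 2880, 5760, 10080]
prop = [0.9 - 0.2 * math.log10(t) + 0.01 * (-1) ** i for i, t in enumerate(time)]

# Candidate transformations of the predictor
candidates = {
    "time": lambda t: t,
    "log10(time)": math.log10,
    "1/time": lambda t: 1.0 / t,
    "exp(-time)": lambda t: math.exp(-t),
    "sqrt(time)": math.sqrt,  # assumption: not among the slide examples
}

scores = {name: r_squared([f(t) for t in time], prop) for name, f in candidates.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:12s} R-squared = {score:.3f}")
print("Highest R-squared:", best)
```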

If there is evidence of non-normality and unequal error variances …
Since it is the shapes and spreads of the Y distributions that need to be changed, try transforming the Y values.
Transforming Y may also help “straighten out” a curved relationship.
You may also need to transform the X values simultaneously.

Example 2: Gestation time and birthweight for mammals
Columns: Mammal, Birthwgt, Gestation.
Mammals: Goat, Sheep, Deer, Porcupine, Bear, Hippo, Horse, Camel, Zebra, Giraffe, Elephant.
Predictor Birthwgt = birthweight, in kg, of mammal.
Response Gestation = number of days until birth.

Example 2: Fitted line plot

Example 2: Residual vs. fits plot

Example 2: Normal probability plot

Example 2: Transform the Y data
Columns: Mammal, Birthwgt, Gestation, logGest.
Mammals: Goat, Sheep, Deer, Porcupine, Bear, Hippo, Horse, Camel, Zebra, Giraffe, Elephant.
Change (“transform”) the response Gestation to log10(Gestation).

Example 2: New fitted line plot

Example 2: Predicting new gestation
Estimated regression function:
Therefore, since:
we predict the gestation length of another mammal with a birthweight of 50 kg to be:
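When the response has been transformed to log10(Gestation), a prediction has to be transformed back to days by raising 10 to the fitted value. A minimal sketch follows, using made-up birthweights and gestation lengths (not the mammal data from the slides); `fit_line` and `predict_gestation` are hypothetical helper names.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


# Hypothetical data, generated so that log10(gestation) is exactly linear in birthweight
birthwgt = [1.0, 3.0, 5.0, 10.0, 20.0, 40.0, 60.0, 80.0, 100.0]
gestation = [10 ** (2.2 + 0.004 * w) for w in birthwgt]

# Transform the response, then fit the line on the log10 scale
log_gest = [math.log10(g) for g in gestation]
b0, b1 = fit_line(birthwgt, log_gest)


def predict_gestation(weight_kg):
    """Predict on the log10 scale, then back-transform to days."""
    return 10 ** (b0 + b1 * weight_kg)


print("Fitted: log10(Gestation) =", round(b0, 4), "+", round(b1, 4), "* Birthwgt")
print("Predicted gestation at 50 kg (days):", round(predict_gestation(50.0), 1))
```

The fitted line itself predicts log10(days); reporting that number directly as days is a common slip that the back-transformation avoids.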

Example 2: New residual vs fits plot

Example 2: New normal probability plot

Some possible transformations of Y if the errors are not normal and have unequal variances
These are guidelines only. It usually takes trial and error to find the best transformation, and possibly a simultaneous transformation of X.

Example 3: Length and Weight of Alligators

Example 3: Residuals vs fits plot

Example 3: Normal probability plot

Example 3: Transform the data
Columns: weight, length, loge_wt, loge_len … and so on …
Transform predictor weight to log_e(weight).
Transform response length to log_e(length).
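Transforming both variables with the natural log turns a power-law relationship, length = a * weight^b, into a straight line, log_e(length) = log_e(a) + b * log_e(weight). The sketch below uses made-up values that follow an exact power law; they are an assumption for illustration, not the alligator data from the slides.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


# Hypothetical data following length = 2 * weight**(1/3) exactly
weight = [10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 640.0]
length = [2.0 * w ** (1.0 / 3.0) for w in weight]

# Transform both the predictor and the response
loge_wt = [math.log(w) for w in weight]
loge_len = [math.log(v) for v in length]

b0, b1 = fit_line(loge_wt, loge_len)

# Back on the original scale: length = exp(b0) * weight**b1
print("Estimated exponent b:", round(b1, 4))
print("Estimated multiplier a:", round(math.exp(b0), 4))
```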

Example 3: New fitted line plot

Example 3: New residual plot

Example 3: New normal probability plot

Transforming data in Minitab
Select Calc >> Calculator …
In the box labeled “Store result in variable,” tell Minitab in which column (variable) you want the transformed data stored.
Type the expression for the desired transformation in the box labeled “Expression.” Use the available functions.
Select OK. The transformed data will appear in the worksheet column that you specified.