More on data transformations: No recipes, but some advice.


If the primary problem is non-linearity, look at a scatter plot of the data to suggest plausible transformations. It is possible to use transformations other than ln(x) and ln(y).
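As a rough illustration of comparing candidate transformations of x, the sketch below fits a straight line to y against several re-expressions of x and reports R² for each. The data are synthetic (generated from a logarithmic curve purely for this example), and the helper `fit_line` and the list of candidates are assumptions of the sketch, not part of the original slides. R² is only a screening aid here; the scatter plot and residual plots should still drive the choice.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1, sxy ** 2 / (sxx * syy)


# Synthetic data (an assumption for illustration): y rises quickly, then levels off.
xs = list(range(1, 21))
ys = [2.0 + 3.0 * math.log(x) for x in xs]

# Hypothetical shortlist of re-expressions of x to try.
CANDIDATES = {
    "x": lambda x: x,
    "ln(x)": math.log,
    "sqrt(x)": math.sqrt,
    "1/x": lambda x: 1.0 / x,
}

r_squared_by_transform = {
    name: fit_line([f(x) for x in xs], ys)[2] for name, f in CANDIDATES.items()
}
best = max(r_squared_by_transform, key=r_squared_by_transform.get)

for name, r2 in r_squared_by_transform.items():
    print(f"{name:8s} R^2 = {r2:.4f}")
print("Most linear re-expression in this toy example:", best)
```

Because the toy data were built from ln(x), that transformation wins by construction; with real data the candidates will usually be much closer together.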

Try fitting [model equation not preserved in this transcript] if the trend in your data follows the patterns shown [figure not preserved]. (This prompt appears on five consecutive slides, each pairing a different candidate model with example curve shapes.)

If the variances are unequal and/or error terms are not normal, try a “power transformation” on y.

Family of power transformations
A power transformation on y involves transforming the response by taking it to some power λ. That is: y* = y^λ. Most commonly, for interpretation reasons, λ is a number between -1 and 2, such as -1, -0.5, 0, 0.5, (1), 1.5, and 2; λ = 1 is parenthesized because it leaves y unchanged. When λ = 0, the transformation is taken to be the natural log transformation. That is: y* = ln(y).
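The family described above can be written as one small function. This is a minimal sketch: the name `power_transform`, the made-up response values, and the restriction to strictly positive y (needed for the log and for negative or fractional powers) are assumptions of the example.

```python
import math


def power_transform(ys, lam):
    """Raise each response to the power lam, using ln(y) when lam == 0."""
    # Assumption of this sketch: the response is strictly positive.
    if any(y <= 0 for y in ys):
        raise ValueError("this sketch assumes a strictly positive response")
    if lam == 0:
        return [math.log(y) for y in ys]
    return [y ** lam for y in ys]


# The commonly used values of lambda listed on the slide.
LAMBDAS = [-1, -0.5, 0, 0.5, 1, 1.5, 2]

response = [1.0, 4.0, 9.0, 16.0]  # made-up values for illustration
for lam in LAMBDAS:
    transformed = power_transform(response, lam)
    print(f"lambda = {lam:>4}: {[round(v, 3) for v in transformed]}")
```

In practice each transformed response would be refit and its residual plots checked, rather than choosing λ from the transformed values alone.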

If the variances are unequal, try “stabilizing the variance” by transforming y.

If the response y is a Poisson count… A common (now archaic?) recommendation is to transform the response using the square root transformation, y* = √y, and stay within the linear regression framework. Perhaps, now, the advice should be to use Poisson regression.
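To see why the square root was the traditional choice, the simulation sketch below draws Poisson counts at several means and compares the variance before and after the transformation. The means, sample size, seed, and helper names are arbitrary assumptions for illustration; the sampler is Knuth's simple multiplication method, used only so the example needs nothing beyond the standard library.

```python
import math
import random
import statistics


def poisson_sample(mean, rng):
    """Draw one Poisson count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def variances_by_mean(means, n_draws=2000, seed=0):
    """Return {mean: (variance of y, variance of sqrt(y))} from simulated counts."""
    rng = random.Random(seed)
    results = {}
    for mean in means:
        counts = [poisson_sample(mean, rng) for _ in range(n_draws)]
        raw_var = statistics.variance(counts)
        sqrt_var = statistics.variance([math.sqrt(c) for c in counts])
        results[mean] = (raw_var, sqrt_var)
    return results


results = variances_by_mean([10, 40, 100])
for mean, (raw_var, sqrt_var) in results.items():
    print(f"mean {mean:>3}: var(y) = {raw_var:7.2f}   var(sqrt(y)) = {sqrt_var:.3f}")
```

The raw variance grows with the mean, while the variance of √y stays close to 1/4 regardless of the mean, which is the sense in which the transformation stabilizes the variance. This only illustrates the older recommendation; it does not fit a Poisson regression.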

If the response y is a binomial proportion... A common (now archaic?) recommendation is to transform the response using the arcsine transformation, y* = arcsin(√y), and stay within the linear regression framework. Perhaps, now, the advice should be to use a form of logistic regression.
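The same kind of check can be sketched for proportions. The simulation below assumes the arcsine transformation means the usual arcsine of the square root, and uses made-up success probabilities, a made-up number of trials, and hypothetical helper names; it compares the variance of the raw proportion with the variance of the transformed proportion.

```python
import math
import random
import statistics


def simulated_proportions(p, n_trials, n_reps, rng):
    """Simulate n_reps binomial proportions, each based on n_trials Bernoulli(p) draws."""
    return [
        sum(rng.random() < p for _ in range(n_trials)) / n_trials
        for _ in range(n_reps)
    ]


def arcsine_sqrt(proportion):
    """Arcsine square-root transformation of a proportion in [0, 1]."""
    return math.asin(math.sqrt(proportion))


def proportion_variances(probabilities, n_trials=200, n_reps=2000, seed=0):
    """Return {p: (variance of proportion, variance of transformed proportion)}."""
    rng = random.Random(seed)
    results = {}
    for p in probabilities:
        props = simulated_proportions(p, n_trials, n_reps, rng)
        raw_var = statistics.variance(props)
        transformed_var = statistics.variance([arcsine_sqrt(v) for v in props])
        results[p] = (raw_var, transformed_var)
    return results


results = proportion_variances([0.1, 0.5, 0.9])
for p, (raw_var, transformed_var) in results.items():
    print(f"p = {p}: var(y) = {raw_var:.5f}   var(arcsin(sqrt(y))) = {transformed_var:.5f}")
```

The raw variance depends on p, being largest near 0.5, while the transformed variance stays near 1/(4n), here 1/800 = 0.00125. As with counts, this shows the motivation for the older advice rather than the logistic regression alternative.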

If the response y isn’t anything special… A common recommendation is to try the natural log transformation, y* = ln(y), or the reciprocal transformation, y* = 1/y.

It’s okay to remove some data points to make the transformation work better. Just make sure you report the scope of the model.

It’s better to give up some model fit than to lose clear interpretations. Just make sure you report that that’s what you did.