Stat 112 Notes 11 Today: –Fitting Curvilinear Relationships (Chapter 5). Homework 3 is due Friday. I will e-mail Homework 4 tonight, but it will not be due for two weeks.

Stat 112 Notes 11 Today: –Fitting Curvilinear Relationships (Chapter 5). Homework 3 is due Friday. I will e-mail Homework 4 tonight, but it will not be due for two weeks (October 26th).

Curvilinear Relationships The relationship between Y and X is curvilinear if E(Y|X) is not a straight line. The linearity assumption of the simple linear regression model is violated for a curvilinear relationship. Approaches to estimating E(Y|X) for a curvilinear relationship: –Polynomial Regression –Transformations

Transformations Curvilinear relationship: E(Y|X) is not a straight line. Another approach to fitting curvilinear relationships is to transform Y or X. Transformations: perhaps E(f(Y)|g(X)) is a straight line, where f(Y) and g(X) are transformations of Y and X, and a simple linear regression model holds for the response variable f(Y) and explanatory variable g(X).
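The transform-then-fit idea can be sketched in a few lines of code. This is an illustration only: the data below are made up, and the computation mimics what JMP's Fit Special does when you ask for a fit of Y against log(X).

```python
import math

# Hypothetical (x, y) pairs with a curved relationship
# (y is roughly linear in log x, not in x).
x = [1, 2, 4, 8, 16, 32]
y = [0.1, 0.8, 1.4, 2.2, 2.7, 3.6]

def slr(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Transform X with g(x) = log(x), leave Y alone, then fit the usual
# simple linear regression to (g(x), y).
gx = [math.log(v) for v in x]
b0, b1 = slr(gx, y)
```

The fitted line b0 + b1*log(x) is then a curvilinear estimate of E(Y|X) on the original scale.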

Curvilinear Relationship Y = Life Expectancy in 1999, X = Per Capita GDP (in US Dollars) in 1999. Data in gdplife.JMP. The linearity assumption of simple linear regression is clearly violated: the increase in mean life expectancy for each additional dollar of GDP is smaller for large GDPs than for small GDPs. Decreasing returns to increases in GDP.

The mean of Life Expectancy | Log Per Capita appears to be approximately a straight line.

How do we use the transformation? Testing for association between Y and X: if the simple linear regression model holds for f(Y) and g(X), then Y and X are associated if and only if the slope in the regression of f(Y) on g(X) does not equal zero. The p-value for the test that the slope is zero is <.0001: strong evidence that per capita GDP and life expectancy are associated. Prediction and mean response: What would you predict the life expectancy to be for a country with a per capita GDP of $20,000?
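The mechanics of the prediction can be sketched as follows. The coefficients here are invented for illustration (the real values come from the JMP output for gdplife.JMP); the point is that the transformed predictor, log(GDP), is what gets plugged into the fitted line.

```python
import math

# Hypothetical fitted model (made-up coefficients, NOT the gdplife.JMP values):
#   predicted life expectancy = b0 + b1 * log(per capita GDP)
b0, b1 = 20.0, 5.5

def predict_life_expectancy(gdp):
    # Plug the *transformed* predictor into the fitted line.
    return b0 + b1 * math.log(gdp)

pred = predict_life_expectancy(20000)
```

Because only X was transformed, the prediction is already on the original scale of Y (years of life expectancy); no back-transformation is needed.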

How do we choose a transformation? Tukey’s Bulging Rule. See Handout. Match curvature in data to the shape of one of the curves drawn in the four quadrants of the figure in the handout. Then use the associated transformations, selecting one for either X, Y or both.
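The rule can be summarized as a lookup table. This is my paraphrase of the standard bulging rule, not a reproduction of the handout's figure: for each direction the scatterplot "bulges," it lists the ladder-of-powers moves worth trying.

```python
# Tukey's bulging rule as a lookup: key = direction of the bulge,
# value = candidate transformations of X and/or Y to try.
BULGING_RULE = {
    ("up", "left"):    ["lower x: log(x), sqrt(x)", "raise y: y**2"],
    ("up", "right"):   ["raise x: x**2", "raise y: y**2"],
    ("down", "left"):  ["lower x: log(x), sqrt(x)", "lower y: log(y), sqrt(y)"],
    ("down", "right"): ["raise x: x**2", "lower y: log(y), sqrt(y)"],
}

def suggest(vertical, horizontal):
    """Return candidate transformations for a given bulge direction."""
    return BULGING_RULE[(vertical, horizontal)]
```

For example, the GDP/life-expectancy scatter bulges up and to the left, which suggests moving X down the ladder (log X), exactly the transformation that worked above.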

Transformations in JMP
1. Use Tukey's Bulging rule (see handout) to determine transformations that might help.
2. After Fit Y by X, click the red triangle next to Bivariate Fit and click Fit Special. Experiment with the transformations suggested by Tukey's Bulging rule.
3. Make residual plots of the residuals for the transformed model vs. the original X by clicking the red triangle next to Transformed Fit to … and clicking Plot Residuals. Choose transformations that leave no pattern in the mean of the residuals vs. X.
4. Compare different transformations by looking for the transformation with the smallest root mean square error on the original y-scale. If a transformation involves transforming y, look at the root mean square error for the fit measured on the original scale.
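Step 4 is the subtle one: when y has been transformed, the models must still be compared on the original y-scale. A sketch of that comparison, with made-up data (this mimics what JMP reports as "Fit Measured on Original Scale"):

```python
import math

# Hypothetical data with a curved trend (illustration only).
x = [1, 2, 4, 8, 16, 32, 64]
y = [2.0, 3.1, 3.9, 5.2, 5.8, 7.1, 7.9]

def fit(xs, ys):
    """Least-squares intercept and slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
          / sum((a - mx) ** 2 for a in xs))
    return my - b1 * mx, b1

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

candidates = {}

# Untransformed fit.
b0, b1 = fit(x, y)
candidates["linear"] = rmse(y, [b0 + b1 * v for v in x])

# Transform x only: predictions are already on the original y-scale.
b0, b1 = fit([math.log(v) for v in x], y)
candidates["log x"] = rmse(y, [b0 + b1 * math.log(v) for v in x])

# Transform y: back-transform the predictions with exp() so that all
# models are compared on the original y-scale.
b0, b1 = fit(x, [math.log(v) for v in y])
candidates["log y"] = rmse(y, [math.exp(b0 + b1 * v) for v in x])

best = min(candidates, key=candidates.get)
```

For these particular (invented) data, the log x transformation gives the smallest original-scale RMSE, so it would be the one to keep.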

By looking at the root mean square error on the original y-scale, we see that all of the transformations improve upon the untransformed model and that the transformation to log x is by far the best.

The transformation to Log X appears to have mostly removed the trend in the mean of the residuals, meaning the linearity assumption is now approximately satisfied. There is still a problem of nonconstant variance.

Comparing models for curvilinear relationships In comparing two transformations, use the transformation with the lower RMSE, measuring the fit on the original y-scale if y was transformed. In comparing transformations to polynomial regression models, compare the RMSE of the best transformation to that of the best polynomial regression model (selected using the criterion from Note 10). If the transformation's RMSE is close to (e.g., within 1% of) but not as small as the polynomial regression's, it is still reasonable to use the transformation on the grounds of parsimony.

Transformations and Polynomial Regression for Display.JMP

Model               RMSE
Linear              51.59
log x
1/x
Fourth order poly

The fourth order polynomial is the best polynomial regression model using the criterion on slide 10. The fourth order polynomial is the best model overall: it has the smallest RMSE by a considerable amount (more than a 1% advantage over the best transformation, 1/x).

Interpreting the Coefficient on Log X
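The slide's derivation can be written out directly from the model: in a regression of Y on log X, the slope measures the effect of *multiplying* X by a factor, not of adding to it.

```latex
% Model: E(Y \mid X) = \beta_0 + \beta_1 \log X.
% Multiplying X by a factor c changes the mean of Y by a fixed amount:
E(Y \mid cX) - E(Y \mid X)
  = \beta_1 \log(cX) - \beta_1 \log(X)
  = \beta_1 \log c .
% In particular, for a 1% increase in X (c = 1.01):
\beta_1 \log(1.01) \approx 0.01\,\beta_1 .
```

So a 1% increase in X is associated with an increase of about 0.01·β₁ in the mean of Y, regardless of the starting value of X.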

Log Transformation of Both X and Y variables It is sometimes useful to transform both the X and Y variables. A particularly common transformation is to transform X to log(X) and Y to log(Y)

Heart Disease-Wine Consumption Data (heartwine.JMP)

Evaluating Transformed Y Variable Models The log-log transformation provides slightly better predictions than the simple linear regression model.

Interpreting Coefficients in Log-Log Models
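The standard interpretation can be derived in two lines from the fitted model (this is the usual elasticity reading of a log-log slope):

```latex
% Fitted log-log model: \widehat{\log y} = b_0 + b_1 \log x.
% Multiplying x by a factor c multiplies the predicted y by c^{b_1}:
\hat{y}(cx) = e^{\,b_0 + b_1 \log(cx)}
            = e^{\,b_0 + b_1 \log x}\, e^{\,b_1 \log c}
            = \hat{y}(x)\, c^{\,b_1} .
% For a 1% increase in x (c = 1.01):
1.01^{\,b_1} \approx 1 + 0.01\, b_1 .
```

So a 1% increase in x changes the predicted y by approximately b₁ percent; b₁ plays the role of an elasticity.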

Another interpretation of coefficients in log-log models

Another Example of Transformations: Y = count of tree seeds, X = weight of tree

By looking at the root mean square error on the original y-scale, we see that both of the transformations improve upon the untransformed model and that the transformation to log y and log x is by far the best.

Comparison of Transformations to Polynomials for Tree Data

Prediction using the log y/log x transformation What is the predicted seed count of a tree that weighs 50 mg? Math trick: exp{log(y)} = y (remember: by log we always mean the natural log, ln). So if the fitted model is log(y-hat) = b0 + b1·log(x), then y-hat = exp{b0 + b1·log(x)}.
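The back-transformation step can be sketched as follows. The coefficients are invented for illustration (they are not the tree-seed coefficients from the lecture's JMP output); the point is that the fitted log-log line predicts log(seed count), and exp() undoes the log to get a prediction on the original count scale.

```python
import math

# Hypothetical log-log fit (made-up coefficients, NOT the lecture's values):
#   log(seed count) = b0 + b1 * log(weight in mg)
b0, b1 = 1.0, 1.5

def predict_count(weight_mg):
    log_pred = b0 + b1 * math.log(weight_mg)  # prediction on the log scale
    return math.exp(log_pred)                 # exp{log(y)} = y undoes the log

pred = predict_count(50)
```

Note that exponentiating gives the predicted *median* rather than the mean on the original scale; for the point prediction asked for on the slide, this simple back-transformation is the standard approach.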