1
Class 11: Thurs., Oct. 14
Finish transformations Example Regression Analysis
Next Tuesday: Review for Midterm (I will take questions and go over the practice midterm if there are no questions)
Next Thursday: Midterm
HW5 due Tuesday. I will e-mail review notes and a practice midterm to you tomorrow.
2
Transformations in JMP
1. Use Tukey’s Bulging rule (see handout) to determine transformations which might help.
2. After Fit Y by X, click the red triangle next to Bivariate Fit and click Fit Special. Experiment with transformations suggested by Tukey’s Bulging rule.
3. Make residual plots of the residuals for the transformed model vs. the original X by clicking the red triangle next to Transformed Fit to … and clicking plot residuals. Choose transformations which make the residual plot have no pattern in the mean of the residuals vs. X.
4. Compare different transformations by looking for the transformation with the smallest root mean square error on the original y-scale. If using a transformation that involves transforming y, look at the root mean square error for the fit measured on the original scale.
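Step 4 can also be sketched outside JMP. The following is a minimal Python sketch, not JMP's own procedure: the data are synthetic (made up purely for illustration), and the helper names `fit_line` and `rmse_original_scale` are hypothetical. Each candidate fits f(y) on g(x), back-transforms the fitted values, and measures root mean square error in the original units of y.

```python
import math

def fit_line(u, v):
    """Least-squares intercept and slope for v = b0 + b1 * u."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    b1 = sxy / sxx
    return v_bar - b1 * u_bar, b1

def rmse_original_scale(x, y, g=lambda t: t, f=lambda t: t, f_inv=lambda t: t):
    """Fit f(y) on g(x), back-transform the fit, and return RMSE in y units."""
    gx = [g(xi) for xi in x]
    fy = [f(yi) for yi in y]
    b0, b1 = fit_line(gx, fy)
    y_hat = [f_inv(b0 + b1 * gi) for gi in gx]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return math.sqrt(sse / (len(x) - 2))  # divide by n - 2 degrees of freedom

# Synthetic illustration only: y is linear in log x plus a small fixed wiggle.
x = [float(i) for i in range(1, 41)]
y = [50 + 8 * math.log(xi) + 0.5 * math.sin(i) for i, xi in enumerate(x)]

candidates = {
    "none": {},
    "log x": {"g": math.log},
    "sqrt x": {"g": math.sqrt},
    "log y": {"f": math.log, "f_inv": math.exp},
}
results = {name: rmse_original_scale(x, y, **kw) for name, kw in candidates.items()}
best = min(results, key=results.get)  # transformation with the smallest RMSE
```

Because every RMSE is computed in the units of y, the candidates are directly comparable even when one of them transforms y.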
4
By looking at the root mean square error on the original y-scale, we see that all of the transformations improve upon the untransformed model and that the transformation to log X is by far the best.
5
The transformation to log X appears to have mostly removed the trend in the mean of the residuals. There is still a problem of nonconstant variance.
6
How do we use the transformation?
Testing for association between Y and X: If the simple linear regression model holds for f(Y) and g(X), then Y and X are associated if and only if the slope in the regression of f(Y) on g(X) does not equal zero. The p-value for the test that the slope is zero is < .0001: strong evidence that per capita GDP and life expectancy are associated.
Prediction and mean response: What would you predict the life expectancy to be for a country with a per capita GDP of $20,000?
7
More Examples of finding E(Y|X) using a transformation Suppose simple linear regression model holds for : Then Suppose simple linear regression model holds for : Then
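As an illustration of the pattern, here are two cases in which only X is transformed. These particular transformations are chosen here for illustration and are not necessarily the ones on the original slide; $\beta_0$ and $\beta_1$ denote the intercept and slope of the simple linear regression on the transformed scale.

```latex
% Simple linear regression model holds for Y and \sqrt{X}:
E(Y \mid X) = \beta_0 + \beta_1 \sqrt{X}

% Simple linear regression model holds for Y and 1/X:
E(Y \mid X) = \beta_0 + \beta_1 \frac{1}{X}
```

In both cases the estimated mean response at a given X is obtained by plugging the transformed value of X into the fitted line.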
8
CIs for Mean Response and Prediction Intervals
Note: To expand the Y-axis or X-axis, right click on the axis, click Axis Settings and change the minimum/maximum. In order to fully see the prediction intervals, I needed to expand the Y-axis.
95% Confidence Interval for Mean Response E(Y|X=20,000) = (76.89, 80.50)
95% Prediction Interval for Y|X=20,000 = (66.42, 91.69)
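For reference, the standard simple linear regression formulas behind these two intervals can be written on the transformed scale. This is a sketch under the assumption that Y is regressed on $g(X) = \log X$; the symbols $b_0$, $b_1$ (fitted intercept and slope), $\bar{g}$ (mean of the $g(x_i)$), and $x_0$ (the X value of interest, here 20,000) are notation introduced here, not taken from the slides.

```latex
% Estimated mean response at x_0
\hat{\mu}_0 = b_0 + b_1\, g(x_0)

% 95% confidence interval for the mean response E(Y \mid X = x_0)
\hat{\mu}_0 \pm t_{0.975,\,n-2}\,\mathrm{RMSE}
  \sqrt{\frac{1}{n} + \frac{\bigl(g(x_0) - \bar{g}\bigr)^2}{\sum_{i=1}^{n}\bigl(g(x_i) - \bar{g}\bigr)^2}}

% 95% prediction interval for a new Y at X = x_0
\hat{\mu}_0 \pm t_{0.975,\,n-2}\,\mathrm{RMSE}
  \sqrt{1 + \frac{1}{n} + \frac{\bigl(g(x_0) - \bar{g}\bigr)^2}{\sum_{i=1}^{n}\bigl(g(x_i) - \bar{g}\bigr)^2}}
```

The extra 1 under the square root accounts for the variability of an individual country around the mean, which is why the prediction interval is much wider than the confidence interval for the mean response.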
9
Another Example of Transformations: Y=Count of tree seeds, X= weight of tree
11
By looking at the root mean square error on the original y-scale, we see that both of the transformations improve upon the untransformed model and that the transformation to log y and log x is by far the best.
12
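Prediction using the log y/log x transformation
What is the predicted seed count of a tree that weighs 50 mg?
Math trick: exp{log(y)} = y (remember that by log we always mean the natural log, ln), i.e., predicted y = exp{b0 + b1 log(50)}, where b0 and b1 are the fitted intercept and slope from the regression of log y on log x.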
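The back-transformation can be sketched in a few lines of Python. This is an illustrative sketch only: the weights and counts below are synthetic values generated from an exact power law, not the tree seed data from the lecture, and `fit_line` and `predict_loglog` are hypothetical helper names.

```python
import math

def fit_line(u, v):
    """Least-squares intercept and slope for v = b0 + b1 * u."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    b1 = sxy / sxx
    return v_bar - b1 * u_bar, b1

def predict_loglog(b0, b1, x_new):
    """Predict y on the original scale from a fit of log y on log x."""
    return math.exp(b0 + b1 * math.log(x_new))

# Synthetic data (assumption for illustration): count = 3000 * weight^(-0.5).
weights = [1.0, 5.0, 20.0, 100.0, 500.0]
counts = [3000 * w ** -0.5 for w in weights]

# Fit on the log-log scale, then undo the log with exp.
b0, b1 = fit_line([math.log(w) for w in weights], [math.log(c) for c in counts])
pred_50 = predict_loglog(b0, b1, 50.0)
```

Since the synthetic data follow the power law exactly, the fitted slope recovers the exponent and the prediction at a weight of 50 matches the generating formula.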
13
Assumptions for linear regression and their importance to inferences
Inference: Point prediction, point estimation. Assumptions that are important: Linearity (specification of mean of Y|X is correct), independence.
Inference: Confidence interval for slope, hypothesis test for slope, confidence interval for mean response. Assumptions that are important: Linearity, constant variance, independence, normality (only if n<30).
Inference: Prediction interval. Assumptions that are important: Linearity, constant variance, independence, normality.
14
Transformations to Remedy Nonconstant Variance and Nonnormality
Nonconstant Variance
When the variance of Y|X increases with X, try transforming Y to log Y or Y to √Y.
When the variance of Y|X decreases with X, try transforming Y to 1/Y or Y to Y^2.
Nonnormality
When the distribution of the residuals is skewed to the right, try transforming Y to log Y.
When the distribution of the residuals is skewed to the left, try transforming Y to Y^2.
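The first remedy can be illustrated with a small simulation. This is a sketch under assumptions chosen here, not an analysis from the lecture: the data are simulated with multiplicative errors (so the spread of Y grows with X), and `fit_line` and `residual_sd_ratio` are hypothetical helpers. The ratio compares the residual standard deviation in the upper half of the X range with that in the lower half; a ratio near 1 suggests roughly constant variance.

```python
import math
import random

def fit_line(u, v):
    """Least-squares intercept and slope for v = b0 + b1 * u."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    b1 = sxy / sxx
    return v_bar - b1 * u_bar, b1

def sd(values):
    """Sample standard deviation."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def residual_sd_ratio(x, y):
    """Residual SD for the upper half of x divided by residual SD for the lower half."""
    b0, b1 = fit_line(x, y)
    pairs = sorted(zip(x, y))  # order observations by x
    resid = [yi - (b0 + b1 * xi) for xi, yi in pairs]
    half = len(resid) // 2
    return sd(resid[half:]) / sd(resid[:half])

# Simulated data (assumption): log Y is linear in X with constant-variance errors,
# so on the original scale the spread of Y increases with X.
random.seed(0)
x = [0.05 * i for i in range(1, 201)]
y = [math.exp(1 + 0.3 * xi + random.gauss(0, 0.3)) for xi in x]

ratio_raw = residual_sd_ratio(x, y)                           # Y on X
ratio_log = residual_sd_ratio(x, [math.log(yi) for yi in y])  # log Y on X
```

Comparing `ratio_raw` with `ratio_log` gives a rough numeric counterpart to looking at the residual plot before and after transforming Y to log Y.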
15
Steps in Regression Analysis
1. Define the question of interest. Review the design of the study to see if it can answer the question of interest. Correct errors in the data.
2. Explore the data using a scatterplot.
3. Fit an initial regression model (possibly using a transformation). Check the assumptions of the regression model.
4. Investigate influential points.
5. Infer answers to the questions of interest using appropriate tools (e.g., confidence intervals, hypothesis tests, prediction intervals).
6. Communicate the results to the intended audience.