Class 11: Thurs., Oct. 14 Finish transformations Example Regression Analysis Next Tuesday: Review for Midterm (I will take questions and go over practice.

Slides:



Advertisements
Similar presentations
Stat 112: Lecture 7 Notes Homework 2: Due next Thursday The Multiple Linear Regression model (Chapter 4.1) Inferences from multiple regression analysis.
Advertisements

Inference for Regression Today we will talk about the conditions necessary to make valid inference with regression We will also discuss the various types.
Inference for Regression
Class 16: Thursday, Nov. 4 Note: I will you some info on the final project this weekend and will discuss in class on Tuesday.
Lecture 21: Review Review a few points about regression that I went over quickly concerning coefficient of determination, regression diagnostics and transformation.
Stat 112: Lecture 15 Notes Finish Chapter 6: –Review on Checking Assumptions (Section ) –Outliers and Influential Points (Section 6.7) Homework.
Class 17: Tuesday, Nov. 9 Another example of interpreting multiple regression coefficients Steps in multiple regression analysis and example analysis Omitted.
Lecture 18: Thurs., Nov. 6th Chapters 8.3.2, 8.4, Outliers and Influential Observations Transformations Interpretation of log transformations (8.4)
Lecture 23: Tues., Dec. 2 Today: Thursday:
Class 15: Tuesday, Nov. 2 Multiple Regression (Chapter 11, Moore and McCabe).
Stat 112: Lecture 10 Notes Fitting Curvilinear Relationships –Polynomial Regression (Ch ) –Transformations (Ch ) Schedule: –Homework.
Class 5: Thurs., Sep. 23 Example of using regression to make predictions and understand the likely errors in the predictions: salaries of teachers and.
Stat 112: Lecture 12 Notes Fitting Curvilinear Relationships (Chapter 5): –Interpreting the slope coefficients in the log X transformation –The log Y –
Lecture 25 Regression diagnostics for the multiple linear regression model Dealing with influential observations for multiple linear regression Interaction.
January 6, morning session 1 Statistics Micro Mini Multiple Regression January 5-9, 2008 Beth Ayers.
Lecture 19: Tues., Nov. 11th R-squared (8.6.1) Review
The Simple Regression Model
Stat 112: Lecture 14 Notes Finish Chapter 6:
Class 6: Tuesday, Sep. 28 Section 2.4. Checking the assumptions of the simple linear regression model: –Residual plots –Normal quantile plots Outliers.
Stat 112 – Notes 3 Homework 1 is due at the beginning of class next Thursday.
Lecture 6 Notes Note: I will homework 2 tonight. It will be due next Thursday. The Multiple Linear Regression model (Chapter 4.1) Inferences from.
Lecture 24: Thurs. Dec. 4 Extra sum of squares F-tests (10.3) R-squared statistic (10.4.1) Residual plots (11.2) Influential observations (11.3,
Lecture 24: Thurs., April 8th
Class 10: Tuesday, Oct. 12 Hurricane data set, review of confidence intervals and hypothesis tests Confidence intervals for mean response Prediction intervals.
Lecture 20 Simple linear regression (18.6, 18.9)
Lecture 16 – Thurs, Oct. 30 Inference for Regression (Sections ): –Hypothesis Tests and Confidence Intervals for Intercept and Slope –Confidence.
Simple Linear Regression Analysis
Statistics 350 Lecture 10. Today Last Day: Start Chapter 3 Today: Section 3.8 Homework #3: Chapter 2 Problems (page 89-99): 13, 16,55, 56 Due: February.
Stat Today: Multiple comparisons, diagnostic checking, an example After these notes, we will have looked at (skip figures 1.2 and 1.3, last.
Stat 112: Lecture 13 Notes Finish Chapter 5: –Review Predictions in Log-Log Transformation. –Polynomials and Transformations in Multiple Regression Start.
Regression Diagnostics Checking Assumptions and Data.
Quantitative Business Analysis for Decision Making Simple Linear Regression.
Lecture 19 Transformations, Predictions after Transformations Other diagnostic tools: Residual plot for nonconstant variance, histogram to check normality.
Stat Notes 5 p-values for one-sided tests Caution about forecasting outside the range of the explanatory variable (Chapter 3.7.2) Fitting a linear.
Stat 112 Notes 11 Today: –Fitting Curvilinear Relationships (Chapter 5) Homework 3 due Friday. I will Homework 4 tonight, but it will not be due.
Stat 112: Lecture 16 Notes Finish Chapter 6: –Influential Points for Multiple Regression (Section 6.7) –Assessing the Independence Assumptions and Remedies.
8/7/2015Slide 1 Simple linear regression is an appropriate model of the relationship between two quantitative variables provided: the data satisfies the.
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Correlation & Regression
Regression and Correlation Methods Judy Zhong Ph.D.
Regression Analysis Regression analysis is a statistical technique that is very useful for exploring the relationships between two or more variables (one.
Inference for regression - Simple linear regression
STA291 Statistical Methods Lecture 27. Inference for Regression.
Review of Statistical Models and Linear Regression Concepts STAT E-150 Statistical Methods.
Stat 112 Notes 15 Today: –Outliers and influential points. Homework 4 due on Thursday.
M25- Growth & Transformations 1  Department of ISM, University of Alabama, Lesson Objectives: Recognize exponential growth or decay. Use log(Y.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Multiple Regression Petter Mostad Review: Simple linear regression We define a model where are independent (normally distributed) with equal.
Stat 112 Notes 16 Today: –Outliers and influential points in multiple regression (Chapter 6.7)
Stat 112: Notes 2 Today’s class: Section 3.3. –Full description of simple linear regression model. –Checking the assumptions of the simple linear regression.
© Buddy Freeman, Independence of error assumption. In many business applications using regression, the independent variable is TIME. When the data.
Copyright ©2011 Brooks/Cole, Cengage Learning Inference about Simple Regression Chapter 14 1.
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 11/6/12 Simple Linear Regression SECTIONS 9.1, 9.3 Inference for slope (9.1)
1 Regression Analysis The contents in this chapter are from Chapters of the textbook. The cntry15.sav data will be used. The data collected 15 countries’
Stat 112 Notes 10 Today: –Fitting Curvilinear Relationships (Chapter 5) Homework 3 due Thursday.
Stat 112 Notes 5 Today: –Chapter 3.7 (Cautions in interpreting regression results) –Normal Quantile Plots –Chapter 3.6 (Fitting a linear time trend to.
Chapter 14: Inference for Regression. A brief review of chapter 4... (Regression Analysis: Exploring Association BetweenVariables )  Bi-variate data.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Simple Linear Regression Analysis Chapter 13.
Stat 112 Notes 6 Today: –Chapters 4.2 (Inferences from a Multiple Regression Analysis)
Dependent (response) Variable Independent (control) Variable Random Error XY x1x1 y1y1 x2x2 y2y2 …… xnxn ynyn Raw data: Assumption:  i ‘s are independent.
Stat 112 Notes 14 Assessing the assumptions of the multiple regression model and remedies when assumptions are not met (Chapter 6).
Maths Study Centre CB Open 11am – 5pm Semester Weekdays
Stat 112 Notes 11 Today: –Transformations for fitting Curvilinear Relationships (Chapter 5)
Chapter 12: Correlation and Linear Regression 1.
F73DA2 INTRODUCTORY DATA ANALYSIS ANALYSIS OF VARIANCE.
Introductory Statistics. Inference for Bivariate Data Intro to Inference in Regression Requirements for Linear Regression Linear Relationship Constant.
(Residuals and
CHAPTER 29: Multiple Regression*
Unit 3 – Linear regression
The greatest blessing in life is
Presentation transcript:

Class 11: Thurs., Oct. 14 Finish transformations Example Regression Analysis Next Tuesday: Review for Midterm (I will take questions and go over practice midterm if there are no questions) Next Thursday: Midterm HW5 due Tuesday. I will review notes and a practice midterm to you tomorrow.

Transformations in JMP 1.Use Tukey’s Bulging rule (see handout) to determine transformations which might help. 2.After Fit Y by X, click red triangle next to Bivariate Fit and click Fit Special. Experiment with transformations suggested by Tukey’s Bulging rule. 3.Make residual plots of the residuals for transformed model vs. the original X by clicking red triangle next to Transformed Fit to … and clicking plot residuals. Choose transformations which make the residual plot have no pattern in the mean of the residuals vs. X. 4.Compare different transformations by looking for transformation with smallest root mean square error on original y-scale. If using a transformation that involves transforming y, look at root mean square error for fit measured on original scale.

` By looking at the root mean square error on the original y-scale, we see that all of the transformations improve upon the untransformed model and that the transformation to log x is by far the best.

The transformation to Log X appears to have mostly removed a trend in the mean of the residuals. This means that. There is still a problem of nonconstant variance.

How do we use the transformation? Testing for association between Y and X: If the simple linear regression model holds for f(Y) and g(X), then Y and X are associated if and only if the slope in the regression of f(Y) and g(X) does not equal zero. P-value for test that slope is zero is <.0001: Strong evidence that per capita GDP and life expectancy are associated. Prediction and mean response: What would you predict the life expectancy to be for a country with a per capita GDP of $20,000?

More Examples of finding E(Y|X) using a transformation Suppose simple linear regression model holds for : Then Suppose simple linear regression model holds for : Then

CIs for Mean Response and Prediction Intervals Note: To expand Y-axis or X-axis, right click on the X-axis, click Axis Settings and change the minimum/maximum. In order to fully see the prediction intervals, I needed to expand the Y-axis. 95% Confidence Interval for Mean Response for E(Y|X=20,000) = (76.89,80.50) 95% Prediction Interval for Y|X=20,000 = (66.42,91.69)

Another Example of Transformations: Y=Count of tree seeds, X= weight of tree

By looking at the root mean square error on the original y-scale, we see that Both of the transformations improve upon the untransformed model and that the transformation to log y and log x is by far the best.

Prediction using the log y/log x transformation What is the predicted seed count of a tree that weights 50 mg? Math trick: exp{log(y)}=y (Remember by log, we always mean the natural log, ln), i.e.,

Assumptions for linear regression and their importance to inferences InferenceAssumptions that are important Point prediction, point estimation Linearity (specification of mean of Y|X is correct), independence Confidence interval for slope, hypothesis test for slope, confidence interval for mean response Linearity, constant variance, independence, normality (only if n<30) Prediction intervalLinearity, constant variance, independence, normality

Transformations to Remedy Constant Variance and Normality Nonconstant Variance When the variance of Y|X increases with X, try transforming Y to log Y or Y to When the variance of Y|X decreases with X, try transforming Y to 1/Y or Y to Y 2 Nonnormality When the distribution of the residuals is skewed to the right, try transforming Y to log Y. When the distribution of the residuals is skewed to the left, try transforming Y to Y 2

Steps in Regression Analysis 1.Define the question of interest. Review the design of the study to see if it can answer question of interest. Correct errors in the data. 2.Explore the data using a scatterplot. 3.Fit an initial regression model (possibly using a transformation). Check the assumptions of the regression model. 4.Investigate influential points. 5.Infer answers to the questions of interest using appropriate tools (e.g., confidence intervals, hypothesis tests, prediction intervals). 6.Communicate the results to the intended audience.