Class 10: Tuesday, Oct. 12. Hurricane data set; review of confidence intervals and hypothesis tests; confidence intervals for mean response; prediction intervals.

Class 10: Tuesday, Oct. 12
Hurricane data set; review of confidence intervals and hypothesis tests
Confidence intervals for mean response
Prediction intervals
Transformations
Upcoming:
Thursday: Finish transformations; example regression analysis
Tuesday: Review for midterm
Thursday: Midterm. Fall Break!

Hurricane Data
Is there a trend in the number of hurricanes in the Atlantic over time (possibly an increase because of global warming)? hurricane.JMP contains data on the number of hurricanes in the Atlantic basin from 1950-1997.

Inferences for Hurricane Data
Residual plots and normal quantile plots indicate that the assumptions of linearity, constant variance and normality in the simple linear regression model are reasonable.
95% confidence interval for the slope (the change in mean number of hurricanes between year t and year t+1): (-0.086, 0.012).
Hypothesis test of the null hypothesis that the slope equals zero: test statistic = -1.52, p-value = 0.13. Since the p-value > 0.05, we fail to reject the null hypothesis. There is no evidence of a trend in hurricanes from 1950-1997.
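For readers working outside JMP, here is a minimal sketch of the same slope inference in Python with statsmodels. The file name hurricane.csv and the column names Year and Hurricanes are assumptions (a hypothetical CSV export of hurricane.JMP).

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("hurricane.csv")        # hypothetical CSV export of hurricane.JMP
X = sm.add_constant(df["Year"])          # design matrix: intercept + year
model = sm.OLS(df["Hurricanes"], X).fit()

# 95% confidence interval for the slope, and the two-sided test of slope = 0
print(model.conf_int(alpha=0.05).loc["Year"])        # slides report (-0.086, 0.012)
print(model.tvalues["Year"], model.pvalues["Year"])  # slides report -1.52 and 0.13
```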

Scale for interpreting p-values. A large p-value is not strong evidence in favor of H0; it only shows that there is not strong evidence against H0.

p-value      Evidence
< .01        very strong evidence against H0
.01 - .05    strong evidence against H0
.05 - .10    weak evidence against H0
> .10        little or no evidence against H0
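To make the scale concrete, here is a small helper that encodes the table above; the function name is ours, not part of the course materials.

```python
def evidence_against_h0(p: float) -> str:
    """Map a p-value to the verbal evidence scale from the slides."""
    if p < 0.01:
        return "very strong evidence against H0"
    elif p < 0.05:
        return "strong evidence against H0"
    elif p < 0.10:
        return "weak evidence against H0"
    else:
        return "little or no evidence against H0"

# Hurricane slope test: p = 0.13 -> little or no evidence against H0
print(evidence_against_h0(0.13))
```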

Inference in Regression
Confidence intervals for slope
Hypothesis test for slope
Confidence intervals for mean response
Prediction intervals

Car Price Example
A used-car dealer wants to understand how odometer reading affects the selling price of used cars. The dealer randomly selects 100 three-year-old Ford Tauruses that were sold at auction during the past month. Each car was in top condition and equipped with automatic transmission, AM/FM cassette tape player and air conditioning. carprices.JMP contains the price and number of miles on the odometer of each car.

The used-car dealer has an opportunity to bid on a lot of cars offered by a rental company. The rental company has 250 Ford Tauruses, all equipped with automatic transmission, air conditioning and AM/FM cassette tape players. All of the cars in this lot have about 40,000 miles on the odometer. The dealer would like an estimate of the average selling price of all cars of this type with 40,000 miles on the odometer, i.e., E(Y|X=40,000). The least squares estimate is ŷ = b0 + b1(40,000), where b0 and b1 are the least squares intercept and slope from the regression of price on odometer reading.

Confidence Interval for Mean Response
Confidence interval for E(Y|X=40,000): a range of plausible values for E(Y|X=40,000) based on the sample.
Approximate 95% confidence interval: ŷ ± 2·SE(ŷ), where SE(ŷ) = RMSE · sqrt(1/n + (X0 − X̄)² / Σ(Xi − X̄)²).
Notes about the formula for the SE: the standard error becomes smaller as the sample size n increases, and the standard error is smaller the closer X0 is to X̄.
In JMP, after Fit Line, click the red triangle next to Linear Fit and click Confid Curves Fit. Use the crosshair tool by clicking Tools, Crosshair to find the exact values of the confidence interval endpoints for a given X0.
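A sketch of computing this interval in Python rather than JMP, assuming carprices.JMP has been exported to a hypothetical carprices.csv with columns Odometer and Price:

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("carprices.csv")            # hypothetical export of carprices.JMP
X = sm.add_constant(df["Odometer"])
model = sm.OLS(df["Price"], X).fit()

# Point estimate and 95% confidence interval for the mean response E(Y|X=40,000)
new_x = sm.add_constant(pd.DataFrame({"Odometer": [40000]}), has_constant="add")
frame = model.get_prediction(new_x).summary_frame(alpha=0.05)
print(frame[["mean", "mean_ci_lower", "mean_ci_upper"]])
```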

A Prediction Problem
The used-car dealer is offered a particular 3-year-old Ford Taurus equipped with automatic transmission, air conditioner and AM/FM cassette tape player and with 40,000 miles on the odometer. The dealer would like to predict the selling price of this particular car. The best prediction based on the least squares estimates is ŷ = b0 + b1(40,000), the same point estimate as for the mean response.

Range of Selling Prices for Particular Car
The dealer is interested in the range of selling prices that this particular car with 40,000 miles on it is likely to have. Under the simple linear regression model, Y|X follows a normal distribution with mean β0 + β1·X and standard deviation σe. A car with 40,000 miles on it will be in the interval β0 + β1(40,000) ± 2σe about 95% of the time. Class 5: we substituted the least squares estimates b0, b1 and the RMSE for β0, β1 and σe, and said a car with 40,000 miles on it will be in the interval ŷ ± 2·RMSE about 95% of the time. This is a good approximation, but it ignores potential error in the least squares estimates.

Prediction Interval
95% prediction interval: an interval that has approximately a 95% chance of containing the value of Y for a particular unit with X=X0, where the particular unit is not in the original sample.
Approximate 95% prediction interval: ŷ ± 2 · RMSE · sqrt(1 + 1/n + (X0 − X̄)² / Σ(Xi − X̄)²), which widens the mean-response interval to account for the variability of an individual Y around its mean.
In JMP, after Fit Line, click the red triangle next to Linear Fit and click Confid Curves Indiv. Use the crosshair tool by clicking Tools, Crosshair to find the exact values of the prediction interval endpoints for a given X0.
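Continuing the same hypothetical carprices.csv setup, the statsmodels summary frame also reports the prediction interval; note how the obs_ci columns are wider than the mean_ci columns from the previous sketch.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("carprices.csv")            # hypothetical export, as above
model = sm.OLS(df["Price"], sm.add_constant(df["Odometer"])).fit()

new_x = sm.add_constant(pd.DataFrame({"Odometer": [40000]}), has_constant="add")
frame = model.get_prediction(new_x).summary_frame(alpha=0.05)

# mean_ci_* columns: confidence interval for the mean response;
# obs_ci_* columns: the wider prediction interval for a single new car
print(frame[["obs_ci_lower", "obs_ci_upper"]])
```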

A Violation of Linearity
Y = life expectancy in 1999; X = per capita GDP (in US dollars) in 1999. Data in gdplife.JMP.
The linearity assumption of simple linear regression is clearly violated: the increase in mean life expectancy for each additional dollar of GDP is smaller for large GDPs than for small GDPs. There are decreasing returns to increases in GDP.

Transformations
Violation of linearity: E(Y|X) is not a straight line. Transformations: perhaps E(f(Y)|g(X)) is a straight line, where f(Y) and g(X) are transformations of Y and X, and a simple linear regression model holds for the response variable f(Y) and explanatory variable g(X).

The mean of Life Expectancy given log(per capita GDP) appears to be approximately a straight line.

How do we use the transformation?
Testing for association between Y and X: if the simple linear regression model holds for f(Y) and g(X), then Y and X are associated if and only if the slope in the regression of f(Y) on g(X) does not equal zero. The p-value for the test that the slope is zero is <.0001: strong evidence that per capita GDP and life expectancy are associated.
Prediction and mean response: what would you predict the life expectancy to be for a country with a per capita GDP of $20,000?
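A sketch of both steps (the association test and the $20,000 prediction) using the log transformation in Python; the file name gdplife.csv and the column names GDP and LifeExpectancy are assumptions about how the JMP data might be exported.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("gdplife.csv")              # hypothetical export of gdplife.JMP
X = sm.add_constant(np.log(df["GDP"]))       # regress Y on log(X)
model = sm.OLS(df["LifeExpectancy"], X).fit()

print(model.pvalues["GDP"])                  # slope p-value; slides report <.0001

# Predicted life expectancy for a country with per capita GDP of $20,000:
# plug log(20,000) into the fitted line for the transformed X
x0 = sm.add_constant(pd.DataFrame({"GDP": [np.log(20000)]}), has_constant="add")
print(model.predict(x0))
```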

How do we choose a transformation?
Tukey's Bulging Rule (see handout). Match the curvature in the data to the shape of one of the curves drawn in the four quadrants of the figure in the handout, then use the associated transformations, selecting one for X, for Y, or for both.

Transformations in JMP
Use Tukey's Bulging Rule (see handout) to determine transformations which might help.
After Fit Y by X, click the red triangle next to Bivariate Fit and click Fit Special. Experiment with the transformations suggested by Tukey's Bulging Rule.
Make residual plots of the residuals for the transformed model vs. the original X by clicking the red triangle next to Transformed Fit to … and clicking Plot Residuals. Choose transformations which make the residual plot show no pattern in the mean of the residuals vs. X.
Compare different transformations by looking for the one with the smallest root mean square error on the original y-scale; if a transformation involves transforming Y, look at the root mean square error for the fit measured on the original scale. A code sketch of this comparison appears below.
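A rough sketch of that comparison outside JMP; the candidate set of transformations here is illustrative, and the file and column names are again hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("gdplife.csv")              # hypothetical export of gdplife.JMP
y = df["LifeExpectancy"]

# Candidate transformations of X suggested by Tukey's Bulging Rule
candidates = {
    "untransformed": df["GDP"],
    "log x":  np.log(df["GDP"]),
    "sqrt x": np.sqrt(df["GDP"]),
    "1/x":    1.0 / df["GDP"],
}
for name, x in candidates.items():
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    # y itself is untransformed here, so the residuals are already on the
    # original y-scale; with a transformed y you would back-transform
    # predictions before computing this RMSE
    rmse = np.sqrt(np.mean(fit.resid ** 2))
    print(f"{name}: RMSE = {rmse:.3f}")
```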

By looking at the root mean square error on the original y-scale, we see that all of the transformations improve upon the untransformed model and that the transformation to log X is by far the best.

The transformation to log X appears to have mostly removed the trend in the mean of the residuals, meaning the linearity assumption is now approximately reasonable for the transformed model. There is still a problem of nonconstant variance.