Lecture 21: Review
Review a few points about regression that were covered quickly: the coefficient of determination, regression diagnostics, and transformations. Review an ANOVA problem. Review a regression problem.

Administrative Info for Midterm II
Time and location: Wednesday, April 2, 6-8 p.m., Steinberg Hall-Dietrich Hall 351. Closed book; one 8.5 x 11 double-sided note sheet allowed. Bring a calculator. All necessary tables will be provided, but nothing additional (e.g., Tukey's bulging rule will not be provided). Office hours: today after class (12:10-2:30), Wednesday 9-11:30.

Material Covered
The focus is on Chapter 15 and Chapter 18 (we covered everything except 15.6 and 18.8); other chapters are not covered. Be prepared for questions that draw on material from the first midterm in the context of Chapters 15 and 18.

Coefficient of Determination (R²)
R² measures the strength of the linear relationship between Y and X. Formulas for R²:
–Square of the correlation between X and Y (thus if Cor(X,Y) = -0.5, then R² = 0.25)
–R² = 1 - (SSE/SSTOT) = SSR/SSTOT, where SSR is called the sum of squares due to the model in the JMP output.
Information about SSE, SSR, and SSTOT can be obtained from the Analysis of Variance section of the regression output in JMP.
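
To make the equivalence of the two formulas concrete, here is a minimal Python sketch (my own illustration, not part of the lecture; the data values and the use of NumPy are assumptions) that computes R² both from the sums of squares and from the squared correlation:

    import numpy as np

    # Hypothetical data; any paired x, y values would do.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

    # Least-squares slope and intercept, then fitted values.
    b1, b0 = np.polyfit(x, y, 1)
    y_hat = b0 + b1 * x

    sse = np.sum((y - y_hat) ** 2)         # error sum of squares
    sstot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)  # sum of squares due to the model

    r2_from_anova = 1 - sse / sstot            # = ssr / sstot
    r2_from_corr = np.corrcoef(x, y)[0, 1] ** 2

    print(r2_from_anova, r2_from_corr)         # the two values agree

For a least-squares fit, SSE + SSR = SSTOT, so both computations print the same value.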

JMP output for Example 18.2

Impact of Large Sample Sizes
R² will, on average, be about the same no matter what the sample size is. However, if there is a linear relationship between X and Y, the p-value for the test of whether the slope is zero will tend to become smaller as the sample size increases. Even if the linear relationship between Y and X is weak (but the slope is not zero), the test will have a small p-value for a large sample size.
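
A small simulation illustrates this point. The sketch below is my own (not from the lecture) and assumes NumPy and SciPy are available; it uses a weak but nonzero true slope and shows that R² stays roughly constant while the p-value for the slope shrinks as n grows:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def fit_summary(n, slope=0.1, noise_sd=1.0):
        """Simulate y = slope*x + noise and return (R^2, p-value for slope = 0)."""
        x = rng.normal(size=n)
        y = slope * x + rng.normal(scale=noise_sd, size=n)
        result = stats.linregress(x, y)
        return result.rvalue ** 2, result.pvalue

    for n in (30, 3000):
        r2, p = fit_summary(n)
        print(f"n = {n:5d}:  R^2 = {r2:.3f},  p-value = {p:.2g}")

With n = 30 the slope is typically not significant; with n = 3000 the p-value is tiny even though R² is still small.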

Prediction Intervals vs. Confidence Intervals
Prediction interval: used when we want to predict one particular value of y given a specific value of x, e.g., a used-car dealer wants to predict the price of a particular Ford Taurus given that it has 40,000 miles.
Confidence interval for the expected value of y: used when we want to estimate the mean of y given x, e.g., a used-car dealer wants to bid on a lot of 200 Ford Tauruses with 40,000 miles and wants to know the mean price of a Ford Taurus given that it has 40,000 miles.

Prediction Intervals vs. Confidence Intervals, Cont.
As the sample size becomes large, the width of the confidence interval tends to zero, but the width of the prediction interval tends to a nonzero limit, because the uncertainty about a single new observation never goes away.
–The prediction interval accounts for both the uncertainty in the estimated regression line and the variability of an individual observation around the line.
–The confidence interval accounts only for the uncertainty in the estimated regression line (the mean of y at the given x).
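
For reference, the standard textbook forms of the two intervals at a given value $x_g$ are as follows (my reconstruction; the notation may differ slightly from the course text). Here $s_\varepsilon$ is the standard error of estimate, $s_x^2$ the sample variance of x, and $t_{\alpha/2}$ has $n-2$ degrees of freedom:

Prediction interval: $\hat{y} \pm t_{\alpha/2,\,n-2}\, s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)s_x^2}}$

Confidence interval for E(y): $\hat{y} \pm t_{\alpha/2,\,n-2}\, s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)s_x^2}}$

As n grows, the terms $\frac{1}{n}$ and $\frac{(x_g - \bar{x})^2}{(n-1)s_x^2}$ shrink to zero, so the confidence interval width tends to zero while the prediction interval half-width tends to roughly $z_{\alpha/2}\,\sigma_\varepsilon$.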

Regression Assumptions and Diagnostics
–Linearity: residual plot vs. X
–Constant variance of errors (homoskedasticity): residual plot vs. X
–Normality: histogram of residuals
–Independence: if the data form a time series, residual plot vs. time
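
A minimal sketch of the first three diagnostics (my own illustration, not part of the lecture; it assumes matplotlib and reuses the x, y, and y_hat arrays from the R² sketch above):

    import numpy as np
    import matplotlib.pyplot as plt

    # Residuals from the already-fitted least-squares line.
    residuals = y - y_hat

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))

    # Residual plot vs. X: look for curvature (nonlinearity) or a funnel
    # shape (nonconstant variance).
    axes[0].scatter(x, residuals)
    axes[0].axhline(0, linestyle="--")
    axes[0].set_xlabel("x")
    axes[0].set_ylabel("residual")

    # Histogram of residuals: check for a roughly bell shape (normality).
    axes[1].hist(residuals, bins=15)
    axes[1].set_xlabel("residual")

    plt.tight_layout()
    plt.show()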

Influential Points and Outliers
In addition to the previous diagnostics, you should check residual plots for influential points and outliers (in the y direction, the x direction, and the direction of the scatterplot). Influential point: an outlier in the x direction (high leverage) that does not fall into the same pattern of relationship between y and x as the other points. Investigate whether outliers and influential points are properly recorded and are representative of the population we are interested in.
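
Leverage and standardized residuals give a numeric check for the same points. The sketch below is my own (the 2(k+1)/n and |2| cutoffs are common rules of thumb, not from the lecture) and again reuses x, y, and y_hat from the R² sketch:

    import numpy as np

    n = len(x)
    residuals = y - y_hat

    # Leverage of each observation in simple linear regression.
    h = 1.0 / n + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)

    # Standard error of estimate and standardized residuals.
    s_e = np.sqrt(np.sum(residuals ** 2) / (n - 2))
    std_resid = residuals / (s_e * np.sqrt(1 - h))

    # Rules of thumb: leverage above 2*(k+1)/n (k = 1 predictor here) flags
    # a potential high-leverage point; |standardized residual| > 2 flags a
    # potential outlier in the y direction.
    high_leverage = h > 2 * 2 / n
    outlier_in_y = np.abs(std_resid) > 2
    print(np.where(high_leverage)[0], np.where(outlier_in_y)[0])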

Diagnosing Nonlinearity
Check the residual plot vs. x to see if there is a pattern.

Transformations
If there is nonlinearity, one possible way to correct for it is to apply a transformation to y or x. Tukey's bulging rule (see handout): match the curvature in the data to the shape of one of the curves drawn in the four quadrants, then apply one of the transformations listed.
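
A minimal sketch of trying such a transformation (my own illustration; the data are made up to have curvature matching the top-left quadrant of Tukey's diagram, and NumPy is assumed):

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical data whose relationship with x is curved (roughly logarithmic).
    x = np.linspace(1, 50, 60)
    y = 3 + 2 * np.log(x) + rng.normal(scale=0.3, size=x.size)

    def r_squared(x_vals, y_vals):
        """R^2 for a straight-line least-squares fit of y_vals on x_vals."""
        return np.corrcoef(x_vals, y_vals)[0, 1] ** 2

    # Fit y on x, then y on log(x); the transformation should straighten the
    # relationship and raise R^2 if the curvature matches the bulging rule.
    print("R^2, y on x:     ", round(r_squared(x, y), 3))
    print("R^2, y on log(x):", round(r_squared(np.log(x), y), 3))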

Tukey's Bulging Rule
The curvature appears to match the top-left quadrant. Try transforming to log X.