Chapter 3.2 Regression Wisdom.


Residuals

To determine whether a linear model is appropriate, we examine the residuals. It is a good idea to look at both a histogram of the residuals and a scatterplot of the residuals versus the explanatory variable. If the histogram of the residuals has multiple modes, that may indicate that there are subgroups within the set of data. If a linear model is appropriate, the histogram should look approximately normal and the scatterplot of residuals should show no distinct pattern (random scatter).

If we see a curved relationship in the residual plot, the linear model is not appropriate.
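As a rough illustration of what a curved residual pattern looks like numerically, the sketch below fits a least-squares line to made-up data that follow y = x^2 and inspects the residuals. The data and the helper name residuals_from_line are assumptions for illustration, not part of the slides.

```python
import numpy as np

def residuals_from_line(x, y):
    """Fit a least-squares line and return (slope, intercept, residuals)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # coefficients, highest power first
    residuals = y - (slope * x + intercept)  # observed minus predicted
    return slope, intercept, residuals

# Made-up curved data (assumption): y = x^2, so a line is the wrong model.
x = np.arange(1, 8)
y = x ** 2
slope, intercept, res = residuals_from_line(x, y)

# The residuals are positive at both ends and negative in the middle:
# a U-shaped pattern rather than random scatter.
print(np.round(res, 6))
```

Least-squares residuals always sum to zero, so the sum alone says nothing about fit; it is the pattern across x that signals the problem.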

Another type of residual plot shows the residuals versus the explanatory variable. Note that the scale on the horizontal axis is exactly the same as that of the original scatterplot of x and y.

When using a model to predict values of the response variable, two types of predictions can be made: interpolation or extrapolation. Making predictions within the given domain of x-values is called interpolation. Making predictions outside the given domain of x-values is called extrapolation. Only interpolation is reliable. Extrapolation is unreliable because the pattern seen in the data may not continue beyond the observed x-values. In the above example, we would be extrapolating if we used the model to predict the fuel efficiency of cars weighing more than 4.5 thousand pounds or less than 1.875 thousand pounds.
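The distinction comes down to a range check on the explanatory variable. A minimal sketch, using the weight limits quoted above (1.875 and 4.5 thousand pounds); the function name prediction_type and the three test weights are hypothetical:

```python
def prediction_type(x_new, x_min, x_max):
    """Classify a prediction by whether x_new lies inside the observed x-range."""
    if x_min <= x_new <= x_max:
        return "interpolation"
    return "extrapolation"

# Observed weight range from the fuel-efficiency example, in thousands of pounds.
X_MIN, X_MAX = 1.875, 4.5

# Hypothetical car weights to classify.
for weight in (3.0, 5.2, 1.5):
    print(weight, prediction_type(weight, X_MIN, X_MAX))
```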

Even if a linear model is appropriate, remember that association does not imply causation. There may be lurking variables that account for the association. Also, if we’re plotting summary values (like the mean) instead of individual values, the association may appear stronger than it really is.
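The effect of plotting summary values can be seen in a small constructed example. The data below are invented for illustration: at each of five x-values there are three individual y-values spread around the line y = x. Averaging removes that spread, so the correlation of the means is much higher than the correlation of the individual points.

```python
import numpy as np

# Invented data (assumption): three y-values at each x, spread around y = x.
x_levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
offsets = np.array([-3.0, 0.0, 3.0])

x_all = np.repeat(x_levels, len(offsets))        # 1,1,1,2,2,2,...
y_all = x_all + np.tile(offsets, len(x_levels))  # x-3, x, x+3 at each x

# Summary values: the mean of y at each x-level.
y_means = np.array([y_all[x_all == v].mean() for v in x_levels])

r_individual = np.corrcoef(x_all, y_all)[0, 1]
r_means = np.corrcoef(x_levels, y_means)[0, 1]

print(round(r_individual, 3))  # 0.5
print(round(r_means, 3))       # 1.0
```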

Outliers, Leverage, and Influential Points

When examining a scatterplot, it is important to look for any unusual points. There are three ways a point can be considered unusual. A point is unusual if it stands away from the others. This is called an outlier. A point is unusual if its x-value is far from the mean of all the x-values. This is called a high-leverage point. A point is unusual if, when omitted, it significantly changes the linear model. This is called an influential point.
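Leverage can be put on a numerical scale. The slides describe it only in words; the sketch below uses the standard leverage formula for simple linear regression, h_i = 1/n + (x_i - mean(x))^2 / sum((x_j - mean(x))^2), which depends on the x-values alone. The x-values and the function name leverage are made up for illustration.

```python
import numpy as np

def leverage(x):
    """Leverage of each point in a simple linear regression (x-values only)."""
    x = np.asarray(x, dtype=float)
    dev_sq = (x - x.mean()) ** 2           # squared distance from the mean of x
    return 1.0 / len(x) + dev_sq / dev_sq.sum()

# Made-up x-values (assumption): the last one sits far from the rest.
x = np.array([1, 2, 3, 4, 5, 15])
h = leverage(x)

print(np.round(h, 3))
print("highest leverage at index", int(np.argmax(h)))
```

Leverage says nothing about the y-value, so a high-leverage point is not automatically influential; that depends on whether its y-value agrees with the trend of the other points.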

For each scatterplot, add a point with the given characteristics (if possible):

Outlier, Influential, High Leverage
Outlier, Influential, Low Leverage
Outlier, Not Influential, High Leverage
Outlier, Not Influential, Low Leverage: NP
Not an Outlier, Influential, High Leverage: NP
Not an Outlier, Influential, Low Leverage
Not an Outlier, Not Influential, High Leverage: NP
Not an Outlier, Not Influential, Low Leverage
Outlier, No Leverage

If there is an unusual point in your scatterplot, it must be mentioned. It is then wise to examine two models, one with the point included, and one with the point omitted.
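Comparing the two models can be sketched as follows. The data are invented: five points lie exactly on y = 2x, and a sixth point sits far to the right and well below that line. The helper fit_line and the index of the unusual point are assumptions for illustration.

```python
import numpy as np

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Invented data (assumption): the last point is the unusual one.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 15.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 5.0])

unusual = 5
keep = np.arange(len(x)) != unusual        # mask that drops the unusual point

slope_all, intercept_all = fit_line(x, y)
slope_omit, intercept_omit = fit_line(x[keep], y[keep])

print("with point:    slope =", round(slope_all, 3))   # about 0.077
print("without point: slope =", round(slope_omit, 3))  # 2.0
```

Here the slope changes drastically when the point is omitted, so under the definition above it would be called influential. If the two fits were nearly the same, the point could still be reported as unusual without being influential.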