Residuals, Influential Points, and Outliers

Objective: To develop an understanding of the impact of unusual features in the relationship between two quantitative variables.

Residual = Observed y - Predicted y, for a given value of x. Residuals are used to find the best LSRL (least-squares regression line, the line of best fit): it is the line that makes the sum of the squared residuals as small as possible.
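The definition above can be sketched in a few lines of Python. This is only an illustration: the function names fit_lsrl and residuals and the five data points are made up for the example and do not come from the slides.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def residuals(xs, ys):
    """Residual = observed y - predicted y, one per data point."""
    a, b = fit_lsrl(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_lsrl(xs, ys)
res = residuals(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")
print([round(r, 2) for r in res])
```

For a least-squares line with an intercept, the residuals always add up to zero (apart from rounding), which is a quick check that the line was fitted correctly.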

Residual Plot: We use a residual plot to decide whether or not the original data actually follow a linear pattern. Random scatter around zero, with no visible pattern, suggests that a linear model is appropriate.

Bad Residual Plots: a curved pattern, or increasing or decreasing spread in the scatter.
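To see what a curved residual pattern looks like in numbers, here is a small sketch that fits a straight line to data that are actually quadratic. The data (y = x squared for x = 0 to 6) and the helper name fit_lsrl are assumptions made for the example.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical curved data: y = x^2, which a straight line cannot follow.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

a, b = fit_lsrl(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]
print(res)  # [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals are positive at both ends and negative in the middle. Plotted against x they form a U shape rather than random scatter, which is the signal that a straight line is the wrong model for these data.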

Properties of Residual Plots: Always put the residuals on the y-axis. For the x-axis you may use either the x-values or the predicted y-values (Minitab uses the x-values by default). In either case the graph shows the same pattern (mirrored left to right if the slope is negative). On your graphing calculator, RESID appears in the LIST menu after you have run LinReg(a + bx). Be sure to re-run LinReg(a + bx) for each new set of data.

Additional Items That Can Influence the LSRL: outliers, influential points, and leverage.

Outliers: An outlier creates a large residual, and a large residual can change the LSRL. Notice, however, that the regression line does not change drastically because of an outlier in the y-direction.

Leverage: A point has high leverage when its x-value is far from the mean of the x-values.
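The slide describes leverage only informally. One common way to put a number on it in simple linear regression is the hat value h = 1/n + (x - x-bar)^2 / Sxx, which grows as x moves away from the mean. The sketch below uses that formula; the function name leverages and the x-values are assumptions made for the example.

```python
def leverages(xs):
    """Hat values for simple linear regression: 1/n + (x - x_bar)^2 / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x - x_bar) ** 2 / sxx for x in xs]


# Hypothetical x-values; the last one sits far from the others.
xs = [1, 2, 3, 4, 5, 15]
h = leverages(xs)

# Index of the point with the most leverage.
most_extreme = max(range(len(xs)), key=lambda i: h[i])

for x, hv in zip(xs, h):
    print(f"x = {x:>2}  leverage = {hv:.3f}")
```

Only the x-values are needed: leverage describes a point's potential to pull the line, regardless of its y-value. Here x = 15 has by far the largest value, and the point at the mean (x = 5) has the smallest.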

Influential Point: An observation is said to be influential if removing it from the data set would significantly change the LSRL. Most texts treat only outliers with leverage in the x-direction as influential points (in the y-direction they are simply called outliers).

Note: Though it is tempting, we cannot simply remove outliers or influential points from our data set. The best thing to do is to create an LSRL for the data with the point and then without it. Once you compare these two lines of fit, you will often learn a great deal about the data that you are trying to model.
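The with-and-without comparison described above might be sketched like this. The two small data sets are invented for illustration: one adds an outlier in the y-direction at the mean of x, the other adds a point with high leverage. The names fit_lsrl and with_and_without are hypothetical.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def with_and_without(xs, ys, i):
    """Fit the line using every point, then again leaving out point i."""
    full = fit_lsrl(xs, ys)
    reduced = fit_lsrl(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return full, reduced


# Hypothetical data: five points on y = 2x plus one unusual point (index 5).
y_outlier = ([1, 2, 3, 4, 5, 3], [2, 4, 6, 8, 10, 12])    # unusual y, x at the mean
high_lever = ([1, 2, 3, 4, 5, 15], [2, 4, 6, 8, 10, 5])   # unusual x, far from the mean

(a1, b1), (a1_wo, b1_wo) = with_and_without(*y_outlier, 5)
(a2, b2), (a2_wo, b2_wo) = with_and_without(*high_lever, 5)

print(f"y-outlier:     slope {b1:.2f} with, {b1_wo:.2f} without")
print(f"high leverage: slope {b2:.2f} with, {b2_wo:.2f} without")
```

In this made-up example the y-direction outlier shifts the intercept but leaves the slope at 2, while the high-leverage point drags the slope from 2 down to roughly 0.08. Comparing the two fits, rather than silently deleting the point, is what shows how much a single observation drives the model.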

2000 Presidential Election

Resource: http://arts.bev.net/roperldavid/politics/fl2000.htm