Residuals, Residual Plots, & Influential points. Residuals (error) - The vertical deviation between the observations & the LSRL always zerothe sum of.

Slides:



Advertisements
Similar presentations
Linear Regression (LSRL)
Advertisements

2nd Day: Bear Example Length (in) Weight (lb)
Warm up Use calculator to find r,, a, b. Chapter 8 LSRL-Least Squares Regression Line.
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Summarizing Bivariate Data Introduction to Linear Regression.
CHAPTER 3 Describing Relationships
Regression, Residuals, and Coefficient of Determination Section 3.2.
Ch 3 – Examining Relationships YMS – 3.1
Least-Squares Regression Section 3.3. Why Create a Model? There are two reasons to create a mathematical model for a set of bivariate data. To predict.
Chapter 3 concepts/objectives Define and describe density curves Measure position using percentiles Measure position using z-scores Describe Normal distributions.
AP STATISTICS LESSON 3 – 3 LEAST – SQUARES REGRESSION.
AP Statistics Chapter 8 & 9 Day 3
Chapter 3 Section 3.1 Examining Relationships. Continue to ask the preliminary questions familiar from Chapter 1 and 2 What individuals do the data describe?
Chapter 5 Residuals, Residual Plots, & Influential points.
Chapter 5 Residuals, Residual Plots, Coefficient of determination, & Influential points.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 3 Describing Relationships 3.2 Least-Squares.
Examining Bivariate Data Unit 3 – Statistics. Some Vocabulary Response aka Dependent Variable –Measures an outcome of a study Explanatory aka Independent.
CHAPTER 5 Regression BPS - 5TH ED.CHAPTER 5 1. PREDICTION VIA REGRESSION LINE NUMBER OF NEW BIRDS AND PERCENT RETURNING BPS - 5TH ED.CHAPTER 5 2.
Chapter 3-Examining Relationships Scatterplots and Correlation Least-squares Regression.
LEAST-SQUARES REGRESSION 3.2 Least Squares Regression Line and Residuals.
CHAPTER 3 Describing Relationships
LEAST-SQUARES REGRESSION 3.2 Role of s and r 2 in Regression.
Unit 4 Lesson 3 (5.3) Summarizing Bivariate Data 5.3: LSRL.
Chapter 7 Linear Regression. Bivariate data x – variable: is the independent or explanatory variable y- variable: is the dependent or response variable.
Chapter 5 Lesson 5.2 Summarizing Bivariate Data 5.2: LSRL.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 3: Describing Relationships Section 3.2 Least-Squares Regression.
Unit 3 Correlation. Homework Assignment For the A: 1, 5, 7,11, 13, , 21, , 35, 37, 39, 41, 43, 45, 47 – 51, 55, 58, 59, 61, 63, 65, 69,
Describing Relationships. Least-Squares Regression  A method for finding a line that summarizes the relationship between two variables Only in a specific.
Describing Bivariate Relationships. Bivariate Relationships When exploring/describing a bivariate (x,y) relationship: Determine the Explanatory and Response.
1. Analyzing patterns in scatterplots 2. Correlation and linearity 3. Least-squares regression line 4. Residual plots, outliers, and influential points.
Chapter 3 LSRL. Bivariate data x – variable: is the independent or explanatory variable y- variable: is the dependent or response variable Use x to predict.
Chapter 5 LSRL. Bivariate data x – variable: is the independent or explanatory variable y- variable: is the dependent or response variable Use x to predict.
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Unit 4 LSRL.
LSRL.
Least Squares Regression Line.
Statistics 101 Chapter 3 Section 3.
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
A little VOCAB.
Chapter 5 LSRL.
LSRL Least Squares Regression Line
The scatterplot shows the advertised prices (in thousands of dollars) plotted against ages (in years) for a random sample of Plymouth Voyagers on several.
Chapter 3.2 LSRL.
Regression and Residual Plots
Residuals, Residual Plots, and Influential points
AP Stats: 3.3 Least-Squares Regression Line
Least-Squares Regression
Least Squares Regression Line LSRL Chapter 7-continued
EQ: How well does the line fit the data?
residual = observed y – predicted y residual = y - ŷ
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 2 Looking at Data— Relationships
Residuals, Residual Plots, & Influential points
Influential points.
Chapter 5 LSRL.
Chapter 5 LSRL.
Chapter 5 LSRL.
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
A medical researcher wishes to determine how the dosage (in mg) of a drug affects the heart rate of the patient. Find the correlation coefficient & interpret.
Homework: PG. 204 #30, 31 pg. 212 #35,36 30.) a. Reading scores are predicted to increase by for each one-point increase in IQ. For x=90: 45.98;
CHAPTER 3 Describing Relationships
Presentation transcript:

Residuals, Residual Plots, & Influential points

Residuals (error) - The vertical deviation between the observations & the LSRL always zerothe sum of the residuals is always zero error = observed - expected

Residual plot A scatterplot of the (x, residual) pairs. Residuals can be graphed against other statistics besides x linear associationPurpose is to tell if a linear association exist between the x & y variables no pattern linearIf no pattern exists between the points in the residual plot, then the association is linear.

Linear Not linear

AgeRange of Motion One measure of the success of knee surgery is post-surgical range of motion for the knee joint following a knee dislocation. Is there a linear relationship between age & range of motion? Sketch a residual plot. Since there is no pattern in the residual plot, there is a linear relationship between age and range of motion x Residuals

AgeRange of Motion Plot the residuals against the y- hats. How does this residual plot compare to the previous one? Residuals

Residual plots are the same no matter if plotted against x or y-hat. x Residuals

Coefficient of determination- r 2 variationygives the proportion of variation in y that can be attributed to an approximate linear relationship between x & y remains the same no matter which variable is labeled x

AgeRange of Motion Let’s examine r 2. Suppose you were going to predict a future y but you didn’t know the x-value. Your best guess would be the overall mean of the existing y’s. SS y = Sum of the squared residuals (errors) using the mean of y.

AgeRange of Motion Now suppose you were going to predict a future y but you DO know the x-value. Your best guess would be the point on the LSRL for that x-value (y-hat). Sum of the squared residuals (errors) using the LSRL. SS y =

AgeRange of Motion By what percent did the sum of the squared error go down when you went from just an “overall mean” model to the “regression on x” model? SS y = SS y = This is r 2 – the amount of the variation in the y-values that is explained by the x-values.

AgeRange of Motion How well does age predict the range of motion after knee surgery? Approximately 30.6% of the variation in range of motion after knee surgery can be explained by the linear regression of age and range of motion.

Interpretation of r 2 r 2 % y xy Approximately r 2 % of the variation in y can be explained by the LSRL of x & y.

Computer-generated regression analysis of knee surgery data: PredictorCoefStdevTP Constant Age s = 10.42R-sq = 30.6%R-sq(adj) = 23.7% What is the equation of the LSRL? Find the slope & y-intercept. NEVER use adjusted r 2 ! before Be sure to convert r 2 to decimal before taking the square root! What are the correlation coefficient and the coefficient of determination?

Outlier – largeIn a regression setting, an outlier is a data point with a large residual

Influential point- A point that influences where the LSRL is located If removed, it will significantly change the slope of the LSRL

RacketResonance Acceleration (Hz) (m/sec/sec) One factor in the development of tennis elbow is the impact-induced vibration of the racket and arm at ball contact. Sketch a scatterplot of these data. Calculate the LSRL & correlation coefficient. Does there appear to be an influential point? If so, remove it and then calculate the new LSRL & correlation coefficient.

(189,30) could be influential. Remove & recalculate LSRL

(189,30) was influential since it moved the LSRL

Which of these measures are resistant? LSRL Correlation coefficient Coefficient of determination NONE NONE – all are affected by outliers