Creating a Residual Plot and Investigating the Correlation Coefficient
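Plot the data (X, Y). The regression line is y = x + 1. A residual is the vertical distance from the regression line to a point off the line. A point 1 unit above the line has a residual of +1; a point 1 unit below the line has a residual of -1.
[Figure: scatter plot of the data with the regression line y = x + 1, with a +1 residual and a -1 residual marked]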
Residuals
[Table: columns X and Y (residual)]
Each x-value paired with its residual forms an ordered pair. I will show you another way to find these values later in the notes.
Residual Points
X    Y (residual)
1    -1
2    +1
3     0
4     0
5    +1
5    -1
These points can be plotted in Quadrants I and IV of a coordinate plane (points with a residual of 0 fall on the x-axis). The resulting graph is called a residual plot.
A residual plot can help us determine whether the data are linear.
Randomly scattered points mean the data are LINEAR.
A “U”-shaped pattern means the data are NOT LINEAR.
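To see where a "U" shape comes from, here is a small sketch in Python. The data are made up for illustration (they follow y = x squared, so they are not linear), and the helper name fit_line is hypothetical. The sketch fits a least-squares line and then lists the residuals.

```python
# Made-up, deliberately non-linear data: y = x^2
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]

def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = fit_line(xs, ys)

# residual = observed y - predicted y
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals start positive, dip negative in the middle, and come back up to positive. Plotted against x, that is the "U" shape that signals a line is not a good model.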
Vocabulary
Residual - A residual is the difference between the observed y-value (from the scatter plot) and the predicted y-value (from the regression line): residual = observed y - predicted y. It is the vertical distance from the actual point to the point on the regression line, positive when the point is above the line and negative when it is below. You can think of a residual as how far a data point "falls" from the regression line (sometimes referred to as "observed error").
Residual Plot - A residual plot is a scatter plot that shows the residuals on the vertical axis and the independent variable on the horizontal axis. The plot will help you decide whether a linear model is appropriate for your data.
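How to create a residual plot without a calculator
1) For each x in the original data, find the predicted y-value using the regression equation y = x + 1.
2) Subtract the predicted y-value from the observed y-value: Y (observed) - Y (predicted) = residual.
3) Pair each x with its residual to form the residual ordered pairs.
[Table: original data ordered pairs and the subtraction for each point]
Residual ordered pairs: (1, -1) (2, 1) (3, 0) (4, 0) (5, -1) (5, 1)
Plot these points on the residual plot.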
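The same subtraction can be sketched in a few lines of Python. The regression line y = x + 1 and the residual ordered pairs come from the notes. The observed y-values below are an assumption: they are back-computed from those residuals (observed = predicted + residual), and the function name predict is only for illustration.

```python
# Regression line from the notes: y = x + 1
def predict(x):
    return x + 1

# ASSUMPTION: observed y-values are back-computed from the residuals
# listed in the notes (observed = predicted + residual).
observed = [(1, 1), (2, 4), (3, 4), (4, 5), (5, 5), (5, 7)]

# residual = observed y - predicted y, paired with the original x
residual_points = [(x, y - predict(x)) for x, y in observed]
print(residual_points)
# [(1, -1), (2, 1), (3, 0), (4, 0), (5, -1), (5, 1)]
```

Each pair keeps the original x and replaces y with the residual, which is exactly what gets plotted on the residual plot.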
Plot the residual ordered pairs: (1, -1) (2, 1) (3, 0) (4, 0) (5, -1) (5, 1)
The x-coordinate is the independent variable. The residual is the y-coordinate.
Correlation Coefficient - The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product-moment correlation coefficient, in honor of its developer, Karl Pearson.
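For anyone curious what is behind the r value, here is a minimal sketch in Python using the standard Pearson formula: r equals the sum of the products of the x and y deviations from their means, divided by the square root of (sum of squared x deviations times sum of squared y deviations). The function name pearson_r and the data are made up for illustration and are not from the notes.

```python
import math

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up example data (not from the notes)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 2))  # 0.77
```

Here r is positive and fairly close to +1, so y tends to increase as x increases.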
We use a calculator to find the correlation coefficient r. The correlation coefficient r can indicate:
1) A positive correlation
2) A negative correlation
3) No correlation
4) A perfect correlation
Positive correlation - If x and y have a strong positive linear correlation, r is close to +1. An r value of exactly +1 indicates a perfect positive fit. Positive values of r indicate a relationship between x and y such that as values for x increase, values for y also increase.
Negative Correlation - If x and y have a strong negative linear correlation, r is close to -1. An r value of exactly -1 indicates a perfect negative fit. Negative values indicate a relationship between x and y such that as values for x increase, values for y decrease.
No correlation - If there is no linear correlation or only a weak linear correlation, r is close to 0. A value near zero means there is little or no linear relationship between the two variables: the points may be scattered randomly, or they may follow a nonlinear pattern.
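An r near 0 does not always mean the points are random. As a sketch, the made-up data below follow y = x squared exactly, yet r comes out to 0 because the pattern is not linear. The helper name pearson_r is hypothetical and uses the standard Pearson formula.

```python
import math

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data lying exactly on the curve y = x^2
xs = [-2, -1, 0, 1, 2]
ys = [4, 1, 0, 1, 4]

r = pearson_r(xs, ys)
print(r)  # 0.0
```

The falling left half and the rising right half cancel out, so r only reports that no straight line fits. A residual plot would show the "U" shape.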
A perfect correlation - An r value of ±1 occurs only when the data points all lie exactly on a straight line. If r = +1, the slope of this line is positive. If r = -1, the slope of this line is negative.
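A quick check of this in Python, as a sketch only: the points are made up and lie exactly on the lines y = 2x + 1 and y = -2x + 1, and the helper name pearson_r is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4]
rising = [2 * x + 1 for x in xs]     # points on a line with positive slope
falling = [-2 * x + 1 for x in xs]   # points on a line with negative slope

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)
print(round(r_up, 6), round(r_down, 6))
```

The first value comes out as +1 and the second as -1 (up to rounding), matching the sign of each slope.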
Determine the type of correlation each has based on the correlation coefficient r:
1. r = 0.9 Because r is close to 1, it has a positive linear correlation.
2. r = -0.85 Because r is close to -1, it has a negative linear correlation.
3. r = 0.01 Because r is so close to 0, it has no linear correlation.
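The reasoning in these examples can be sketched in Python: the sign of r gives the direction, and the distance of r from 0 (its absolute value) gives the strength. The r values are the three from the examples above; the variable names are only for illustration.

```python
r_values = [0.9, -0.85, 0.01]

for r in r_values:
    # The sign of r tells the direction of the relationship
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # abs(r) tells the strength: closer to 1 is stronger, closer to 0 is weaker
    print(r, direction, abs(r))

# The strongest linear correlation is the r farthest from 0
strongest = max(r_values, key=abs)
print(strongest)  # 0.9
```

Note that a negative r can be stronger than a positive one: -0.85 is a much stronger linear correlation than 0.01.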
4. Which one has the strongest linear correlation?
A) 0.27
B) -0.92
C) 0.85
D)
5. Which r value represents a perfect linear correlation?
A) 0.98
B) 0
C) -0.89
D) -1
Closure
Explain how to tell from the residual plot that the data have a linear relationship.