Correlation: How Strong Is the Linear Relationship? Lecture 46 Sec Mon, Apr 30, 2007
The Correlation Coefficient The correlation coefficient r is a number between –1 and +1. It measures the direction and strength of the linear relationship. If r > 0, then the relationship is positive. If r < 0, then the relationship is negative. The closer r is to +1 or –1, the stronger the relationship. The closer r is to 0, the weaker the relationship.
Strong Positive Linear Association In this display, r is close to +1. [Scatterplot of y against x]
Strong Negative Linear Association In this display, r is close to –1. [Scatterplot of y against x]
Almost No Linear Association In this display, r is close to 0. [Scatterplot of y against x]
Interpretation of r [Number line for r from –1 to +1] Values near –1: strong negative. Values near +1: strong positive. Values moderately below or above 0: weak negative or weak positive. Values near 0: no significant correlation.
Correlation vs. Cause and Effect If the value of r is close to +1 or -1, that indicates that x is a good predictor of y. It does not indicate that x causes y (or that y causes x). The correlation coefficient alone cannot be used to determine cause and effect.
Mixing Populations Mixing nonhomogeneous groups can create a misleading correlation coefficient. Suppose we gather data on the number of hours spent watching TV each week and the reading level of 1st, 2nd, and 3rd grade students.
Mixing Populations We may get the following results, suggesting a weak positive correlation: r = 0.26. [Scatterplot of reading level against number of hours of TV]
Mixing Populations However, if we separate the points according to grade level, we may see a different picture. [Scatterplot of reading level against number of hours of TV, with points grouped by grade] 1st grade: r1 = –0.35. 2nd grade: r2 = –0.73. 3rd grade: r3 = –0.52.
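The effect of mixing groups can be reproduced with a small sketch. The numbers below are made up for illustration and are not the lecture's data; the helper name pearson_r is also an assumption. Within each grade the association is negative, yet pooling the grades gives a positive r because higher grades have both more TV hours and higher reading levels in this invented data set.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented (hours of TV, reading level) data, one entry per grade.
grades = {
    "1st grade": ([1, 2, 3, 4], [1.4, 1.3, 1.1, 1.0]),
    "2nd grade": ([3, 4, 5, 6], [2.5, 2.3, 2.2, 2.0]),
    "3rd grade": ([5, 6, 7, 8], [3.4, 3.3, 3.1, 3.0]),
}

# Correlation within each grade separately.
within_r = {name: pearson_r(hours, level) for name, (hours, level) in grades.items()}

# Correlation after pooling all grades into one data set.
all_hours = [h for hours, _ in grades.values() for h in hours]
all_levels = [v for _, level in grades.values() for v in level]
pooled_r = pearson_r(all_hours, all_levels)

print(f"pooled: r = {pooled_r:.2f}")
for name, r in within_r.items():
    print(f"{name}: r = {r:.2f}")
```

The sign flips between the pooled value and the per-grade values, which is the same pattern as in the slides, although the magnitudes here differ from the lecture's.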
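Calculating the Correlation Coefficient There are many formulas for r. The most basic formula is
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
Another formula is
$$ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - \left(\sum x\right)^2} \, \sqrt{n \sum y^2 - \left(\sum y\right)^2}} $$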
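As a rough check that different formulas for r agree, the sketch below implements two standard forms: one based on deviations from the means and one based on the sums of x, y, x², y², and xy. Whether these are exactly the two forms shown on the slide is an assumption; the function names and the small data set are hypothetical.

```python
from math import sqrt, isclose

def r_from_deviations(xs, ys):
    """r = sum((x - x_bar)(y - y_bar)) / sqrt(sum((x - x_bar)^2) * sum((y - y_bar)^2))"""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def r_from_sums(xs, ys):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)), using raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, chosen only to keep the arithmetic small.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_dev = r_from_deviations(xs, ys)
r_sum = r_from_sums(xs, ys)
print(r_dev, r_sum)
```

The sum-based form is usually more convenient for hand calculation from a table of columns, while the deviation form shows more directly why r behaves as it does.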
Example Consider again the data. [Table of x and y values]
Example Compute Σx, Σy, Σx², Σy², and Σxy. [Table with columns x, y, x², y², xy]
Example Then compute r.
TI-83 – Calculating r To calculate r on the TI-83, first be sure that diagnostics are turned on: press CATALOG and select DiagnosticOn. Then follow the procedure that produces the regression line. In the same window, the TI-83 reports r² and r. Use the TI-83 to calculate r in the preceding example.
Example Find the correlation coefficient for the Calorie/Cholesterol data. [Table with columns Calories (x) and Cholesterol (y)]
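How Does r Work? Recall the formula
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
We will consider the numerator, Σ(x – x̄)(y – ȳ).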
How Does r Work? Consider the Subway data. [Table with columns Cal (x), Chol (y), x – x̄, y – ȳ, and (x – x̄)(y – ȳ). Last row: x – x̄ = –76, y – ȳ = –28, product 2128.]
How Does r Work? [Scatterplot of cholesterol against calories, divided into regions labeled positive or negative according to the sign of (x – x̄)(y – ȳ)]
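The role of the numerator can be sketched in code: each point contributes the product (x – x̄)(y – ȳ), which is positive when the point lies above both means or below both means, and negative otherwise. The data below are hypothetical and are not the Subway values from the lecture.

```python
# Hypothetical (calories, cholesterol) pairs, used only for illustration.
points = [(200, 10), (250, 30), (300, 20), (350, 25), (400, 45)]

n = len(points)
x_bar = sum(x for x, _ in points) / n
y_bar = sum(y for _, y in points) / n

numerator = 0.0
signs = []
for x, y in points:
    # Product of the two deviations for this point.
    product = (x - x_bar) * (y - y_bar)
    numerator += product
    if product > 0:
        signs.append("positive")
    elif product < 0:
        signs.append("negative")
    else:
        signs.append("zero")
    print(f"x={x}, y={y}: ({x - x_bar:+.0f})({y - y_bar:+.0f}) = {product:+.0f}")

print("sum of products (numerator of r):", numerator)
```

When most points fall in the positive regions, the positive products dominate the sum and r comes out positive; when most fall in the negative regions, r comes out negative.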