Correlation: How Strong Is the Linear Relationship? Lecture 50 Sec. 13.7 Mon, May 1, 2006.




1 Correlation: How Strong Is the Linear Relationship? Lecture 50 Sec. 13.7 Mon, May 1, 2006

2 The Correlation Coefficient The correlation coefficient r is a number between –1 and +1. It measures the direction and strength of the linear relationship. If r > 0, then the relationship is positive. If r < 0, then the relationship is negative. The closer r is to +1 or –1, the stronger the relationship. The closer r is to 0, the weaker the relationship.
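The properties above can be checked on a few tiny data sets. The following is a minimal sketch, not part of the original lecture: the helper name `pearson_r` and all three data sets are made up for illustration. It uses the deviation-from-the-mean form of r.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative data only (not from the lecture).
r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])      # points on a rising line
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])    # points on a falling line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # points on a parabola

print(r_up, r_down, r_curve)
```

The first two data sets lie exactly on a line, so r comes out at +1 and –1. The third lies exactly on a parabola, yet r is 0, a reminder that r measures only the linear part of a relationship.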

3 Strong Positive Linear Association In this display, r is close to +1. [Scatterplot of y versus x]

4 Strong Positive Linear Association In this display, r is close to +1. [Scatterplot of y versus x]

5 Strong Negative Linear Association In this display, r is close to –1. [Scatterplot of y versus x]

6 Strong Negative Linear Association In this display, r is close to –1. [Scatterplot of y versus x]

7 Almost No Linear Association In this display, r is close to 0. [Scatterplot of y versus x]

8 Almost No Linear Association In this display, r is close to 0. [Scatterplot of y versus x]

9 Correlation vs. Cause and Effect If the value of r is close to +1 or –1, that indicates that x is a good predictor of y. It does not indicate that x causes y. The correlation coefficient alone cannot be used to determine cause and effect.

10 Correlation vs. Cause and Effect There is good reason to believe that the size of a person’s waistline is a predictor of his performance on an algebra test (within the age range 0 – 21). Why? However, increasing your waistline will not help you on an algebra test. Similarly, avoiding algebra is not a good way to reduce your waistline.

11 “Third” Variables The hidden third variable is age. Age causes (to some extent) the waistline to increase. Age causes (to some extent) a person to do better on an algebra test.

12 Mixing Populations Mixing nonhomogeneous groups can create a misleading correlation coefficient. Suppose we gather data on the number of hours spent watching TV each week and the child’s reading level, for 1st, 2nd, and 3rd grade students.

13 Mixing Populations We may get the following results, suggesting a weak positive correlation. [Scatterplot: reading level versus number of hours of TV]

14 Mixing Populations We may get the following results, suggesting a weak positive correlation. [Scatterplot: reading level versus number of hours of TV]

15 Mixing Populations However, if we separate the points according to grade level, we may see a different picture. [Scatterplot: reading level versus number of hours of TV, with points marked as 1st grade, 2nd grade, or 3rd grade]

16 Mixing Populations First-grade students by themselves may indicate negative correlation. [Scatterplot: reading level versus number of hours of TV, with points marked as 1st grade, 2nd grade, or 3rd grade]

17 Mixing Populations Second-grade students by themselves may also indicate negative correlation. [Scatterplot: reading level versus number of hours of TV, with points marked as 1st grade, 2nd grade, or 3rd grade]

18 Mixing Populations And third-grade students by themselves may indicate negative correlation. [Scatterplot: reading level versus number of hours of TV, with points marked as 1st grade, 2nd grade, or 3rd grade]

19 Mixing Populations So, why did the points in the aggregate indicate a positive relationship? [Scatterplot: reading level versus number of hours of TV, with points marked as 1st grade, 2nd grade, or 3rd grade]
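One way to see how this can happen is to build a small artificial data set. The sketch below is an illustration only: the numbers are invented (they are not the lecture's data), and the names `pearson_r` and `grades` are hypothetical. In this made-up data, higher grades have both more TV hours and higher reading levels, while within each grade more TV goes with a lower reading level.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Invented data: (hours of TV per week, reading level) for each grade.
grades = {
    "1st": ([1, 3, 5], [2.0, 1.5, 1.0]),
    "2nd": ([3, 5, 7], [3.0, 2.5, 2.0]),
    "3rd": ([5, 7, 9], [4.0, 3.5, 3.0]),
}

# Correlation within each grade taken separately.
r_by_grade = {g: pearson_r(tv, level) for g, (tv, level) in grades.items()}

# Correlation after pooling all three grades into one sample.
all_tv = [h for tv, _ in grades.values() for h in tv]
all_level = [v for _, level in grades.values() for v in level]
r_pooled = pearson_r(all_tv, all_level)

print(r_by_grade)  # negative within every grade
print(r_pooled)    # weakly positive overall
```

Grade level plays the role of the hidden third variable here: the differences between grades push the pooled points upward to the right, even though the trend inside each grade runs the other way.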

20 Calculating the Correlation Coefficient There are many formulas for r. The most basic formula is [formula not preserved in the transcript]. Another formula is [formula not preserved in the transcript].
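The slide's formulas were not captured in this transcript. For reference, two standard and algebraically equivalent forms of r are given below. Which forms the lecture actually showed is an assumption; the second is the one that uses the sums of x, y, x², y², and xy computed in the example that follows.

```latex
% Deviation form: sums of products of deviations from the means
r = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}

% Computational form: uses only n and the five column sums
r = \frac{n \sum xy - \sum x \sum y}
         {\sqrt{\left( n \sum x^2 - \left( \sum x \right)^2 \right)
                \left( n \sum y^2 - \left( \sum y \right)^2 \right)}}
```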

21 Example Consider again the data
   x     y
   2     3
   3     5
   5     9
   6    12
   9    16

22 Example Compute Σx, Σy, Σx², Σy², and Σxy.
   x     y    x²    y²    xy
   2     3     4     9     6
   3     5     9    25    15
   5     9    25    81    45
   6    12    36   144    72
   9    16    81   256   144
  25    45   155   515   282   (column totals)

23 Example Then compute r.
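The slide's worked calculation was not captured in the transcript. The following sketch redoes it from the example data, assuming the computational formula based on the five column sums; the variable names are chosen here for illustration.

```python
import math

# Data from the lecture's example.
xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]
n = len(xs)

# The five column sums.
sum_x = sum(xs)                             # 25
sum_y = sum(ys)                             # 45
sum_x2 = sum(x * x for x in xs)             # 155
sum_y2 = sum(y * y for y in ys)             # 515
sum_xy = sum(x * y for x, y in zip(xs, ys)) # 282

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y      # 5*282 - 25*45 = 285
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))  # 0.9922
```

The value r ≈ 0.9922 is very close to +1, which indicates a strong positive linear relationship for this data.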

24 TI-83 – Calculating r To calculate r on the TI-83, first be sure that Diagnostic is turned on: press CATALOG and select DiagnosticOn. Then follow the procedure that produces the regression line. In the same window, the TI-83 reports r² and r. Use the TI-83 to calculate r in the preceding example.

25 The Relationship Between b and r It turns out that there is a simple relationship between the slope b of the regression line and the correlation coefficient r: b = r · (s_Y / s_X), where s_X and s_Y are the sample standard deviations of x and y.

26 The Relationship Between b and r In the previous example, we had s_X = 2.7386 and s_Y = 5.2440. We also found b = 1.9. Therefore, the correlation coefficient is r = b · (s_X / s_Y) = 1.9 × 2.7386 / 5.2440 ≈ 0.9922.
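As a check on these numbers, the sketch below recomputes s_X, s_Y, and b from the example data and confirms that b · s_X / s_Y matches r computed directly. It assumes the sample (n – 1) standard deviation, which is what the values 2.7386 and 5.2440 correspond to; the variable names are illustrative.

```python
import math
import statistics

# Data from the lecture's example.
xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squared deviations and cross products.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

b = s_xy / s_xx               # least-squares slope, 1.9
s_x = statistics.stdev(xs)    # sample standard deviation of x, about 2.7386
s_y = statistics.stdev(ys)    # sample standard deviation of y, about 5.2440

r_from_slope = b * s_x / s_y              # r recovered from the slope
r_direct = s_xy / math.sqrt(s_xx * s_yy)  # r from the deviation formula

print(round(r_from_slope, 4), round(r_direct, 4))
```

Both routes give the same value, about 0.9922, because s_X / s_Y rescales the slope into a unit-free quantity.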

