7.2 Interpreting Correlations


1 7.2 Interpreting Correlations
LEARNING GOAL Be aware of important cautions concerning the interpretation of correlations, especially the effects of outliers, the effects of grouping data, and the crucial fact that correlation does not necessarily imply causality. Page 299 Copyright © 2009 Pearson Education, Inc.

2 Beware of Outliers
If you calculate the correlation coefficient for the data in Figure 7.10, you’ll find that it is a relatively high r = 0.880, suggesting a very strong correlation. However, if you cover the data point in the upper right corner of Figure 7.10, the apparent correlation disappears. In fact, without this data point, the correlation coefficient is r = 0.
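The effect of a single outlier can be reproduced with a small sketch. The data below are hypothetical (they are not the values behind Figure 7.10), and `pearson_r` is a helper written for this illustration: nine points arranged in a grid have no linear trend, yet adding one point far in the upper right corner produces a high correlation coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a 3-by-3 grid of points with no linear trend.
base_x = [1, 2, 3, 1, 2, 3, 1, 2, 3]
base_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]

# The same points plus one outlier far out in the upper right corner.
with_outlier_x = base_x + [10]
with_outlier_y = base_y + [10]

r_without = pearson_r(base_x, base_y)
r_with = pearson_r(with_outlier_x, with_outlier_y)

print(f"r without the outlier: {r_without:.3f}")
print(f"r with the outlier:    {r_with:.3f}")
```

With these made-up numbers the coefficient goes from zero to roughly 0.9 because of one point, which is why a scatter diagram should be inspected before a correlation coefficient is trusted.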

3 EXAMPLE 1 Masked Correlation
You’ve conducted a study to determine how the number of calories a person consumes in a day correlates with time spent in vigorous bicycling. Your sample consisted of ten women cyclists, all of approximately the same height and weight. Over a period of two weeks, you asked each woman to record the amount of time she spent cycling each day and what she ate on each of those days. You used the eating records to calculate the calories consumed each day. Figure 7.11 shows a scatter diagram with each woman’s mean time spent cycling on the horizontal axis and mean caloric intake on the vertical axis. Do higher cycling times correspond to higher intake of calories?

4 Solution:
If you look at the data as a whole, your eye will probably tell you that there is a positive correlation in which greater cycling time tends to go with higher caloric intake. But the correlation is very weak, with a correlation coefficient of r = [value missing]. However, notice that two points are outliers: one representing a cyclist who cycled about a half-hour per day and consumed more than 3,000 calories, and the other representing a cyclist who cycled more than 2 hours per day on only 1,200 calories. It’s difficult to explain the two outliers, given that all the women in the sample have similar heights and weights.

5 Solution: (cont.)
We might therefore suspect that these two women either recorded their data incorrectly or were not following their usual habits during the two-week study. If we can confirm this suspicion, then we would have reason to delete the two data points as invalid. Figure 7.12 shows that the correlation is quite strong without those two outlier points, and suggests that the number of calories consumed rises by a little more than 500 calories for each hour of cycling. Of course, we should not remove the outliers without confirming our suspicion that they were invalid data points, and we should report our reasons for leaving them out.
Figure 7.12 The data from Figure 7.11 without the two outliers.
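A masked correlation of this kind can be sketched numerically. The ten data pairs below are invented for illustration (the slides do not give the values behind Figures 7.11 and 7.12), and `pearson_r` and `slope` are helper names introduced here. Eight points follow a trend of about 500 calories per hour of cycling, and two outliers match the descriptions in the solution.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical (hours cycled per day, calories per day) for eight cyclists.
clean = [(0.5, 1950), (0.75, 2100), (1.0, 2200), (1.5, 2450),
         (2.0, 2700), (2.25, 2850), (2.5, 2950), (3.0, 3250)]
# Two hypothetical outliers: little cycling with high intake, and the reverse.
outliers = [(0.5, 3100), (2.2, 1200)]

everyone = clean + outliers
r_all = pearson_r([h for h, _ in everyone], [c for _, c in everyone])
r_clean = pearson_r([h for h, _ in clean], [c for _, c in clean])
calories_per_hour = slope([h for h, _ in clean], [c for _, c in clean])

print(f"r with all ten cyclists:   {r_all:.3f}")
print(f"r without the two outliers: {r_clean:.3f}")
print(f"calories per extra hour:    {calories_per_hour:.0f}")
```

In this sketch the two outliers pull a nearly perfect correlation down to a weak one. The code only shows the arithmetic; it does not settle whether removing the points is justified, which still requires confirming that they are invalid.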

6 Beware of Inappropriate Grouping
Correlations can also be misinterpreted when data are grouped inappropriately. In some cases, grouping data hides correlations. Consider a (hypothetical) study in which researchers seek a correlation between hours of TV watched per week and high school grade point average (GPA). They collect the 21 data pairs in Table 7.3. The scatter diagram (Figure 7.13) shows virtually no correlation; the correlation coefficient for the data is about r = [value missing]. The apparent conclusion is that TV viewing habits are unrelated to academic achievement.

7 Beware of Inappropriate Grouping (cont.)
However, one astute researcher realizes that some of the students watched mostly educational programs, while others tended to watch comedies, dramas, and movies. She therefore divides the data set into two groups, one for the students who watched mostly educational television and one for the other students. Table 7.4 shows her results with the students divided into these two groups.

8 Beware of Inappropriate Grouping (cont.)
Now we find two very strong correlations (Figure 7.14): a strong positive correlation for the students who watched educational programs (r = 0.855) and a strong negative correlation for the other students (r = [value missing]).
Figure 7.14 These scatter diagrams show the same data as Figure 7.13, separated into the two groups identified in Table 7.4.
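The way pooled data can hide two opposite correlations can be sketched as follows. The values are hypothetical and are not those of Tables 7.3 and 7.4 (ten students rather than 21), and `pearson_r` is a helper written for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weekly TV hours, the same for both groups of five students.
hours = [2, 4, 6, 8, 10]

# Hypothetical GPAs: rising with TV hours for educational viewers,
# falling with TV hours for the other students.
gpa_educational = [2.5, 2.6, 3.1, 3.2, 3.6]
gpa_other = [3.6, 3.2, 3.1, 2.6, 2.5]

r_educational = pearson_r(hours, gpa_educational)
r_other = pearson_r(hours, gpa_other)

# Pooling the two groups, as if program type had never been recorded.
r_pooled = pearson_r(hours + hours, gpa_educational + gpa_other)

print(f"educational viewers: r = {r_educational:.3f}")
print(f"other viewers:       r = {r_other:.3f}")
print(f"all students pooled: r = {r_pooled:.3f}")
```

Because the two made-up groups mirror each other, the positive and negative trends cancel in the pooled data, leaving a coefficient of essentially zero even though each group on its own is strongly correlated.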

9 Beware of Inappropriate Grouping (cont.)
In other cases, a data set may show a stronger correlation than actually exists among subgroups. Figure 7.15 shows the scatter diagram of the (hypothetical) data collected by a consumer group studying the relationship between the weights and prices of cars. The data set as a whole shows a strong correlation, but there is no correlation within either cluster.
Figure 7.15 Scatter diagram for the car weight and price data.
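The opposite situation, where clusters create a correlation that no subgroup shows, can be sketched with invented numbers. The weights and prices below are hypothetical and are not the data of Figure 7.15, and `pearson_r` is a helper written for this illustration. Each cluster is a small grid of points, so weight and price are unrelated inside it.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def grid(weights, prices):
    """Every combination of the given weights and prices, as two lists."""
    xs = [w for w in weights for _ in prices]
    ys = [p for _ in weights for p in prices]
    return xs, ys

# Hypothetical light cars: weight in pounds, price in thousands of dollars.
light_w, light_p = grid([2400, 2500, 2600], [14, 15, 16])
# Hypothetical heavy cars, heavier and more expensive as a group.
heavy_w, heavy_p = grid([4400, 4500, 4600], [34, 35, 36])

r_light = pearson_r(light_w, light_p)
r_heavy = pearson_r(heavy_w, heavy_p)
r_all = pearson_r(light_w + heavy_w, light_p + heavy_p)

print(f"light cars only: r = {r_light:.3f}")
print(f"heavy cars only: r = {r_heavy:.3f}")
print(f"all cars:        r = {r_all:.3f}")
```

Here the overall coefficient is above 0.99 purely because the heavy cluster sits above and to the right of the light one. Within either cluster, knowing a car's weight says nothing about its price.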

10 TIME OUT TO THINK
Suppose you were shopping for a compact car. If you looked at only the overall data and correlation coefficient from Figure 7.15 (previous slide), would it be reasonable to consider weight as an important factor in price? What if you looked at the data for light and heavy cars separately? Explain.

11 Correlation Does Not Imply Causality
Perhaps the most important caution about interpreting correlations is one we’ve already mentioned: Correlation does not necessarily imply causality.
Possible Explanations for a Correlation
1. The correlation may be a coincidence.
2. Both correlated variables might be directly influenced by some common underlying cause.
3. One of the correlated variables may actually be a cause of the other. But note that, even in this case, it may be just one of several causes.
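The second explanation, a common underlying cause, can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the variable names, the noise level of 0.5, the sample size, and the random seed. Two variables are each built from the same hidden quantity plus independent noise, so neither one influences the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Hidden common cause, plus two variables that each depend on it
# and on their own independent noise. x never feeds into y or vice versa.
cause = [random.gauss(0, 1) for _ in range(n)]
x = [c + 0.5 * random.gauss(0, 1) for c in cause]
y = [c + 0.5 * random.gauss(0, 1) for c in cause]

r_xy = pearson_r(x, y)

# Remove the common cause from both variables; only the independent
# noise terms remain, and the correlation should largely vanish.
x_rest = [xi - c for xi, c in zip(x, cause)]
y_rest = [yi - c for yi, c in zip(y, cause)]
r_rest = pearson_r(x_rest, y_rest)

print(f"r between x and y:               {r_xy:.3f}")
print(f"r after removing the common cause: {r_rest:.3f}")
```

In real studies the common cause is usually not measured so directly, so the subtraction step is only a way to make the mechanism visible in simulated data.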

12 Useful Interpretations of Correlation
In discussing uses of correlation that might lead to wrong interpretations, we have described the effects of outliers, inappropriate groupings, fishing for correlations, and incorrectly concluding that correlation implies causality. But there are many correct and useful interpretations of correlation. In general, correlation plays a prominent and important role in a variety of fields, including meteorology, medical research, business, economics, market research, advertising, psychology, and computer science.

13 The End

