Lesson 4.8 – Interpreting the Correlation Coefficient and Distinguishing between Correlation & Causation EQ: How do you calculate the correlation coefficient? What is the difference between correlation and causation? 4.3.2: Calculating and Interpreting the Correlation Coefficient
Vocabulary A correlation is a relationship between two events, such as x and y, where a change in one event implies a change in another event. The correlation coefficient, r, is a quantity that allows us to determine how strong this relationship is between two events. It is a value that ranges from –1 to 1. 4.3.2: Calculating and Interpreting the Correlation Coefficient
1 0 - 1 -1 -1 - 0 Correlation Coefficient The correlation coefficient only assesses the strength of a linear relationship between two variables. The correlation coefficient does not assess causation—that one event causes the other. 1 0 - 1 -1 -1 - 0 4.3.2: Calculating and Interpreting the Correlation Coefficient
Correlation vs. Causation Correlation does not imply causation, or that a change in one event causes the change in the second event. If a change in one event is responsible for a change in another event, the two events have a casual relationship, or causation. Outside factors may influence and explain a strong correlation between two events. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Let’s Practice! For each scatter plot identify the correlation type and coefficient. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice Example 1 An education research team is interested in determining if there is a relationship between a student’s vocabulary and how frequently the student reads books. The team gives 20 students a 100-question vocabulary test, and asks students to record how many books they read in the past year. The results are in the table on the next slide. Is there a linear relationship between the number of books read and test scores? Use the correlation coefficient, r, to explain your answer. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 1, continued Books read Test score 12 23 5 8 3 15 30 19 14 36 9 56 1 13 63 25 4 6 16 78 2 42 7 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 1, continued Create a scatter plot of the data. Let the x-axis represent books read and the y-axis represent test score. Test score Books read 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 1, continued Describe the relationship between the data using the graphical representation. It appears that the higher scores were from students who read more books, but the data does not appear to lie on a line. There is not a strong linear relationship between the two events. Test score Books read 4.3.2: Calculating and Interpreting the Correlation Coefficient
✔ Guided Practice: Example 1, continued Calculate the correlation coefficient on your scientific calculator. Refer to the steps in the Key Concepts section. The correlation coefficient, r, is approximately 0.48. Use the correlation coefficient to describe the strength of the relationship between the data. A correlation coefficient of 1 indicates a strong positive correlation, and a correlation of 0 indicates no correlation. A correlation coefficient of 0.48 is about halfway between 1 and 0, and indicates that there is a weak positive linear relationship between the number of books a student read in the past year and his or her score on the vocabulary test. ✔ 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 1, continued 5. Consider the casual relationship between the two events. Determine if it is likely that the number of books read is responsible for the vocabulary test score. Since reading broadens your vocabulary, and although there are other factors related to test performance, it is likely that there is a casual relationship between the number of books read and the vocabulary test score. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice Example 2 Alex coaches basketball, and wants to know if there is a relationship between height and free throw shooting percentage. Free throw shooting percentage is the number of free throw shots completed divided by the number of free throws shots attempted: Alex takes some notes on the players in his team, and records his results in the tables on the next two slides. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 2, continued Height in inches Free throw % 75 28 76 25 22 70 42 67 30 72 47 80 6 79 24 71 43 69 23 40 27 10 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 2, continued Height in inches Free throw % 76 33 75 13 25 71 30 67 54 68 29 79 5 14 55 78 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 2, continued Create a scatter plot of the data. Let the x-axis represent height in inches and the y- axis represent free throw shooting percentage. Free throw percentage Height in inches 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 2, continued Describe the relationship between the data using the graphical representation. Free throw percentage Height in inches As height increases, free throw shooting Percentage decreases. It appears that there is a weak negative linear correlation between the two events. 4.3.2: Calculating and Interpreting the Correlation Coefficient
Guided Practice: Example 2, continued Calculate the correlation coefficient on your scientific calculator. Refer to the steps in the previous slides. The correlation coefficient, r, is approximately -0.727. Use the correlation coefficient to describe the strength of the relationship between the data. A correlation coefficient of –1 indicates a strong negative correlation, and a correlation of 0 indicates no correlation. A correlation coefficient of –0.727 is close to –1, and indicates that there is a strong negative linear relationship between height and free throw percentage. 4.3.2: Calculating and Interpreting the Correlation Coefficient
✔ Guided Practice: Example 2, continued 5. Consider the casual relationship, causation, between the two events. Determine if it is likely that height is responsible for the decrease in free throw shooting percentage. Even if there is a correlation between height and free throw percentage, it is not likely that height causes a basketball player to have more difficulty making free throw shots. ✔ 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Caitlyn thinks that there may be a relationship between class size and student performance on standardized tests. She tracks the average test performance of students from 20 different classes, and notes the number of students in each class in the table on the next slide. Is there a linear relationship between class size and average test score? Use the correlation coefficient, r, to explain your answer. 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Class size Avg. test score 26 28 32 33 36 25 27 30 29 21 19 38 23 41 34 17 43 37 14 42 39 31 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Average test score Number of students 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Describe the relationship between the data using the graphical representation. As the class size increases, the average test score decreases. It appears that there is a linear relationship with a negative slope between the two variables. 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Calculate the correlation coefficient on your graphing calculator. Refer to the steps in the Key Concepts section. The correlation coefficient, r, is approximately –0.84. 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! Use the correlation coefficient to describe the strength of the relationship between the data. A correlation coefficient of –1 indicates a strong negative correlation, and a correlation of 0 indicates no correlation. A correlation coefficient of –0.84 is close to –1, and indicates that there is a strong negative linear relationship between class size and average test score. 4.3.2: Calculating and Interpreting the Correlation Coefficient
You Try! 5. Determine if it is likely that class size is responsible for performance on standardized tests. Smaller class sizes allow teachers to give each student more attention, however, other factors are related to test performance. But we cannot say that smaller class sizes are directly related to higher performance on standardized tests. Students perform differently on tests regardless of class size. ✔ 4.3.2: Calculating and Interpreting the Correlation Coefficient