Correlation and Causation Part II – Correlation Coefficient
This video is designed to accompany pages in Making Sense of Uncertainty: Activities for Teaching Statistical Reasoning, Van-Griner Publishing Company.
Defining a Need The correlation coefficient is simply a numerical summary of the relationship between two variables that you would see in a scatterplot. Positive association. How strong is it?
Formula for “r” The correlation coefficient “r” measures the strength of the linear relationship between two variables “x” and “y”.
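The formula itself did not survive in this text. One standard computational form of Pearson's r, which uses the same column totals that appear on the "Compute It!" slide (Σx, Σy, Σxy, Σx², Σy²), is shown here as a reference. The symbol n, the number of (x, y) pairs, is an assumption of this sketch and is not named in the slides.

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```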
Before we compute it …
1. It is only appropriate to compute r if the scatterplot of y versus x exhibits a linear trend.
2. r will always be between -1 and 1.
3. r will be negative if the points in the scatterplot have a downward trend from left to right.
4. r will be positive if the points in the scatterplot have an upward trend from left to right.
5. The closer r is to 1 in absolute value, the tighter the cluster of points about the linear trend and the stronger the association between x and y.
6. If r is close to 0, then the association is weak.
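The properties listed above can be checked numerically. The following is a minimal Python sketch, not the video's own method: the function name `correlation` and every data list are hypothetical and made up purely for illustration. It computes r from deviations about the means and shows the sign of r for upward and downward trends, r = 1 for a perfect line, and r = 0 for a perfectly curved pattern, which is why r should only be used when the trend is linear.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for this illustration only.
x = [1, 2, 3, 4, 5]
upward = [2, 4, 5, 4, 5]     # points rise from left to right
downward = [5, 4, 5, 4, 2]   # points fall from left to right
perfect = [3, 5, 7, 9, 11]   # exactly y = 2x + 1

r_up = correlation(x, upward)
r_down = correlation(x, downward)
r_perfect = correlation(x, perfect)

# A perfectly curved (parabolic) pattern: y = x squared on symmetric x values.
x_sym = [-3, -2, -1, 0, 1, 2, 3]
curved = [v * v for v in x_sym]
r_curved = correlation(x_sym, curved)

print("upward trend:  ", round(r_up, 4))
print("downward trend:", round(r_down, 4))
print("perfect line:  ", round(r_perfect, 4))
print("curved pattern:", round(r_curved, 4))
```

In the curved case y is completely determined by x, yet r comes out as 0, so a value of r near 0 only signals a weak association when the scatterplot has already been checked for a linear trend.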
Simple Scatterplot Moderate, positive correlation?
Compute It!
Table columns: Subject | Age (x) | Glucose Level (y) | xy | x² | y²
Column totals (Σ): Σx = 247, Σy = 486, Σxy = , Σx² = 11409, Σy² = 40022
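The table-based calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the slide's worked example: the data values are hypothetical (the individual rows of the slide's table are not available here), and the names `r_from_sums`, `xs`, and `ys` are invented for this sketch. It builds the xy, x², and y² columns, totals them, and then applies the standard computational form of Pearson's r.

```python
from math import sqrt

def r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson's r from the column totals of an x, y, xy, x^2, y^2 table."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, NOT the values from the slide.
xs = [2, 4, 6, 8]
ys = [1, 3, 2, 5]

# Build one table row per subject: (x, y, xy, x^2, y^2).
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]

# Total each column, as in the bottom row of the table.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(col) for col in zip(*rows))

r = r_from_sums(len(rows), sum_x, sum_y, sum_xy, sum_x2, sum_y2)

print("x  y  xy  x^2  y^2")
for row in rows:
    print(*row)
print("totals:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("r =", round(r, 4))
```

Working from column totals mirrors the hand calculation on the slide; given the number of subjects and the five totals from a completed table, `r_from_sums` can be called directly without the raw rows.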
Scatterplots Revisited
Time Spent Studying and Student Grades: r = 0.75
Quiz Average and Final Exam Score: r = 0.02
GNP per capita and Life Expectancy at Birth: not appropriate to use r since the plot is curved
Hours Exercised and LDL Levels: r =
Got it!
One-Sentence Reflection The correlation coefficient is the most common numerical measure of the strength of a straight-line relationship between two variables that can be represented by a scatterplot.