Statistics Unit 2 Quantitative Interpretation of Correlation
Correlation Coefficient A correlation coefficient, r, is a number that describes the strength of the correlation between two variables. The closer r is to 1 or -1, the stronger the correlation; the closer it is to 0, the weaker the correlation. As mentioned in the previous lesson, a correlation can be positive or negative; the sign has no effect on the strength of the correlation, it simply gives its direction.
Correlation Coefficient We can estimate the linear correlation coefficient, r, graphically by drawing a rectangle fitted around the data points. The sides of the rectangle are measured and the correlation coefficient is determined using the following formula:
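The formula itself is not reproduced in the text of this slide. The form commonly taught with the rectangle method is r ≈ ±(1 − short side / long side), with the sign taken from the direction of the correlation. The sketch below assumes that form; the function name and the measurements are hypothetical and only illustrate the arithmetic.

```python
def estimate_r(side_a, side_b, direction):
    """Rectangle-method estimate of r, assuming r ≈ ±(1 - short/long).

    side_a, side_b: measured side lengths of the fitted rectangle (same unit).
    direction: "positive" or "negative", read from the scatter plot.
    """
    if side_a <= 0 or side_b <= 0:
        raise ValueError("side lengths must be positive")
    if direction not in ("positive", "negative"):
        raise ValueError("direction must be 'positive' or 'negative'")
    short, long_ = sorted((side_a, side_b))
    magnitude = 1 - short / long_          # between 0 and 1
    return magnitude if direction == "positive" else -magnitude


# Hypothetical measurements: a 1.2 cm by 6.0 cm rectangle sloping downward.
r = estimate_r(1.2, 6.0, "negative")
print(round(r, 2))  # -0.8
```

A long, narrow rectangle gives a value close to ±1 (strong correlation); a nearly square rectangle gives a value close to 0 (weak correlation).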
Correlation Coefficient Example: The scatter plot shows that the variables have a strong, negative correlation.
Regression Line A regression line is a line that best fits the data distribution. It will allow you to estimate the value of a variable, given the value of the other variable. The line can be determined graphically. This is done by drawing a line through the points, making sure to have the same number of points on each side of the line. (see the graph on the next slide) Choose any two points that fall on the line. If there are none, choose two points that are very close to the line. Determine the equation of the line, y = ax + b, using the two points that you have chosen.
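The last step, finding y = ax + b from two chosen points, can be sketched as follows. The helper name and the two points are hypothetical (they are not read from the graph in this lesson); the slope is the change in y over the change in x, and b is found by substituting one point into the equation.

```python
def line_through(p1, p2):
    """Return (a, b) for the line y = a*x + b through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line cannot be written as y = ax + b")
    a = (y2 - y1) / (x2 - x1)  # slope: rise over run
    b = y1 - a * x1            # substitute p1 to solve for the y-intercept
    return a, b


# Hypothetical points that fall on a hand-drawn regression line.
a, b = line_through((2, 30), (8, 66))
print(f"y = {a:g}x + {b:g}")  # y = 6x + 18

# Estimate y for a given x using the equation.
print(a * 5 + b)  # 48.0
```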
Regression Line
Regression Line The regression line can also be determined using a table of values. There are two different methods: the Mayer line method and the median-median line method. You may be asked to determine the equation of the regression line using a specific method or by using a method of your choosing.
Mayer line method The data must be placed in increasing order of the x-variable. Divide the data into two equal groups. Determine the mean x and the mean y in each group; these means give the points P1 and P2. P1 (3.33, 37) P2 (7.56, 62)
Mayer line method Use P1 and P2 to find the equation of the regression line.
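As a rough sketch of the whole Mayer line method, the function below orders the data, splits it in two, averages each group, and finds the line through the two mean points. The function name and the data set are hypothetical (the lesson's table of values is not reproduced here), and the sketch assumes an even number of points so that the two groups are equal.

```python
from statistics import mean


def mayer_line(points):
    """Return (P1, P2, a, b) where y = a*x + b is the Mayer line."""
    pts = sorted(points)  # order the data by x-value
    if len(pts) < 2 or len(pts) % 2 != 0:
        raise ValueError("this sketch expects an even number of points")
    half = len(pts) // 2
    groups = (pts[:half], pts[half:])
    # Mean x and mean y of each group give P1 and P2.
    p1, p2 = [
        (mean([x for x, _ in g]), mean([y for _, y in g])) for g in groups
    ]
    a = (p2[1] - p1[1]) / (p2[0] - p1[0])  # slope from P1 to P2
    b = p1[1] - a * p1[0]                  # intercept from P1
    return p1, p2, a, b


# Hypothetical data set of six (x, y) pairs.
data = [(1, 10), (2, 14), (3, 12), (4, 20), (5, 22), (6, 24)]
p1, p2, a, b = mayer_line(data)
print(p1, p2)                       # (2, 12) (5, 22)
print(f"y = {a:.2f}x + {b:.2f}")    # y = 3.33x + 5.33
```

Applying the same two-point step to the values above, P1 (3.33, 37) and P2 (7.56, 62), gives a slope of 25 / 4.23 ≈ 5.91 and b ≈ 37 − 5.91 × 3.33 ≈ 17.32, so y ≈ 5.91x + 17.32.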
Median-median line method The data must be in order according to the x-value. Split the data into 3 equal groups. If this cannot be done, ensure that the first and last groups have the same number of data points. Determine the median x-value in each group. Determine the median y-value in each group (if the y-values are not in order, put them in order before determining the median). These medians give the points M1, M2, and M3. M1 (2.5, 34) M2 (5.5, 48) M3 (8, 65.5)
Median-median line method Determine the mean x and the mean y from M1, M2, and M3. This is point P. P (5.3, 49.2) Determine the slope of the regression line using M1 and M3. Using the slope found in the previous step and point P, find the equation of the regression line.
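The steps of the median-median line method can be sketched as follows. The function names are hypothetical, and the way leftover points are distributed (one extra point goes to the middle group, two extra points go to the first and last groups) is an assumption chosen to keep the first and last groups the same size. The final lines use the M1, M2, and M3 values given above.

```python
from statistics import mean, median


def split_in_three(points):
    """Order by x and split into 3 groups; first and last have equal size."""
    pts = sorted(points)
    if len(pts) < 3:
        raise ValueError("need at least 3 points")
    k, extra = divmod(len(pts), 3)
    outer = k + (1 if extra == 2 else 0)  # size of first and last groups
    return pts[:outer], pts[outer:len(pts) - outer], pts[len(pts) - outer:]


def median_point(group):
    """Median x and median y of a group (median() orders the values itself)."""
    return median([x for x, _ in group]), median([y for _, y in group])


def median_median_line(m1, m2, m3):
    """Return (P, a, b): P is the mean of M1, M2, M3; slope uses M1 and M3."""
    px = mean([m1[0], m2[0], m3[0]])
    py = mean([m1[1], m2[1], m3[1]])
    a = (m3[1] - m1[1]) / (m3[0] - m1[0])  # slope from M1 to M3
    b = py - a * px                        # line passes through P
    return (px, py), a, b


# Median points from this lesson.
(px, py), a, b = median_median_line((2.5, 34), (5.5, 48), (8, 65.5))
print(f"P = ({px:.1f}, {py:.1f}); y = {a:.2f}x + {b:.2f}")
# P = (5.3, 49.2); y = 5.73x + 18.62
```

The intercept here is computed from the unrounded coordinates of P; using the rounded P (5.3, 49.2) gives a slightly different value, about 18.85.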