EDRS6208 Fundamentals of Education Research 1 Lecture 5 Measuring Association and Relationships 9/18/2018 Madgerie Jameson UWI School of Education
Objectives
At the end of this lecture, students will be able to:
- Define correlations and explain how they work.
- Compute a correlation coefficient.
- Interpret the values of the correlation coefficient.
- Interpret the coefficient of determination.
What are Correlations?
A correlation describes how the value of one variable changes when the value of another variable changes. It is expressed through the computation of a correlation coefficient, which reflects the dynamic quality of the relationship between variables. It helps you understand whether the variables tend to move in the same or opposite directions when they change.
Correlation Coefficient
A numerical index that reflects the relationship between two variables. The value of this descriptive statistic ranges between -1 and +1. A correlation can be either:
- Direct or positive, if the variables change in the same direction.
- Indirect or negative, if the variables change in opposite directions.
Types of correlation and the corresponding relationship between variables
- X increases in value, Y increases in value: direct or positive correlation, with a positive value ranging from .00 to +1.00. Example: the more time you spend studying, the higher your test scores will be.
- X decreases in value, Y decreases in value: direct or positive correlation, with a positive value ranging from .00 to +1.00. Example: the less money you put in the bank, the less interest you will earn.
- X increases in value, Y decreases in value: indirect or negative correlation, with a negative value ranging from -1.00 to .00. Example: the more you exercise, the less you weigh.
- X decreases in value, Y increases in value: indirect or negative correlation, with a negative value ranging from -1.00 to .00. Example: the less time you take to complete a test, the more errors you will make.
Note
- A correlation can range from -1 to +1.
- The absolute value of the correlation reflects the strength of the correlation, so a correlation of -.70 is stronger than a correlation of +.50.
- A correlation always reflects the situation where there are at least two data points or variables per case.
Do not assign a value judgment to the sign of the correlation, e.g. + being good and - being bad. It is either a direct or an indirect correlation.
The Pearson product-moment correlation is represented by the small letter r with a subscript representing the variables that are being correlated:
- rxy is the correlation between variable X and variable Y.
- rweight-height is the correlation between weight and height.
- rSAT*GPA is the correlation between SAT score and grade point average (GPA).
Computing a Simple Correlation Coefficient
The formula for the simple Pearson product-moment correlation coefficient between a variable labelled X and a variable labelled Y is as follows:
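A reconstruction of the raw-score formula, assuming the standard computational form that uses exactly the terms defined for this slide (n, ∑X, ∑Y, ∑XY, ∑X² and ∑Y²):

```latex
r_{xy} = \frac{n\sum XY - \sum X \sum Y}
              {\sqrt{\left[\,n\sum X^{2} - \left(\sum X\right)^{2}\right]
                     \left[\,n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```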
Where:
- rxy is the correlation coefficient between X and Y
- n is the sample size
- X is the individual's score on the X variable
- Y is the individual's score on the Y variable
- XY is the product of each X score and its corresponding Y score
- X² is the individual's X score squared
- Y² is the individual's Y score squared
Example
Individual     X     Y    X²    Y²    XY
1              2     3     4     9     6
2              4     2    16     4     8
3              5     6    25    36    30
4              6     5    36    25    30
5              4     3    16     9    12
6              7     6    49    36    42
7              8     5    64    25    40
8              5     4    25    16    20
9              6     4    36    16    24
10             7     5    49    25    35
Total (∑)     54    43   320   201   247
Explanation of terms
- ∑X, the sum of all X values, is 54.
- ∑Y, the sum of all Y values, is 43.
- ∑X², the sum of each X value squared, is 320.
- ∑Y², the sum of each Y value squared, is 201.
- ∑XY, the sum of the products of X and Y, is 247.
Steps
1. List the two variables for each participant.
2. Compute the sums of all the X values and all the Y values.
3. Square each of the X values and each of the Y values, and sum the squares.
4. Multiply each X value by its corresponding Y value and find the sum of all XY products.
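The steps above can be sketched in Python. This is a minimal illustration, not part of the original lecture: the function name `pearson_r` is hypothetical, and the ten score pairs are assumed to be the example scores, chosen so that their sums match the totals on the slide (54, 43, 320, 201, 247).

```python
from math import sqrt

# Step 1: list the two variables for each participant.
# Assumption: these pairs reproduce the slide totals 54, 43, 320, 201, 247.
x = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
y = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(xs)
    # Step 2: sums of all X values and all Y values.
    sum_x = sum(xs)
    sum_y = sum(ys)
    # Step 3: square each value and sum the squares.
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    # Step 4: sum of the XY products.
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(round(r, 3))  # 0.692
```

Working from the sums rather than from deviation scores mirrors the hand calculation, which makes it easy to check each intermediate total against the table.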
Substitute the values into the equation
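A worked substitution, assuming the raw-score form of the Pearson formula and using the sums from the example (n = 10, ∑X = 54, ∑Y = 43, ∑X² = 320, ∑Y² = 201, ∑XY = 247):

```latex
r_{xy} = \frac{(10)(247) - (54)(43)}
              {\sqrt{\left[(10)(320) - 54^{2}\right]\left[(10)(201) - 43^{2}\right]}}
       = \frac{2470 - 2322}{\sqrt{(3200 - 2916)(2010 - 1849)}}
       = \frac{148}{\sqrt{(284)(161)}}
       \approx \frac{148}{213.83}
       \approx .692
```

By the interpretation guidelines used in this lecture, a value of about .69 falls in the "strong relationship" band.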
Correlation using Excel
Calculating Correlations Using Excel: this tutorial teaches you how to calculate correlations using Excel. The video was prepared by Mastee Badii as part of the course HAP 501 Business Statistics in Health Care Services, a course of the Department of Health Administration and Policy at George Mason University.
utube.com/watch?v=LTrgAQraf5Y
Correlation using SPSS
Data source: SPSS
Now we are going to use a data set of 103 students (52 male and 51 female) that records the time spent revising a subject, anxiety before the exam, and marks obtained in the exam. We shall calculate Pearson's product-moment correlation coefficient for these variables. This data set has been downloaded from Andy Field's website.
1. Choose Analyze > Correlate > Bivariate.
2. Transfer the variables of interest to the 'Variables' box.
3. Choose 'Pearson' under Correlation Coefficients.
4. Decide on the test of significance: one-tailed or two-tailed.
5. Click OK.
SPSS Output: Statistically significant correlations are flagged by asterisks. One asterisk (*) means the result is significant at the .05 level and two asterisks (**) mean the result is significant at the .01 level.
Reporting the Correlation: There is a statistically significant, positive relationship between the time an individual spends revising and their exam performance, r = .397, p (two-tailed) < .01.
Visual representation of a correlation
A simple way to visually represent a correlation is to create a scatter plot or scattergram. It plots each pair of scores as a point, with one variable on each axis.
Figure: a simple scatter plot for the set of 10 scores
Create a simple plot
1. Draw the X and Y axes.
2. Mark both axes with the range of values (X in our data goes from 2 to 8 and Y goes from 2 to 6).
3. For each pair of scores, enter a dot on the chart, e.g. Individual 1 would be plotted at (2, 3).
Meaning of slope
A positive slope occurs when the data points group themselves in a cluster from the lower left-hand corner on the x and y axes through the upper right-hand corner.
Positive Correlation
Figure: relationship between GPA and SAT scores.
A negative slope occurs when the data points group themselves in a cluster from the upper left-hand corner on the x and y axes through the lower right-hand corner.
Negative Correlation
Figure: relationship between GPA and amount of drinking.
Perfect Direct Correlation
When rxy = 1.00, the data points are aligned along a straight line with a positive slope.
Perfect Indirect Correlation
If a correlation is perfectly indirect, the value of the correlation coefficient is -1.00 and the data points also align themselves in a straight line, but from the upper left-hand corner of the chart to the lower right.
Limitations of the correlation coefficient
- The fact that two variables are related does not mean that one causes the other.
- Pearson's r assumes a linear relationship, that is, one where the points tend to fall along a straight line, so that a change in x goes with a proportional change in y. Not all relationships follow this form; some are curvilinear.
Curvilinear Relationships
Pearson's r cannot tell you whether or not the relationship is curvilinear. Other tests will have to be carried out to determine whether the relationship is curvilinear.
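A small sketch of why Pearson's r misses a curvilinear relationship. The data are hypothetical (y is exactly x squared, a perfect U-shaped relationship), and `pearson_r` is an illustrative helper based on the raw-score formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: y depends perfectly on x, but along a U-shaped curve.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]

r = pearson_r(x, y)
print(r)  # 0.0, even though y is completely determined by x
```

A scatter plot of these points would show the curve immediately, which is one reason to inspect the plot before interpreting r.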
Limitations continued
- If the range of the variables measured is small, the correlation coefficient will be artificially low (restriction of range). An example is the relationship between "A" level grades (A to C) and university course grades (1 to 3).
- Correlations can be affected by outliers (unusual cases).
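Both limitations can be illustrated with a short sketch. The ten score pairs are assumed to be the lecture's example scores (they reproduce the slide totals); the restricted subset (X between 5 and 7) and the added outlier (20, 1) are hypothetical choices made only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Assumed example scores (sums match the slide totals 54 and 43).
x = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
y = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]
r_full = pearson_r(x, y)

# Restriction of range: keep only the cases with X between 5 and 7.
pairs = [(a, b) for a, b in zip(x, y) if 5 <= a <= 7]
r_restricted = pearson_r([a for a, _ in pairs], [b for _, b in pairs])

# Outlier: add one hypothetical unusual case to the full data set.
r_outlier = pearson_r(x + [20], y + [1])

print(round(r_full, 3))        # 0.692
print(round(r_restricted, 2))  # 0.25
print(r_outlier < 0)           # True: one unusual case flips the sign
```

In this sketch, narrowing the range of X lowers r from about .69 to .25, and a single extreme case changes a strong positive correlation into a negative one.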
Interpreting a Correlation Coefficient
Size of correlation coefficient    General interpretation
.8 to 1.0                          Very strong relationship
.6 to .8                           Strong relationship
.4 to .6                           Moderate relationship
.2 to .4                           Weak relationship
.0 to .2                           Weak or no relationship
This is effective for a quick assessment of the strength of a relationship. A more precise way to interpret a correlation coefficient is to compute the coefficient of determination.
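The guidelines above can be written as a small lookup. This is a sketch under one stated assumption: the table's bands share their endpoints (.8 appears in two rows), so here a boundary value is assigned to the stronger band. The function name `interpret_r` is hypothetical.

```python
def interpret_r(r):
    """Return the general interpretation for a correlation coefficient.

    The sign is ignored because strength depends on the absolute value.
    Assumption: a boundary value such as .8 goes to the stronger band.
    """
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation must lie between -1 and +1")
    if size >= 0.8:
        return "Very strong relationship"
    if size >= 0.6:
        return "Strong relationship"
    if size >= 0.4:
        return "Moderate relationship"
    if size >= 0.2:
        return "Weak relationship"
    return "Weak or no relationship"


print(interpret_r(-0.70))  # Strong relationship
print(interpret_r(0.50))   # Moderate relationship
```

Note that -.70 is classed as stronger than +.50, because only the absolute value matters for strength.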
The Coefficient of Determination
The percentage of variance in one variable that is accounted for by the variance in the other variable. To determine exactly how much of the variance in one variable can be accounted for by the variance in another variable, the coefficient of determination is computed by squaring the correlation coefficient.
For Example
If the correlation between your test results and the amount of time you spend studying is .70, then the coefficient of determination, represented by r², is .70² or .49. This means that 49% of the variance in your test results can be explained by the variance in your study time. The stronger the correlation, the more variance can be explained.
However
If 49% of the variance can be explained, this means that 51% cannot, even with a strong correlation of .70. The amount of unexplained variance is called the coefficient of alienation or coefficient of nondetermination.
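The arithmetic behind explained and unexplained variance can be sketched as follows. The function name `shared_variance` is hypothetical; it simply squares r and subtracts the result from 1, following the definitions used in this lecture.

```python
def shared_variance(r):
    """Return (explained, unexplained) proportions of variance for a given r.

    explained   = r squared (coefficient of determination)
    unexplained = 1 - r squared (called the coefficient of alienation here)
    """
    explained = r ** 2
    unexplained = 1 - explained
    return explained, unexplained


for r in (0.0, 0.5, 0.7, 0.9):
    explained, unexplained = shared_variance(r)
    print(f"r = {r:.2f}: {explained:.0%} explained, {unexplained:.0%} unexplained")
```

For r = .70 this gives 49% explained and 51% unexplained, matching the example above.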
How variables share variance and the resulting correlation
Correlation    Coefficient of determination    Variance shared by Variable X and Variable Y
rxy = 0        r²xy = 0                        0% shared
rxy = .5       r²xy = .25 or 25%               25% shared
rxy = .9       r²xy = .81 or 81%               81% shared
(Diagrams: each row is illustrated by two circles, one for X and one for Y, that overlap by the shared percentage.)
In Essence
- The first diagram shows two circles that do not touch. They do not touch because the variables share nothing in common. The correlation is zero.
- The second diagram shows two circles that overlap. With a correlation of .5 and r² of .25, they share about 25% of the variance between them.
- The third diagram shows two circles placed almost one on top of the other. With an almost perfect correlation of r = .9 and r² = .81, they share 81% of the variance between them.
How to Report Correlation Coefficients
Reporting correlation coefficients is straightforward. You have to say how big they are and what their significance value was.
Note
- There should be no zero before the decimal point for the correlation coefficient or the probability value, because they do not exceed 1.
- Coefficients are reported to two decimal places.
- If you are quoting a one-tailed probability, you should say so.
- Each type of correlation coefficient is represented by a different letter.
- There are standard criteria of probabilities that we use (.05, .01 and .001).
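These reporting conventions can be sketched as a small formatter. The function names `format_r` and `report_correlation` are hypothetical, and the sketch assumes the probability is reported against one of the standard criteria (.05, .01, .001).

```python
def format_r(r):
    """Format a coefficient to two decimal places with no leading zero."""
    text = f"{r:.2f}"
    if text.startswith("0."):
        return text[1:]            # "0.40" -> ".40"
    if text.startswith("-0."):
        return "-" + text[2:]      # "-0.44" -> "-.44"
    return text                    # "1.00" or "-1.00"


def report_correlation(r, criterion, tails):
    """Build a report string such as 'r = .83, p (one-tailed) < .05'."""
    p_text = str(criterion).lstrip("0")  # "0.05" -> ".05"
    return f"r = {format_r(r)}, p ({tails}-tailed) < {p_text}"


print(report_correlation(0.83, 0.05, "one"))   # r = .83, p (one-tailed) < .05
print(report_correlation(0.397, 0.01, "two"))  # r = .40, p (two-tailed) < .01
```

Stating the number of tails explicitly in every report keeps the one-tailed case from being overlooked.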
Examples
- There was a significant relationship between the number of hours studied and students' examination results, r = .83, p (one-tailed) < .05.
- Exam performance was significantly correlated with exam anxiety, r = -.44, and time spent revising, r = .40; the time spent revising was also correlated with exam anxiety, r = -.71 (all ps < .01).
                    Exam performance    Exam anxiety    Revision time
Exam performance    1                   -.44***         .40***
Exam anxiety                            1               -.71***
ns = not significant (p > .05), *p < .05, **p < .01, ***p < .001
Summary
- Correlation is a technique that gives an indication of the linear association between two variables. It tells us the strength of the association.
- Correlations are standardised; they can only range from -1.0 to +1.0.
- The coefficient of determination (r²) tells us the percentage of variance in the dependent variable (Y) that is explained by the independent variable (X).
Practice time
Class activity on handout.