Published by Erika Ramsey. Modified over 9 years ago.
1
Psych 230: Psychological Measurement and Statistics
Pedro Wolf
September 23, 2009
2
Correlation
3
Sometimes our research questions are concerned with finding the relationship between two variables. Usually, these questions seek to observe the variables as they exist naturally in the world: the researcher is not trying to manipulate anything, only observing what occurs. Often this type of research does not allow easy definition of ‘levels’ of the independent variable.
4
Correlation
– Is coffee drinking related to nervousness?
– Is sugar consumption related to hyperactivity in children?
– Are beer and coffee sales related to temperature?
These types of questions are suited to a statistical technique known as correlation analysis.
5
Statistical Testing
1. Decide which test to use
2. State the hypotheses (H0 and H1)
3. Calculate the obtained value
4. Calculate the critical value (size of α)
5. Make our conclusion
6
Statistical Testing
1. Decide which test to use
2. State the hypotheses (H0 and H1)
3. Calculate the obtained value - calculate r
4. Calculate the critical value (size of α)
5. Make our conclusion
7
Characteristics of Correlation Analyses - 1
With correlational data, we don’t calculate a mean score for each condition: we don’t figure out mean beer sales in January, February, March and so on. Instead, the correlation coefficient [r] summarizes the entire relationship.
8
Characteristics of Correlation Analyses - 2
We always examine the relationship between pairs of scores:
– sugar consumption and hyperactivity
– age and income
– beer sales and temperature
So, N is the number of pairs of scores in the data.
9
Characteristics of Correlation Analyses - 3
Neither variable is called the independent or dependent variable:
– sugar consumption and hyperactivity
– age and income
– beer sales and temperature
10
Characteristics of Correlation Analyses - 4
We graph the scores differently in correlational research: we use a scatterplot to visualize our data. A scatterplot is a graph that shows the location of each data point formed by a pair of X-Y scores. When a relationship exists, a particular value of Y tends to be paired with one value of X, and another value of Y tends to be paired with a different value of X.
11
Characteristics of Correlation Analyses - 5
Correlation is not causation. Just because we observe a relationship between two variables does not mean that changes in one of the variables cause changes in the other.
– Television watching and aggression
12
Scatterplot
Coffee  Nervousness
1       1
1       2
2       2
2       3
3       4
3       5
4       5
4       6
5       8
5       9
6       9
6       10
13
Scatterplot (figure; variables: Coffee, Nervousness)
14
Relationships
Two aspects of relationships:
Type of relationship
– shape
– direction
Strength of relationship
– correlation coefficient
– test of significance
15
Types of Relationship
The type of relationship in a dataset can be thought of as the overall direction in which the scores on Y change as the X scores change: does knowing about variable 1 help you know something about variable 2?
There are two main types of relationship:
– Linear
– Nonlinear
16
Linear Relationships
A linear relationship forms a pattern on a scatterplot that fits a straight line. In a positive linear relationship, as the scores on the X variable increase, the scores on the Y variable also tend to increase. In a negative linear relationship, as the scores on the X variable increase, the scores on the Y variable tend to decrease.
17
Linear Relationship
18
Linear Relationships
Positive relationship: more X tends to be paired with more Y
Negative relationship: more X tends to be paired with less Y
What is the relationship between study time and test scores?
What is the relationship between hours of TV watched and hours slept?
19
Positive Linear Relationship
20
Negative Linear Relationship
21
Nonlinear Relationships
A nonlinear relationship does not fit a straight line.
What is the relationship between stress and exam performance?
– Low stress levels: suboptimal
– High stress levels: suboptimal
– Moderate stress levels: optimal performance
Common shapes of nonlinear relationships are U-shaped and inverted U-shaped.
22
Nonlinear Relationship
23
Examples
1.
X    Y
69   66
63   71
64   70
65   70
64   75
62   70
68   72
74   68
63   72
65   75
2.
X    Y
6    3
3    2
6    4
2    2
6    3
3    1
5    2
7    3
2    1
4    1
3.
X    Y
40   3
30   2.6
10   3.2
15   3.8
40   3.7
45   2.8
50   3.4
20   2
15   3.3
25   3.8
4.
X    Y
64   7
65   7.5
69   10
63   7.5
64   7
65   10
64   7
62   6.5
68   9
74   12
5.
X    Y
20   5
25   15
35   20
40   30
50   45
55   40
70   20
80   20
90   10
95   10
24
1: Relationship?
25
Negative Linear Relationship (scatterplot; axes: Mother’s height (inches), Father’s height (inches))
26
2: Relationship?
27
Positive Linear Relationship (scatterplot; axes: Excited about Course (0-10), Willing to ask question (0-7))
28
3: Relationship?
29
No Relationship (scatterplot; axes: Last Haircut ($), GPA)
30
4: Relationship?
31
Positive Linear Relationship (scatterplot; axes: Height (inches), Shoe size)
32
5: Relationship?
33
Nonlinear Relationship (scatterplot; axes: X, Y)
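The fifth example shows why the shape of a relationship matters before a linear statistic is trusted. The sketch below is not part of the original slides; it assumes the X-Y pairs for example 5 as read from the examples slide, and the helper name `pearson_r` is hypothetical. It computes Pearson's r from deviation scores for the whole dataset and then separately for the rising and falling halves of the inverted U.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviation scores: SP / sqrt(SSx * SSy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Example 5 pairs, as read from the examples slide (assumed parsing).
x5 = [20, 25, 35, 40, 50, 55, 70, 80, 90, 95]
y5 = [5, 15, 20, 30, 45, 40, 20, 20, 10, 10]

r_all = pearson_r(x5, y5)              # whole inverted-U pattern
r_rising = pearson_r(x5[:5], y5[:5])   # X from 20 to 50
r_falling = pearson_r(x5[4:], y5[4:])  # X from 50 to 95
print(round(r_all, 2), round(r_rising, 2), round(r_falling, 2))
```

Over the full range r comes out close to zero even though Y clearly depends on X, while each half on its own gives a strong linear correlation of opposite sign. A weak r therefore means a weak linear relationship, not necessarily no relationship.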
34
Strength of the Relationship
The strength of a linear relationship is the degree to which one value of Y is consistently paired with one and only one value of X. We measure the strength of the relationship with the correlation coefficient: r. r can vary between -1 and +1. The larger the absolute value of the correlation coefficient, the stronger the relationship. The sign of the correlation coefficient indicates the direction of a linear relationship:
– negative: negative relationship
– positive: positive relationship
35
Strength of the Relationship
The strength of a linear relationship is the degree to which one value of Y is consistently paired with one and only one value of X.
36
Strength of the Relationship
Describe the relationships between the variables which have the following correlations:
A and B: r = 0.05
C and D: r = -0.73
E and F: r = 0.96
G and H: r = 0.39
I and J: r = -0.16
37
Strength of the Relationship
Describe the relationships between the variables which have the following correlations (in terms of strong versus weak, positive versus negative):
A and B: r = 0.05: none
C and D: r = -0.73: strong negative
E and F: r = 0.96: strong positive
G and H: r = 0.39: moderate positive
I and J: r = -0.16: weak negative
38
Strength of the Relationship Estimate the correlation of the following relationships:
39
Strength of the Relationship
Estimate the correlation of the following relationships (two scatterplots): r ≈ +0.90; r ≈ 0.00
40
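What is r?
The Pearson product-moment correlation coefficient: $r = \frac{\sum z_x z_y}{N}$
Z-scores tell us about distance from the mean. When z-scores are computed with the population standard deviation (dividing by N), which is what this formula assumes, the sum of squared z-scores for a variable is equal to N.
x = 1, 5, 6, 7, 8, 9
z_x = -1.7677670, -0.3535534, 0.0000000, 0.3535534, 0.7071068, 1.0606602
z_x² = 3.125, 0.125, 0.000, 0.125, 0.500, 1.125
Σz_x² = 5 = N - 1 here, because these particular z-scores were computed with the sample standard deviation (dividing by N - 1); with the population standard deviation the sum would be N = 6.
Therefore, the more closely each z_x matches its paired z_y, the closer to one the correlation will be. If one member of a pair is negative and the other is positive, their product pushes the correlation in the negative direction. If both are negative or both are positive, their product pushes it in the positive direction.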
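The z-score definition can be checked numerically. The following is a minimal sketch rather than anything from the original slides: the function names are hypothetical, and the Y scores are invented purely for illustration. It uses the six x scores from this slide to show that squared z-scores sum to N with the population standard deviation and to N - 1 with the sample standard deviation, then applies r = Σ(z_x z_y) / N.

```python
from statistics import mean, pstdev, stdev

def z_scores(values, population=True):
    """Standardize values using the population (N) or sample (N - 1) SD."""
    m = mean(values)
    sd = pstdev(values) if population else stdev(values)
    return [(v - m) / sd for v in values]

def r_from_z(xs, ys):
    """r = sum(z_x * z_y) / N, using population z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

x = [1, 5, 6, 7, 8, 9]  # the six scores from the slide
sum_sq_pop = sum(z * z for z in z_scores(x, population=True))
sum_sq_sample = sum(z * z for z in z_scores(x, population=False))
print(round(sum_sq_pop, 6), round(sum_sq_sample, 6))

# Hypothetical Y scores (not from the slides): an exact linear function of X,
# so every z_y equals its paired z_x.
y = [2 * v + 1 for v in x]
print(round(r_from_z(x, y), 6))
```

Because every z_y equals its paired z_x in the illustrative data, the sum of products equals the sum of squared z-scores, and dividing by N gives the maximum correlation.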
41
Calculating R
To measure the strength of a linear relationship, we will use the Pearson correlation coefficient [r]; this will be the obtained value for the statistical test.
The computational formula for the correlation coefficient is:
$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$
42
Calculating R
Calculate the correlation coefficient for the following dataset:
X   Y
1   8
2   6
3   6
4   5
5   1
6   3
43
Calculating R
Calculate the correlation coefficient for the following dataset:
X   X²   Y   Y²   XY
1    1   8   64    8
2    4   6   36   12
3    9   6   36   18
4   16   5   25   20
5   25   1    1    5
6   36   3    9   18
ΣX = 21; ΣX² = 91; ΣY = 29; ΣY² = 171; ΣXY = 81
44
Calculating R
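The worked calculation for this slide did not survive in the transcript. As a stand-in, here is a small sketch (the function name is hypothetical, not from the slides) that applies the raw-score computational formula to the six X-Y pairs from the preceding slides. It assumes the same formula that the later "Your Turn" slides use: N·ΣXY minus ΣX·ΣY, divided by the square root of [N·ΣX² - (ΣX)²][N·ΣY² - (ΣY)²].

```python
from math import sqrt

def pearson_r_computational(xs, ys):
    """Pearson r from raw-score sums (the computational formula)."""
    n = len(xs)  # N = number of X-Y pairs
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# The six pairs from the "Calculating R" slides.
x = [1, 2, 3, 4, 5, 6]
y = [8, 6, 6, 5, 1, 3]
r = pearson_r_computational(x, y)
print(round(r, 3))
```

With the sums from the table (ΣX = 21, ΣX² = 91, ΣY = 29, ΣY² = 171, ΣXY = 81) the numerator is 486 - 609 = -123 and the denominator is √(105 × 185), so r comes out at roughly -0.88: a strong negative linear relationship.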
45
Your Turn
Calculate r for the following dataset:
X   Y
6   3
3   2
6   4
2   2
6   3
3   1
5   2
7   3
2   1
4   1
46
Your Turn
X   X²   Y   Y²   XY
6   36   3    9   18
3    9   2    4    6
6   36   4   16   24
2    4   2    4    4
6   36   3    9   18
3    9   1    1    3
5   25   2    4   10
7   49   3    9   21
2    4   1    1    2
4   16   1    1    4
ΣX = 44; ΣX² = 224; ΣY = 22; ΣY² = 58; ΣXY = 110
47
Your Turn
ΣX = 44; ΣX² = 224; ΣY = 22; ΣY² = 58; ΣXY = 110; N = 10
$r = \frac{10(110) - (44)(22)}{\sqrt{[10(224) - (44)^2][10(58) - (22)^2]}}$
$r = \frac{1100 - 968}{\sqrt{[2240 - 1936][580 - 484]}}$
$r = \frac{132}{\sqrt{(304)(96)}} = \frac{132}{\sqrt{29184}} = \frac{132}{170.833} = 0.773$
48
Positive Linear Relationship (scatterplot; axes: Excited about Course (0-10), Willing to ask question (0-7)); r = +0.773
49
Statistically testing correlations
The correlation coefficient [r] tells us something about the strength and direction of the linear relationship. But we often want to know whether this relationship could have happened by chance or whether it is a real, significant relationship.
– We have a correlation coefficient of +0.773 for the relationship between excitement about the class and willingness to ask questions
– Does this indicate a real relationship? What are the chances that this could have happened by fluke?
50
Statistical Testing
1. Decide which test to use
2. State the hypotheses (H0 and H1)
3. Calculate the obtained value
4. Calculate the critical value (size of α)
5. Make our conclusion
51
Statistical Testing
1. Decide which test to use
2. State the hypotheses (H0 and H1)
3. Calculate the obtained value - calculate r
4. Calculate the critical value (size of α)
5. Make our conclusion
52
1. Decide which test to use
Are we looking for the relationship between variables?
– Yes: Use the Correlation test
53
2. State the Hypotheses
Though we are testing samples, again, we are really interested in the total population. The population correlation is described by ρ (rho). The null hypothesis (H0) always states that there is no relationship between the variables.
H0: ρ = 0 (excitement about course is not related to willingness to ask questions)
H1: ρ ≠ 0 (excitement about course is related to willingness to ask questions)
54
Plotting the correlation (figure; x-axis: values of correlation coefficient)
55
r crit and r obt (figure; x-axis: values of correlation coefficient): r crit = ±0.67, r obt = -0.78
56
r crit and r obt (figure; x-axis: values of correlation coefficient): r crit = ±0.67, r obt = +0.33
57
3. Calculate r obt We calculate r obt using the formula: r obt = +0.773
58
4. Calculate the critical value
Assume α = 0.05. We are looking for any relationship (positive or negative), therefore it will be a two-tailed test.
df = N - 2 (where N is the number of pairs in the data)
df = (10 - 2) = 8
Look up Table 3 – critical values of the Pearson Correlation Coefficient: the r-tables
Two-tailed Test
df   α = .05   α = .01
8    0.632     0.765
r crit = ±0.632
59
r crit and r obt (figure; x-axis: values of correlation coefficient): r crit = ±0.632, r obt = +0.773
60
5. Make our Conclusion
r crit = ±0.632; r obt = +0.773
As r obt is inside the rejection region, we reject H0 and accept H1. We conclude that there is a significant positive relationship between excitement about a course and a willingness to ask questions in it (p < 0.05).
61
Significance and Importance
We conclude that there is a significant positive relationship between excitement about a course and a willingness to ask questions in it (p < 0.05). How important is this finding? What proportion of the variability in people’s willingness to ask questions is related to excitement about the course (or vice versa)? We can answer this with the effect size, r²:
r = 0.773
r² = 0.598 – around 60%
62
Your Turn
A researcher asks if there is a relationship between the number of errors on a statistics exam and the person’s level of satisfaction with the course. Is there a significant relationship between these variables? Is it important?
Errors  Satisfaction
9       3
8       2
4       8
6       5
7       4
10      2
5       7
63
1. Decide which test to use
Are we looking for the relationship between variables?
– Yes: Use the Correlation test
64
2. State the Hypotheses
H0: ρ = 0 (there is no relationship between errors made on the exam and satisfaction with the course)
H1: ρ ≠ 0 (there is a relationship between errors made on the exam and satisfaction with the course)
65
3. Calculate r obt
67
X    X²    Y   Y²   XY
9     81   3    9   27
8     64   2    4   16
4     16   8   64   32
6     36   5   25   30
7     49   4   16   28
10   100   2    4   20
5     25   7   49   35
ΣX = 49; ΣX² = 371; ΣY = 31; ΣY² = 171; ΣXY = 188
68
Your Turn
ΣX = 49; ΣX² = 371; (ΣX)² = 2401; ΣY = 31; ΣY² = 171; (ΣY)² = 961; ΣXY = 188; N = 7
$r = \frac{7(188) - (49)(31)}{\sqrt{[7(371) - 2401][7(171) - 961]}}$
$r = \frac{1316 - 1519}{\sqrt{[2597 - 2401][1197 - 961]}}$
$r = \frac{-203}{\sqrt{(196)(236)}} = \frac{-203}{\sqrt{46256}} = \frac{-203}{215.072} = -0.94$
69
4. Calculate the critical value
Assume α = 0.05. We are looking for any relationship (positive or negative), therefore it will be a two-tailed test.
df = N - 2 (where N is the number of pairs in the data)
df = (7 - 2) = 5
Look up Table 3 – critical values of the Pearson Correlation Coefficient: the r-tables
Two-tailed Test
df   α = .05   α = .01
5    0.754     0.874
r crit = ±0.754
70
r crit and r obt (figure; x-axis: values of correlation coefficient): r crit = ±0.754, r obt = -0.94
71
5. Make our Conclusion
r crit = ±0.754; r obt = -0.94
As r obt is inside the rejection region, we reject H0 and accept H1. We conclude that there is a significant negative relationship between errors made on a test and satisfaction with the course (p < 0.05): the more errors made, the less satisfaction.
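The decision rule can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's own procedure: the slides read r crit from a printed r-table, whereas the sketch derives it from a critical t using the standard identity r = t / √(t² + df). The critical t of 2.571 (two-tailed, α = .05, df = 5) is taken from a standard t-table, and the function names are hypothetical.

```python
from math import sqrt

def critical_r(t_crit, df):
    """Convert a two-tailed critical t into the matching critical r."""
    return t_crit / sqrt(t_crit ** 2 + df)

def reject_null(r_obt, r_crit):
    """Two-tailed decision: reject H0 when |r_obt| exceeds r_crit."""
    return abs(r_obt) > r_crit

n_pairs = 7          # errors/satisfaction pairs
df = n_pairs - 2     # degrees of freedom for a correlation
t_crit = 2.571       # assumed: two-tailed t, alpha = .05, df = 5 (standard t-table)
r_crit = critical_r(t_crit, df)
r_obt = -0.94

print(round(r_crit, 3), reject_null(r_obt, r_crit))
```

The derived critical value agrees with the tabled 0.754 to within rounding, and because |-0.94| is larger, H0 is rejected. Taking the absolute value of r obt is what makes the comparison two-tailed: a strong correlation in either direction lands in a rejection region.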
72
Significance and Importance
We conclude that there is a significant negative relationship between errors made on a test and satisfaction with the course (p < 0.05).
Importance – effect size, r²:
r = -0.94
r² = 0.88 – around 88% of the differences in satisfaction scores are related to the errors made on the exam
73
Homework Chapter 8: 2, 6, 8