Psych 230 Psychological Measurement and Statistics Pedro Wolf September 23, 2009.


1 Psych 230 Psychological Measurement and Statistics Pedro Wolf September 23, 2009

2 Correlation

3 Sometimes our research questions are concerned with finding the relationship between two variables Usually, these questions seek to observe these variables as they exist naturally in the world – the researcher is not trying to manipulate, but is observing what occurs Often this type of research does not allow easy definition of ‘levels’ of the independent variable

4 Correlation Is coffee drinking related to nervousness? Is sugar consumption related to hyperactivity in children? Are beer and coffee sales related to temperature? These types of questions are suited to a statistical technique known as correlation analysis

5 Statistical Testing
  1. Decide which test to use
  2. State the hypotheses (H0 and H1)
  3. Calculate the obtained value
  4. Calculate the critical value (size of α)
  5. Make our conclusion

6 Statistical Testing
  1. Decide which test to use
  2. State the hypotheses (H0 and H1)
  3. Calculate the obtained value - calculate r
  4. Calculate the critical value (size of α)
  5. Make our conclusion

7 Characteristics of Correlation Analyses - 1 With correlational data, we don’t calculate a mean score for each condition – we don’t figure out mean beer sales in January, February, March and so on Instead, the correlation coefficient [r] summarizes the entire relationship

8 Characteristics of Correlation Analyses - 2 We always examine the relationship between pairs of scores – sugar consumption and hyperactivity – age and income – beer sales and temperature So, N is the number of pairs of scores in the data

9 Characteristics of Correlation Analyses - 3 Neither variable is called the independent or dependent – sugar consumption and hyperactivity – age and income – beer sales and temperature

10 Characteristics of Correlation Analyses - 4 We graph the scores differently in correlational research – we use a scatterplot to visualize our data A scatterplot is a graph that shows the location of each data point formed by a pair of X-Y scores When a relationship exists, a particular value of Y tends to be paired with one value of X and another value of Y tends to be paired with a different value of X

11 Characteristics of Correlation Analyses - 5 Correlation is not causation. Observing a relationship between two variables does not mean that changes in one of the variables cause changes in the other – Television watching and aggression

12 Scatterplot
  Coffee  Nervousness
    1         1
    1         2
    2         2
    2         3
    3         4
    3         5
    4         5
    4         6
    5         8
    5         9
    6         9
    6        10

13 Scatterplot [figure: the coffee and nervousness scores above plotted as a scatterplot]

14 Relationships Two aspects of relationships: Type of relationship – shape – direction Strength of relationship – correlation coefficient – test of significance

15 Types of Relationship The type of relationship in a dataset can be thought of as the overall direction in which the scores on Y change as the X scores change – does knowing about variable 1 help you know something about variable 2? There are two main types of relationship – Linear – Nonlinear

16 Linear Relationships A linear relationship forms a pattern on a scatterplot that fits a straight line In a positive linear relationship, as the scores on the X variable increase, the scores on the Y variable also tend to increase In a negative linear relationship, as the scores on the X variable increase, the scores on the Y variable tend to decrease

17 Linear Relationship

18 Linear Relationships Positive relationship: more X tends to go with more Y Negative relationship: more X tends to go with less Y What is the relationship between study time and test scores? What is the relationship between hours of TV watched and hours slept?

19 Positive Linear Relationship

20 Negative Linear Relationship

21 Nonlinear Relationships A nonlinear relationship does not fit a straight line What is the relationship between stress and exam performance? – Low stress levels: suboptimal – High stress levels: suboptimal – Moderate stress levels: optimal performance Common shapes of nonlinear relationships are U-shaped and inverted U-shaped

22 Nonlinear Relationship

23 Examples
  1. X: 69 63 64 65 64 62 68 74 63 65
     Y: 66 71 70 70 75 70 72 68 72 75
  2. X: 6 3 6 2 6 3 5 7 2 4
     Y: 3 2 4 2 3 1 2 3 1 1
  3. X: 40 30 10 15 40 45 50 20 15 25
     Y: 3 2.6 3.2 3.8 3.7 2.8 3.4 2 3.3 3.8
  4. X: 64 65 69 63 64 65 64 62 68 74
     Y: 7 7.5 10 7.5 7 10 7 6.5 9 12
  5. X: 20 25 35 40 50 55 70 80 90 95
     Y: 5 15 20 30 45 40 20 20 10 10

24 1: Relationship?

25 Negative Linear Relationship [scatterplot: Mother’s height (inches) and Father’s height (inches)]

26 2: Relationship?

27 Positive Linear Relationship [scatterplot: Excited about Course (0-10) and Willing to ask question (0-7)]

28 3: Relationship?

29 No Relationship [scatterplot: Last Haircut ($) and GPA]

30 4: Relationship?

31 Positive Linear Relationship [scatterplot: Height (inches) and Shoe size]

32 5: Relationship?

33 Nonlinear Relationship [scatterplot: X and Y]
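
A linear correlation coefficient can look weak even when a nonlinear relationship is strong. The sketch below runs dataset 5 from the Examples slide through a deviation-score form of Pearson's r. That form is algebraically equivalent to the computational formula used elsewhere in these slides. The function name `pearson_r` and the choice to split the data at its midpoint are illustrative assumptions, not part of the lecture.

```python
from math import sqrt

# Dataset 5 from the Examples slide (inverted-U shape).
x = [20, 25, 35, 40, 50, 55, 70, 80, 90, 95]
y = [5, 15, 20, 30, 45, 40, 20, 20, 10, 10]


def pearson_r(xs, ys):
    """Pearson r from deviation scores: sum of cross-products over
    the square root of the product of the two sums of squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    ss_y = sum((b - mean_y) ** 2 for b in ys)
    return cross / sqrt(ss_x * ss_y)


# r over the whole dataset is small because the rise and the fall cancel out.
r_all = pearson_r(x, y)

# Assumption for illustration: split the data at its midpoint to look at
# the rising and falling halves separately.
r_rising = pearson_r(x[:5], y[:5])
r_falling = pearson_r(x[5:], y[5:])

print(round(r_all, 2), round(r_rising, 2), round(r_falling, 2))
```

Each half is close to a straight line, so r is large within each half and small overall. This is why the shape of the scatterplot should be checked before r is interpreted.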

34 Strength of the Relationship
  The strength of a linear relationship is the degree to which one value of Y is consistently paired with one and only one value of X
  We measure the strength of the relationship with the correlation coefficient: r
  r can vary between -1 and +1
  The larger the absolute value of the correlation coefficient, the stronger the relationship
  The sign of the correlation coefficient indicates the direction of a linear relationship
    – negative: negative relationship
    – positive: positive relationship

35 Strength of the Relationship The strength of a linear relationship is the degree to which one value of Y is consistently paired with one and only one value of X

36 Strength of the Relationship
  Describe the relationships between the variables which have the following correlations:
    A and B: r = 0.05
    C and D: r = -0.73
    E and F: r = 0.96
    G and H: r = 0.39
    I and J: r = -0.16

37 Strength of the Relationship
  Describe the relationships between the variables which have the following correlations (in terms of strong vs. weak, positive vs. negative):
    A and B: r = 0.05    none
    C and D: r = -0.73   strong negative
    E and F: r = 0.96    strong positive
    G and H: r = 0.39    moderate positive
    I and J: r = -0.16   weak negative
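
The verbal labels above can be sketched as a small function. The slides give no cutoffs, so the thresholds below (0.10, 0.30, 0.60) are assumptions chosen only so that the five answers on this slide come out the same. They are not a standard, and `describe_correlation` is a hypothetical helper name.

```python
def describe_correlation(r, none_below=0.10, weak_below=0.30, strong_from=0.60):
    """Turn a correlation coefficient into a rough verbal label.

    The cutoffs are illustrative assumptions, not values from the lecture.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength depends on the absolute value only
    if size < none_below:
        return "none"
    direction = "positive" if r > 0 else "negative"  # the sign gives direction
    if size < weak_below:
        strength = "weak"
    elif size < strong_from:
        strength = "moderate"
    else:
        strength = "strong"
    return f"{strength} {direction}"


correlations = {
    "A and B": 0.05,
    "C and D": -0.73,
    "E and F": 0.96,
    "G and H": 0.39,
    "I and J": -0.16,
}
descriptions = {pair: describe_correlation(r) for pair, r in correlations.items()}
for pair, label in descriptions.items():
    print(pair, label)
```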

38 Strength of the Relationship Estimate the correlation of the following relationships:

39 Strength of the Relationship Estimate the correlation of the following relationships: [two scatterplots] r approx +0.90 and r approx 0.00

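40 What is r?
  The Pearson product-moment correlation coefficient: r = (ΣZxZy) / N, where Zx and Zy are z-scores computed with the standard deviation that divides by N
  Z-scores tell us about distance from the mean
  The sum of squared z-scores for a variable equals N when the standard deviation divides by N, and N - 1 when it divides by N - 1
  Example using the N - 1 standard deviation: x = 1, 5, 6, 7, 8, 9 (N = 6)
    zx  = -1.7677670 -0.3535534 0.0000000 0.3535534 0.7071068 1.0606602
    zx² = 3.125 0.125 0.000 0.125 0.500 1.125
    Σzx² = 5 = N - 1
  Therefore, the closer each Zx is to its paired Zy, the closer to one the correlation will be
  If one of them is negative and the other is positive, their product is negative and pulls r toward a negative correlation
  If both are negative or both are positive, their product is positive and pulls r toward a positive correlation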
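
The z-score definition can be checked numerically. The sketch below assumes z-scores are computed with the standard deviation that divides by N (`statistics.pstdev`), which is the convention under which the squared z-scores sum to N and r = ΣZxZy / N holds exactly. The helper names are illustrative.

```python
from statistics import mean, pstdev


def z_scores(values):
    """z-scores using the standard deviation that divides by N."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def pearson_r_from_z(xs, ys):
    """r as the mean cross-product of paired z-scores."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


x = [1, 5, 6, 7, 8, 9]

# With the divide-by-N standard deviation, squared z-scores sum to N.
sum_sq = sum(z * z for z in z_scores(x))

# Identical z-scores in every pair give r = +1; mirrored ones give r = -1.
r_same = pearson_r_from_z(x, x)
r_mirror = pearson_r_from_z(x, [-v for v in x])

print(round(sum_sq, 6), round(r_same, 6), round(r_mirror, 6))
```

If the z-scores are instead computed with the N - 1 standard deviation, the squared z-scores sum to N - 1 and the cross-products must be divided by N - 1 to give the same r.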

41 Calculating R
  To measure the strength of a linear relationship, we will use the Pearson correlation coefficient [r] – this will be the obtained value for the statistical test
  The computational formula for the correlation coefficient is:
    r = [NΣXY - (ΣX)(ΣY)] / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}

42 Calculating R
  Calculate the correlation coefficient for the following dataset:
    X  Y
    1  8
    2  6
    3  6
    4  5
    5  1
    6  3

43 Calculating R
  Calculate the correlation coefficient for the following dataset:
     X   X²   Y   Y²   XY
     1    1   8   64    8
     2    4   6   36   12
     3    9   6   36   18
     4   16   5   25   20
     5   25   1    1    5
     6   36   3    9   18
  ΣX = 21  ΣX² = 91  ΣY = 29  ΣY² = 171  ΣXY = 81

44 Calculating R
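
The six-pair dataset from the two preceding slides can be run through the computational formula, r = [NΣXY - (ΣX)(ΣY)] / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}. This is a minimal sketch, and the function name `pearson_r` is an illustrative choice.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the five sums used in the computational formula."""
    n = len(xs)  # N is the number of X-Y pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


x = [1, 2, 3, 4, 5, 6]
y = [8, 6, 6, 5, 1, 3]

# By hand: (6*81 - 21*29) / sqrt((6*91 - 441) * (6*171 - 841))
#        = -123 / sqrt(105 * 185) = -123 / sqrt(19425)
r = pearson_r(x, y)
print(round(r, 3))  # -0.883
```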

45 Your Turn
  Calculate R for the following dataset
    X  Y
    6  3
    3  2
    6  4
    2  2
    6  3
    3  1
    5  2
    7  3
    2  1
    4  1

46 Your Turn
     X   X²   Y   Y²   XY
     6   36   3    9   18
     3    9   2    4    6
     6   36   4   16   24
     2    4   2    4    4
     6   36   3    9   18
     3    9   1    1    3
     5   25   2    4   10
     7   49   3    9   21
     2    4   1    1    2
     4   16   1    1    4
  ΣX = 44  ΣX² = 224  ΣY = 22  ΣY² = 58  ΣXY = 110

47 Your Turn
  ΣX = 44; ΣX² = 224; ΣY = 22; ΣY² = 58; ΣXY = 110; N = 10
  r = [10(110) - (44)(22)] / √{[10(224) - (44)²][10(58) - (22)²]}
  r = (1100 - 968) / √{[2240 - 1936][580 - 484]}
  r = 132 / √{[304][96]}
  r = 132 / √29184
  r = 132 / 170.833
  r = 0.773

48 Positive Linear Relationship [scatterplot: Excited about Course (0-10) and Willing to ask question (0-7)] r = +0.773

49 Statistically testing correlations The correlation coefficient [r] tells us something about the strength and direction of the linear relationship But, we often want to know whether this relationship could have happened by chance or whether it is a real, significant, relationship – we have a correlation coefficient of +0.773 for the relationship between excitement about the class and willingness to ask questions – does this indicate a real relationship? What are the chances that this could have happened by fluke?

50 Statistical Testing
  1. Decide which test to use
  2. State the hypotheses (H0 and H1)
  3. Calculate the obtained value
  4. Calculate the critical value (size of α)
  5. Make our conclusion

51 Statistical Testing
  1. Decide which test to use
  2. State the hypotheses (H0 and H1)
  3. Calculate the obtained value - calculate r
  4. Calculate the critical value (size of α)
  5. Make our conclusion

52 1. Decide which test to use Are we looking for the relationship between variables? – Yes: Use the Correlation test

53 2. State the Hypotheses
  Though we are testing samples, again, we are really interested in the total population
  The population correlation is described by ρ (rho)
  The null hypothesis (H0) always states that there is no relationship between the variables
  H0: ρ = 0 excitement about course is not related to willingness to ask questions
  H1: ρ ≠ 0 excitement about course is related to willingness to ask questions

54 Plotting the correlation [figure; x-axis: values of correlation coefficient]

55 r crit and r obt [figure: r crit = -0.67 and r crit = +0.67, r obt = -0.78; x-axis: values of correlation coefficient]

56 r crit and r obt [figure: r crit = -0.67 and r crit = +0.67, r obt = +0.33; x-axis: values of correlation coefficient]

57 3. Calculate r obt We calculate r obt using the formula: r obt = +0.773

58 4. Calculate the critical value
  Assume α = 0.05
  We are looking for any relationship (positive or negative), therefore it will be a two-tailed test
  df = N - 2 (where N is the number of pairs in the data)
  df = (10 - 2) = 8
  Look up Table 3 – critical values of the Pearson Correlation Coefficient: the r-tables
    Two-tailed Test
    df   α = .05   α = .01
    8    0.632     0.765
  r crit = ±0.632

59 r crit and r obt [figure: r crit = -0.632 and r crit = +0.632, r obt = +0.773; x-axis: values of correlation coefficient]

60 5. Make our Conclusion
  r crit = ±0.632
  r obt = +0.773
  As r obt is inside the rejection region, we reject H0 and accept H1
  We conclude that there is a significant positive relationship between excitement about a course and a willingness to ask questions in it (p < 0.05)

61 Significance and Importance
  We conclude that there is a significant positive relationship between excitement about a course and a willingness to ask questions in it (p < 0.05)
  How important is this finding?
  What proportion of the variability in people’s willingness to ask questions is related to excitement about the course (or vice versa)?
  We can answer this with the effect size: r²
    r = 0.773
    r² = 0.598 – around 60%

62 Your Turn
  A researcher asks if there is a relationship between the number of errors on a statistics exam and the person’s level of satisfaction with the course. Is there a significant relationship between these variables? Is it important?
    Errors  Satisfaction
      9         3
      8         2
      4         8
      6         5
      7         4
     10         2
      5         7

63 1. Decide which test to use Are we looking for the relationship between variables? – Yes: Use the Correlation test

64 2. State the Hypotheses
  H0: ρ = 0 there is no relationship between errors made on the exam and satisfaction with the course
  H1: ρ ≠ 0 there is a relationship between errors made on the exam and satisfaction with the course

65 3. Calculate r obt

66

67
     X    X²   Y   Y²   XY
     9    81   3    9   27
     8    64   2    4   16
     4    16   8   64   32
     6    36   5   25   30
     7    49   4   16   28
    10   100   2    4   20
     5    25   7   49   35
  ΣX = 49  ΣX² = 371  ΣY = 31  ΣY² = 171  ΣXY = 188

68 Your Turn
  ΣX = 49; ΣX² = 371; (ΣX)² = 2401; ΣY = 31; ΣY² = 171; (ΣY)² = 961; ΣXY = 188; N = 7
  r = [7(188) - (49)(31)] / √{[7(371) - 2401][7(171) - 961]}
  r = (1316 - 1519) / √{[2597 - 2401][1197 - 961]}
  r = -203 / √{[196][236]}
  r = -203 / √46256
  r = -203 / 215.072
  r = -0.94

69 4. Calculate the critical value
  Assume α = 0.05
  We are looking for any relationship (positive or negative), therefore it will be a two-tailed test
  df = N - 2 (where N is the number of pairs in the data)
  df = (7 - 2) = 5
  Look up Table 3 – critical values of the Pearson Correlation Coefficient: the r-tables
    Two-tailed Test
    df   α = .05   α = .01
    5    0.754     0.874
  r crit = ±0.754

70 r crit and r obt [figure: r crit = -0.754 and r crit = +0.754, r obt = -0.94; x-axis: values of correlation coefficient]

71 5. Make our Conclusion
  r crit = ±0.754
  r obt = -0.94
  As r obt is inside the rejection region, we reject H0 and accept H1
  We conclude that there is a significant negative relationship between errors made on a test and satisfaction with the course (p < 0.05) – more errors made, less satisfaction

72 Significance and Importance
  We conclude that there is a significant negative relationship between errors made on a test and satisfaction with the course (p < 0.05)
  Importance – Effect size: r²
    r = -0.94
    r² = 0.88 – around 88% of the differences in satisfaction scores are related to the errors made on the exam
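
The five testing steps for the errors and satisfaction data can be sketched end to end. The critical value is not computed here: it is passed in by hand from the r-table row quoted on the slides (df = 5, two-tailed, α = .05, r crit = 0.754). The function and key names are illustrative assumptions.

```python
from math import sqrt

errors = [9, 8, 4, 6, 7, 10, 5]
satisfaction = [3, 2, 8, 5, 4, 2, 7]


def pearson_r(xs, ys):
    """Pearson r via the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def correlation_test(xs, ys, r_crit):
    """Two-tailed test of H0: rho = 0 against a table critical value.

    r_crit must be looked up by the caller for df = N - 2 and the chosen alpha.
    """
    r_obt = pearson_r(xs, ys)
    return {
        "r_obt": r_obt,
        "df": len(xs) - 2,
        # Two-tailed: reject H0 when r_obt falls beyond -r_crit or +r_crit.
        "reject_h0": abs(r_obt) >= r_crit,
        # Effect size: proportion of variability shared by the two variables.
        "r_squared": r_obt ** 2,
    }


result = correlation_test(errors, satisfaction, r_crit=0.754)
print(result)
```

Squaring the unrounded r gives an effect size of about 0.89, while squaring the rounded value -0.94 gives the 0.88 shown on the slide. The difference comes only from when the rounding is done.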

73 Homework Chapter 8: 2, 6, 8

