1
Correlation
2
Two variables (X and Y): Which test? Contingency analysis; t-test; logistic regression; correlation; regression
4
Relationship Between Two Numerical Variables
6
Correlation: What is the tendency of two numerical variables to co-vary (change together)?
7
Correlation: What is the tendency of two numerical variables to co-vary (change together)? The correlation coefficient r measures the strength and direction of the linear association between two numerical variables.
8
Correlation: What is the tendency of two numerical variables to co-vary (change together)? The correlation coefficient r measures the strength and direction of the linear association between two numerical variables. Population parameter: ρ (rho). Sample estimate: r.
10
Sum of squares: X and Y
11
Sum of products
12
Shortcuts
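The formulas on the three slides above did not survive in the transcript. As a minimal sketch, assuming the "shortcuts" are the usual computational forms (for example, sum of squares of X = ΣX² − (ΣX)²/n), the sums and the resulting r could be computed as follows. The data values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical data, used only to illustrate the arithmetic.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Definitional forms: sums of squared (or cross-multiplied) deviations.
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Assumed shortcut (computational) forms, which avoid computing deviations.
ss_x_short = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_y_short = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
sp_xy_short = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

# Correlation coefficient: sum of products scaled by both sums of squares.
r = sp_xy / math.sqrt(ss_x * ss_y)

print(ss_x, ss_y, sp_xy, r)
```

Both forms give the same sums; the shortcut versions are mainly convenient for hand calculation.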
13
[Figure: four scatter plots, each labeled with its value of r]
14
Correlation assumes: a random sample; X is normally distributed with equal variance for all values of Y; Y is normally distributed with equal variance for all values of X.
15
Correlation assumes: a random sample; X is normally distributed with equal variance for all values of Y; Y is normally distributed with equal variance for all values of X. Together: a bivariate normal distribution.
16
Correlation coefficient facts: -1 ≤ ρ ≤ 1; -1 ≤ r ≤ 1.
17
Correlation coefficient facts: -1 ≤ ρ ≤ 1; -1 ≤ r ≤ 1. Positive r: the variables increase together. Negative r: when one variable increases, the other decreases, and vice versa.
18
Correlation coefficient facts: -1 ≤ ρ ≤ 1; -1 ≤ r ≤ 1. Positive r: the variables increase together. Negative r: when one variable increases, the other decreases, and vice versa. r = 0: uncorrelated; r = 1: positive; r = -1: negative.
19
Correlation coefficient facts: Coefficient of determination = r². It describes the proportion of variation in one variable that can be predicted from the other.
20
Standard error of r
21
Confidence Limits for r
22
Example: Are the effects of new mutations on mating success and productivity correlated? Data from Drosophila melanogaster, n = 31 individuals.
23
X is productivity, Y is mating success. Sum of products = 2.796; sum of squares for X = 16.245; sum of squares for Y = 1.6289.
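The arithmetic for this example can be sketched from the three sums on the slide. The formulas used here, r = (sum of products) / sqrt(SS_X · SS_Y) and SE_r = sqrt((1 − r²) / (n − 2)), are the standard ones and are assumed to match the slides whose equations are missing from this transcript. The variable names are illustrative.

```python
import math

# Values taken from the slide.
sum_products = 2.796
ss_x = 16.245   # sum of squares for productivity (X)
ss_y = 1.6289   # sum of squares for mating success (Y)
n = 31          # number of individuals

# Correlation coefficient.
r = sum_products / math.sqrt(ss_x * ss_y)

# Coefficient of determination.
r_squared = r ** 2

# Standard error of r (assumed standard formula).
se_r = math.sqrt((1 - r_squared) / (n - 2))

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, SE_r = {se_r:.3f}")
```

With these sums, r comes out near 0.54, so roughly 30% of the variation in one variable is predictable from the other.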
24
X is productivity, Y is the mating success
27
Confidence Limits for r
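The confidence-limit formulas are not preserved in the transcript. A common approach, assumed here rather than confirmed by the slides, is Fisher's z-transformation: convert r to z = 0.5 ln((1 + r)/(1 − r)), use σ_z = sqrt(1/(n − 3)), build the interval on the z scale, then back-transform. A sketch for the Drosophila example:

```python
import math

# Values taken from the Drosophila slides.
sum_products = 2.796
ss_x = 16.245
ss_y = 1.6289
n = 31

r = sum_products / math.sqrt(ss_x * ss_y)

# Fisher's z-transformation of r (assumed method).
z = 0.5 * math.log((1 + r) / (1 - r))

# Approximate standard error of z.
sigma_z = math.sqrt(1 / (n - 3))

# 95% limits on the z scale, using the standard normal critical value 1.96.
z_lower = z - 1.96 * sigma_z
z_upper = z + 1.96 * sigma_z

def back_transform(z_value):
    """Convert a value on the z scale back to the correlation scale."""
    return (math.exp(2 * z_value) - 1) / (math.exp(2 * z_value) + 1)

rho_lower = back_transform(z_lower)
rho_upper = back_transform(z_upper)

print(f"95% CI for rho: {rho_lower:.3f} to {rho_upper:.3f}")
```

The interval is not symmetric around r, which is expected because the transformation is nonlinear.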
33
Example: Why Sleep?
34
10 experimental subjects. Measured increase in “slow-wave” activity during sleep. Measured improvement in a task after sleep (a hand-eye coordination activity).
35
Example: Why Sleep?
36
Why sleep? Sum of products: 1127.4; sum of squares X: 2052.4; sum of squares Y: 830.9. Calculate a 95% C.I. for ρ.
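One way to work this exercise is a small helper that goes from the three sums to an approximate interval. It assumes the Fisher z-transformation approach (the hyperbolic arctangent of r is the same quantity as Fisher's z) and n = 10 from the study description. The function name `correlation_ci` is hypothetical.

```python
import math

def correlation_ci(sum_products, ss_x, ss_y, n, z_crit=1.96):
    """Return (r, lower, upper) for an approximate confidence interval for rho.

    Assumes Fisher's z-transformation; z_crit=1.96 gives a 95% interval.
    """
    r = sum_products / math.sqrt(ss_x * ss_y)
    z = math.atanh(r)                      # Fisher's z
    half_width = z_crit / math.sqrt(n - 3)  # z_crit times sigma_z
    return r, math.tanh(z - half_width), math.tanh(z + half_width)

# Values from the slide; n = 10 experimental subjects.
r, lower, upper = correlation_ci(1127.4, 2052.4, 830.9, n=10)
print(f"r = {r:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

With only 10 subjects the interval is wide, even though the estimated correlation is strong.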
37
Hypothesis Testing for Correlations: We can test hypotheses relating to correlations among variables. Closely related to regression, the topic for next Tuesday’s lecture.
38
Hypothesis Testing for Correlations: H0: ρ = 0; HA: ρ ≠ 0
39
If ρ = 0, r is approximately normally distributed with mean 0, and the test statistic t = r / SE_r follows a t-distribution with df = n - 2.
40
Example: Are the effects of new mutations on mating success and productivity correlated? Data from Drosophila melanogaster.
41
Hypotheses: H0: Mating success and productivity are not related (ρ = 0). HA: Mating success and productivity are correlated (ρ ≠ 0).
42
X is productivity, Y is mating success. Sum of products = 2.796; sum of squares for X = 16.245; sum of squares for Y = 1.6289.
45
df = n - 2 = 31 - 2 = 29
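The test for this example can be sketched as follows, assuming the test statistic t = r / SE_r with SE_r = sqrt((1 − r²)/(n − 2)). The two-tailed 5% critical value for 29 degrees of freedom (2.045) is taken from a standard t table rather than computed, to keep the sketch free of external libraries.

```python
import math

# Values from the Drosophila slides.
sum_products = 2.796
ss_x = 16.245
ss_y = 1.6289
n = 31

r = sum_products / math.sqrt(ss_x * ss_y)
se_r = math.sqrt((1 - r ** 2) / (n - 2))

# Test statistic and degrees of freedom.
t_stat = r / se_r
df = n - 2

# Two-tailed critical value for alpha = 0.05, df = 29 (from a t table).
t_critical = 2.045

reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

Because |t| exceeds the critical value, P < 0.05 and the null hypothesis of zero correlation is rejected.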
47
Why sleep? Sum of products: 1127.4; sum of squares X: 2052.4; sum of squares Y: 830.9. Test for a correlation different from zero in these data.
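The same t-test can be applied to the sleep data. This sketch wraps the calculation in a hypothetical helper, `correlation_t_test`, and assumes n = 10 from the study description. The critical value for 8 degrees of freedom (2.306, two-tailed, α = 0.05) comes from a standard t table.

```python
import math

def correlation_t_test(sum_products, ss_x, ss_y, n):
    """Return (r, t, df) for the t-test of H0: rho = 0."""
    r = sum_products / math.sqrt(ss_x * ss_y)
    se_r = math.sqrt((1 - r ** 2) / (n - 2))
    return r, r / se_r, n - 2

# Values from the slide; n = 10 experimental subjects.
r, t_stat, df = correlation_t_test(1127.4, 2052.4, 830.9, n=10)

# Two-tailed critical value for alpha = 0.05, df = 8 (from a t table).
t_critical = 2.306

reject_null = abs(t_stat) > t_critical
print(f"r = {r:.3f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```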
48
Checking Assumptions for Correlation. Bivariate normal distribution: – Relationship is linear (straight line) – Cloud of points in scatter plot is circular or elliptical – Frequency distributions of X and Y are normal
49
Linear Relationship?
51
Maximum correlation possible
52
Maximum correlation possible; correlation of zero
54
Cloud of points elliptical?
55
X and Y normal? Use the usual techniques for both X and Y separately. Be wary of outliers.
56
Quick Reference Guide - Correlation Coefficient. What is it for? Measuring the strength of a linear association between two numerical variables. What does it assume? Bivariate normality and random sampling. Parameter: ρ. Estimate: r. Formulae: r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]; SE_r = √[(1 − r²)/(n − 2)]
57
Quick Reference Guide - t-test for zero linear correlation. What is it for? To test the null hypothesis that the population parameter, ρ, is zero. What does it assume? Bivariate normality and random sampling. Test statistic: t. Null distribution: t with n - 2 degrees of freedom. Formulae: t = r / SE_r, where SE_r = √[(1 − r²)/(n − 2)]
58
T-test for correlation. Sample → test statistic. Null hypothesis: ρ = 0 → null distribution: t with n - 2 d.f. Compare: how unusual is this test statistic? If P < 0.05, reject H0; if P > 0.05, fail to reject H0.