Stats Club Marnie Brennan

Presentation on theme: "Stats Club Marnie Brennan"— Presentation transcript:

1 Stats Club Marnie Brennan
Correlation Stats Club Marnie Brennan

2 Have you used correlation before?

3 References Petrie and Sabin - Medical Statistics at a Glance: Chapter 26 Good Petrie and Watson - Statistics for Veterinary and Animal Science: Chapter 10 & 12 (12.7) Good Kirkwood and Sterne – Essential Medical Statistics: Chapter 10 & 30 Thrusfield – Veterinary Epidemiology: Chapter 14 (pges )

4 Other reads Mathematics Learning Support Centre – Statistics: Correlation - Good Bewick, V, Cheek, L and Ball, J (2003) Statistics review 7: Correlation and regression. Critical Care, Vol. 7, Swinscow, TDV (1976) Statistics at square one: XVIII – Correlation. British Medical Journal, Vol. 2, Swinscow, TDV (1976) Statistics at square one: XIX – Correlation (continued). British Medical Journal, Vol. 2, Swinscow, TDV (1976) Statistics at square one: XIX – Correlation (concluded). British Medical Journal, Vol. 2,

5

6 What is correlation? Use it to look at the relationship between two continuous (numerical) variables To see if you have a linear relationship between them If you interchanged the x and y axes, you would still have the same relationship To measure the degree of the relationship

7 How does it differ from other calculations?
T-tests and ANOVAs These measure the differences between subsets within your data/between groups Regression (this will be covered in a later session) This describes the linear relationship between two variables Describes how one variable (independent) predicts the other (dependent) You cannot interchange the variables between the x and y axes

8 When would you use correlation?
To see if there is a relationship between two variables, and how strong it is As a precursor to linear regression If variables are highly correlated (or collinear), this will affect how they behave in a linear regression calculation Therefore, you need to know whether variables are correlated or not before you do a linear regression Other reasons??

9 When wouldn’t you use correlation?
To compare two different methods of measurement on the same thing (reliability) E.g. How many neutrophils are in a blood sample? Compare using an IDEXX machine versus counting on a blood smear To compare the same method of measurement but used multiple times (reproducibility) E.g. How many neutrophils are in a blood sample? Preparing two blood smears from the same sample, and comparing the results Kappa (and associated) analysis – covered in a previous session Categorical data

10 First step – descriptive stats
Scatter plot What does the relationship look like?

11 Shape and what does it mean?
Does it increase or decrease as the values get higher? Does the relationship look linear? How ‘steep’ is the slope of the shape? Are there any outliers? Taken from Petrie and Sabin

12 Scatter plot examples

13 Demonstration if required….

14 Demonstration if required….
GenStat Graphics, 2D Scatter Plot (Y variate, X variate) Run

15 Measurement for correlation
Correlation coefficient Numerical representation of the degree of association between the variables Between -1 and +1 Positive – one variable increases as the other increases Negative – one variable decreases as the other increases It is dimensionless (spooky….) No units of measurement
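A minimal sketch of these properties, assuming Pearson's product-moment coefficient and made-up data (the variable names and values below are hypothetical, not from the slides). It checks that swapping the two variables, or changing the units of one of them, leaves the coefficient unchanged.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: body weight (kg) and heart girth (cm) for six animals
weight_kg = [4.1, 5.3, 6.0, 7.2, 8.8, 9.5]
girth_cm = [31.0, 34.5, 35.0, 39.0, 42.5, 44.0]

r = pearson_r(weight_kg, girth_cm)
r_swapped = pearson_r(girth_cm, weight_kg)      # x and y interchanged
weight_g = [w * 1000 for w in weight_kg]        # same weights, now in grams
r_rescaled = pearson_r(weight_g, girth_cm)

print(round(r, 3), round(r_swapped, 3), round(r_rescaled, 3))
```

All three printed values should agree, which is what "dimensionless" and "interchangeable axes" mean in practice.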

16 What are the assumptions that have to be satisfied?
There is a linear relationship between the variables There are no outliers present There are no subgroups within the data that affect the relationship e.g. sex There are no multiple measurements from the same subjects (each subject contributes one pair of values) Taken from Petrie and Sabin

17 Calculation of correlation coefficient
Pearson’s coefficient – r Represented as r when calculated from a sample (the population value is written ρ, rho) Our null hypothesis is that there is no linear relationship i.e. the correlation coefficient is zero To test this hypothesis, at least one variable has to be normally distributed Taken from Petrie and Sabin
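For reference, the standard textbook form of Pearson's coefficient and the usual test statistic for the null hypothesis of zero correlation are sketched below. The notation (x_i, y_i for the n paired observations, bars for sample means) is an assumption added here and does not appear on the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\qquad
t = r \sqrt{\frac{n - 2}{1 - r^2}}
```

Under the null hypothesis, t is compared with a t-distribution on n - 2 degrees of freedom to obtain the p-value.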

18 P-values and confidence intervals
Interpret the p-value as you would for any other test (you can also calculate a confidence interval for r, although this is not normally reported) You state your cut off e.g. if p<0.05, reject the null hypothesis and conclude that there is a relationship A significant result is not necessarily CAUSAL There is an ASSOCIATION
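The slides do not say how the confidence interval is obtained. One common approach is Fisher's z transformation, sketched here under the assumption that the data are roughly bivariate normal. The function name `pearson_ci` and the example values of r and n are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation coefficient.

    Uses Fisher's z transformation: z = atanh(r) is treated as approximately
    normal with standard error 1 / sqrt(n - 3).
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 pairs of observations")
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    # Two-sided critical value from the standard normal distribution
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the limits to the correlation scale
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical result: r = 0.6 estimated from 30 pairs of observations
lower, upper = pearson_ci(0.6, 30)
print(round(lower, 2), round(upper, 2))
```

The interval is not symmetric around r, and it narrows as the number of pairs grows.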

19 Minitab

20 Minitab – Pearsons

21 SPSS

22 SPSS - Pearsons

23 GenStat GenStat Stat, summary, correlations

24 What if our assumptions cannot be met?
Use Spearman’s rank correlation coefficient (or Kendall’s coefficient) Rank each variable separately from lowest to highest, and then calculate the correlation coefficient on the ranks Still between -1 and 1 Remove outliers Rule of thumb – if the outliers are more than ±2 standard deviations away from the group mean, you can remove them Transform data into a linear relationship????
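The ranking step can be sketched as follows: rank each variable (giving tied values the average of their ranks, which is one common convention), then apply the ordinary Pearson formula to the ranks. The helper names and the data are hypothetical and chosen to show a relationship that is monotonic but not linear.

```python
import math


def average_ranks(values):
    """Rank values from lowest (rank 1) to highest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's coefficient computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 200]

print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because only the ordering matters, the extreme last value pulls Pearson's coefficient below 1 while Spearman's coefficient still reports a perfect monotonic relationship.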

25 Minitab – Spearman’s

26 Minitab – Spearman’s

27 SPSS – Spearman’s

28 SPSS – Spearman’s

29 GenStat GenStat Stat, summary, correlations Rank??

30 How does this fit with what you do or have seen/experienced?

31 Summary Make sure you are using correlation in the correct way
Eyeball first with a scatter plot – look for strength, slope, relationship, outliers If it doesn’t fit the assumptions, use a non-parametric equivalent e.g. Spearman’s

32 Next time… Linear regression……

