Correlation and Linear Regression Peter T. Donnan Professor of Epidemiology and Biostatistics Statistics for Health Research.


1 Correlation and Linear Regression. Peter T. Donnan, Professor of Epidemiology and Biostatistics. Statistics for Health Research.

2 CONTENTS. Correlation: coefficients, meaning, values, role, significance. Regression: line of best fit, prediction, significance.

3 INTRODUCTION. Correlation: the strength of the linear relationship between two variables. Regression analysis: determines the nature of the relationship. For example: is there a relationship between the number of units of alcohol consumed and the likelihood of developing cirrhosis of the liver?

4 PEARSON’S COEFFICIENT OF CORRELATION (r). Measures the strength of the linear relationship between one dependent and one independent variable; curvilinear relationships need other techniques. Values lie between +1 and -1: perfect positive correlation, r = +1; perfect negative correlation, r = -1; no linear relationship, r = 0.
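
The three reference values of r on this slide can be checked numerically. The following is a minimal sketch in plain Python; the function name `pearson_r` and the small data sets are illustrative assumptions, not part of the original slides.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data chosen to hit the three reference cases
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # perfect positive linear relationship
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # perfect negative linear relationship
# y = x**2 on a symmetric range: a perfect curve, but no linear trend
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_pos, r_neg, r_curve)
```

The last case illustrates why curvilinear relationships need other techniques: the two variables are perfectly related, yet r is zero.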

5 PEARSON’S COEFFICIENT OF CORRELATION. [Figure: example scatter plots for r = +1, r = -1, r = 0.6 and r = 0.]

6 SCATTER PLOT. [Figure: scatter plot of BMD (the dependent variable, about which inferences are made) against calcium intake (the independent variable).]

7 NON-NORMAL DATA

8 NORMALISED WITH LOG TRANSFORMATION
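
The effect of a log transformation on skewed data can be sketched as follows. The data values and the `skewness` helper are hypothetical, chosen only to show a long right tail being pulled in; the slide's own data are not available in the transcript, and a log transformation does not guarantee normality.

```python
import math

def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed measurements (must be strictly positive for log)
raw = [1, 2, 2, 3, 4, 5, 8, 15, 40, 120]
logged = [math.log(v) for v in raw]  # natural log transformation

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

A skewness closer to zero after the transformation indicates a more symmetric distribution, which is the usual motivation for transforming before applying Pearson's r.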

9 SPSS OUTPUT: SCATTER PLOT

10 SPSS OUTPUT: CORRELATIONS

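11 Interpreting correlation. A large or statistically significant r does not necessarily imply: (1) a strong correlation: it is the significance of r, rather than r itself, that tends to increase with sample size, so even a weak correlation can be highly significant in a large sample; (2) cause and effect: a strong correlation between the number of televisions sold and the number of cases of paranoid schizophrenia does not mean that watching TV causes paranoid schizophrenia; it may be due to an indirect relationship.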

12 Interpreting correlation. Variation in the dependent variable is due to: the relationship with the independent variable (r²) and random noise (1 - r²). r² is the Coefficient of Determination, or variation explained. E.g. r = 0.661 gives r² = 0.661² ≈ 0.44: less than half of the variation (44%) in the dependent variable is due to the independent variable.
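
The arithmetic behind the slide's example can be reproduced in a couple of lines. This is only a sketch; the function name `variation_explained` is an illustrative assumption.

```python
def variation_explained(r):
    """Split total variation into the part explained (r squared) and the remainder."""
    explained = r ** 2
    return explained, 1 - explained

# The value of r quoted on the slide
explained, unexplained = variation_explained(0.661)
print(f"explained: {explained:.2f}, unexplained: {unexplained:.2f}")
```

With r = 0.661 the explained share rounds to 0.44 and the unexplained share to 0.56, matching the slide's statement that less than half of the variation is accounted for.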

14 Agreement. Correlation should never be used to determine the level of agreement between repeated measures (measuring devices, users, techniques). It measures the degree of linear relationship. You can have high correlation with poor agreement.
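
A small numerical illustration of high correlation with poor agreement is sketched below. The two "devices" and their readings are entirely hypothetical, and the mean difference is used here only as a simple stand-in for a proper agreement analysis.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical readings on the same five subjects from two devices.
# Device B is an exact linear function of device A, but reads far higher.
device_a = [10.0, 12.0, 14.0, 16.0, 18.0]
device_b = [2 * a + 5 for a in device_a]

r = pearson_r(device_a, device_b)
differences = [b - a for a, b in zip(device_a, device_b)]
mean_difference = sum(differences) / len(differences)

print(f"r = {r:.3f}, mean difference (B - A) = {mean_difference:.1f}")
```

Here the correlation is perfect because the relationship is exactly linear, yet the two devices never give the same reading, so they do not agree.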

15 Non-parametric correlation. Makes no assumptions about the distribution of the data. Carried out on ranks. Spearman’s ρ: easy to calculate. Kendall’s τ: has some advantages over ρ; its distribution has better statistical properties, and concordant / discordant pairs are easier to identify. Usually both lead to the same conclusions.
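
Both rank-based coefficients can be sketched in plain Python. The helper names and the data are illustrative assumptions; Spearman's coefficient is computed here as Pearson's r on average ranks, and the Kendall version shown is the simple tau-a, which makes no correction for ties.

```python
import math
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical data: y always increases with x, but not linearly
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

rho = spearman_rho(x, y)
tau = kendall_tau_a(x, y)
r = pearson_r(x, y)
print(f"Spearman = {rho:.3f}, Kendall = {tau:.3f}, Pearson = {r:.3f}")
```

Because the ordering of the two variables is identical, both rank-based coefficients equal 1, while Pearson's r is pulled below 1 by the outlying value.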

16 Role of regression. Shows how one variable changes with another, by determining the line of best fit. Default is linear. Curvilinear?

17 Line of best fit. Simplest case: linear. Line of best fit between the dependent variable Y (BMD) and the independent variable X (dietary intake of calcium): Y = a + bX, where a is the value of Y when X = 0 and b is the change in Y when X increases by 1.
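
The coefficients a and b of the line of best fit can be obtained by ordinary least squares. The sketch below assumes that method (the slides do not name the fitting procedure) and uses hypothetical calcium and BMD values with assumed units.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx             # slope: change in Y per unit increase in X
    a = mean_y - b * mean_x   # intercept: value of Y when X = 0
    return a, b

# Hypothetical data: dietary calcium (assumed mg/day) and BMD (assumed g/cm^2)
calcium = [400, 600, 800, 1000, 1200]
bmd = [0.80, 0.85, 0.95, 1.00, 1.10]

a, b = fit_line(calcium, bmd)
print(f"BMD = {a:.2f} + {b:.6f} * calcium")
```

For these made-up values the intercept is about 0.64 and the slope about 0.000375, that is, each additional unit of calcium intake is associated with an increase of roughly 0.000375 in BMD.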

18 Role of regression. Used to predict or explore associations: predicts the value of the dependent variable when the value of the independent variable(s) is known, within the range of the known data; extrapolation is risky! Example: relation between age and bone age. Does not imply causality.
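
One way to make the warning about extrapolation concrete is to have the prediction step refuse values of X outside the observed range. This is a sketch of that design choice only; the function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def predict_within_range(a, b, x_new, x_observed):
    """Predict Y = a + b * x_new, but only inside the range of the observed X."""
    low, high = min(x_observed), max(x_observed)
    if not low <= x_new <= high:
        raise ValueError(
            f"x = {x_new} is outside the observed range [{low}, {high}]; "
            "extrapolation is risky"
        )
    return a + b * x_new

# Hypothetical data lying exactly on Y = -1 + 2X
x = [2, 4, 6, 8]
y = [3, 7, 11, 15]
a, b = fit_line(x, y)

inside = predict_within_range(a, b, 5, x)  # interpolation: allowed
try:
    predict_within_range(a, b, 20, x)      # extrapolation: rejected
    extrapolation_rejected = False
except ValueError:
    extrapolation_rejected = True

print(inside, extrapolation_rejected)
```

Raising an error is deliberately strict; a softer variant could return the prediction together with a warning, since the fitted line says nothing about how Y behaves beyond the data.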

19 SPSS OUTPUT: REGRESSION

20 Multiple regression. Later: more than one independent variable. BMD may be dependent on: age, gender, calorific intake, use of bisphosphonates, exercise, etc.

21 Summary. Correlation: strength of linear relationship between two variables; Pearson’s is parametric, Spearman’s / Kendall’s are non-parametric. Interpret with care! Regression: line of best fit, prediction; multiple regression; logistic.

