Published by Maximilian Booker. Modified over 9 years ago.
1
Correlation and simple linear regression
Marek Majdan
Training in essential biostatistics for Public Health Professionals in BiH, Marek Majdan, PhD; marekmajdan@gmail.com
2
Correlation and simple linear regression
Both are used to analyse paired data.
The purpose of both procedures is to explore the relationship between two variables.
Correlation treats the two variables as equal.
Simple linear regression (SLR) distinguishes between an independent variable (X) and a dependent variable (Y).
3
Scatter plot
A scatter plot is used for initial visual exploration of the relationship between two variables.
The relationship may be linear or non-linear.
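As a minimal sketch, a scatter plot of paired data could be drawn in R as follows. The variable names (age, sbp) and the simulated values are assumptions made for illustration; they do not come from the presentation.

```r
# Simulated paired data (hypothetical values, for illustration only)
set.seed(1)
age <- runif(50, min = 20, max = 70)          # variable on the horizontal axis
sbp <- 100 + 0.6 * age + rnorm(50, sd = 8)    # roughly linear relationship plus noise

# Scatter plot: one point per pair of observations
plot(age, sbp,
     xlab = "Age (years)",
     ylab = "Systolic blood pressure (mmHg)",
     main = "Scatter plot of paired data")
```

A cloud of points that follows a straight band suggests a linear relationship; a curved band suggests a non-linear one.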
4
Correlation
Computes a correlation coefficient as a measure of the strength and direction of a relationship.
Correlation coefficient: a number from -1 to +1; a value of 0 indicates no linear relationship between the analyzed variables (the variables may still be related in a non-linear way).
Both variables are numerical.
In R: cor.test(x, y)
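The following sketch shows what cor.test() returns, using simulated data. The variables x, y, u and v are hypothetical and exist only for this illustration. The second part illustrates that a coefficient near 0 means no linear relationship, even when one variable is fully determined by the other.

```r
# Simulated numerical variables (hypothetical, for illustration only)
set.seed(1)
x <- rnorm(30)
y <- 2 * x + rnorm(30)

r   <- cor(x, y)        # Pearson correlation coefficient (the default method)
res <- cor.test(x, y)   # same coefficient, plus a test of "correlation = 0"

res$estimate   # the correlation coefficient
res$p.value    # p-value of the test
res$conf.int   # 95% confidence interval for the coefficient

# A perfectly determined but non-linear relationship:
u <- -5:5
v <- u^2
cor(u, v)      # essentially zero, although v depends entirely on u
```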
5
Simple Linear Regression
Used to analyze how the dependent variable changes with the independent variable.
The relationship is expressed as a regression equation: y = a + bx
y = dependent variable
a = intercept (value of y when x = 0)
b = regression coefficient (slope: change in y per one-unit increase in x)
x = independent variable
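To make the equation concrete, here is a small worked sketch in R. The coefficient values and the helper predict_y() are hypothetical, chosen only to show how a and b are used.

```r
# Hypothetical coefficients, chosen only for illustration
a <- 100   # intercept: value of y when x = 0
b <- 0.6   # regression coefficient: change in y per one-unit increase in x

# The regression equation y = a + b * x as a function (hypothetical helper)
predict_y <- function(x) a + b * x

predict_y(0)    # 100, the intercept
predict_y(50)   # 130, i.e. 100 + 0.6 * 50
```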
6
Simple Linear Regression
The regression line is a graphical representation of the regression equation: a straight line drawn over the scatter plot of the analyzed data.
7
Simple Linear Regression
Method of least squares: the regression line (its intercept and slope) is chosen so that the sum of the squared vertical distances of the points from the line is as small as possible.
This way the line ‘explains’ the points to the highest possible degree.
The degree to which the line explains the points is expressed by the coefficient of determination, R² (a number between 0 and 1, or 0% and 100%).
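The least squares estimates and the coefficient of determination can be computed by hand and compared with what lm() reports. This is a sketch on simulated data; the variable names and the simulated values are assumptions, while the formulas are the standard closed-form solution for a single predictor.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
x <- runif(40, 0, 10)
y <- 3 + 1.5 * x + rnorm(40)

# Least squares estimates for one predictor
b <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
a <- mean(y) - b * mean(x)                                      # intercept

# Coefficient of determination: share of the variation in y explained by the line
y_hat  <- a + b * x
ss_res <- sum((y - y_hat)^2)     # squared distances of the points from the line
ss_tot <- sum((y - mean(y))^2)   # squared distances of the points from the mean of y
r2     <- 1 - ss_res / ss_tot

# The same quantities from R's built-in functions
fit <- lm(y ~ x)
coef(fit)                 # intercept and slope, matching a and b
summary(fit)$r.squared    # matching r2
cor(x, y)^2               # with one predictor, R squared equals the squared correlation
```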
8
Simple Linear Regression
In R:
regression analysis: lm(y ~ x, data = database)
regression line: plot(y ~ x, data = database), then abline(lm(y ~ x, data = database))
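Put together, a complete run might look like the sketch below. The data frame is named database as on the slide, but its contents are simulated here and its column names x and y are assumptions.

```r
# Simulated data frame (hypothetical contents, for illustration only)
set.seed(1)
database <- data.frame(x = runif(40, 0, 10))
database$y <- 3 + 1.5 * database$x + rnorm(40)

# Regression analysis
fit <- lm(y ~ x, data = database)
summary(fit)   # coefficients, their standard errors, and R-squared

# Scatter plot with the fitted regression line drawn over it
plot(y ~ x, data = database)
abline(fit)
```

Note that data is an argument of lm() and of the formula method of plot(); abline() only needs the fitted model.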