Published by Caroline Hart. Modified over 9 years ago.
1
Tutorial 4 MBP 1010 Kevin Brown
2
Correlation Review
Pearson’s correlation coefficient
– Varies between −1 (perfect negative linear correlation) and 1 (perfect positive linear correlation). 0 indicates no linear association.
– Location and scale independent
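The "location and scale independent" property can be checked numerically. The sketch below uses simulated data (hypothetical values, not the tutorial's data set) and base R's `cor()`, whose default method is Pearson. Adding a constant or multiplying by a positive constant should leave the coefficient unchanged, while a negative multiplier on one variable should only flip its sign.

```r
# Simulated data (hypothetical; not the tutorial's data set)
set.seed(1)
x <- rnorm(50, mean = 1.7, sd = 0.1)        # e.g. heights in metres
y <- 40 * x + rnorm(50, mean = 0, sd = 15)  # e.g. weights in kg

r_raw <- cor(x, y)                          # Pearson is the default method

# Shift and rescale both variables (positive scale factors)
r_shifted <- cor(100 * x + 5, 2.2 * y - 3)

# A negative scale factor on one variable flips the sign only
r_flipped <- cor(-x, y)

print(c(raw = r_raw, shifted = r_shifted, flipped = r_flipped))
```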
3
Linear Regression
4
Requires you to define?
Y – dependent variable
X – independent variable(s)
5
Allows you to answer what questions?
– Is there an association? (same question as the Pearson correlation coefficient)
– What is the association? Measured as the slope.
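Both questions can be illustrated with one simple regression. This is a minimal sketch on simulated data (the variable names and values are assumptions for illustration). For a single predictor, the p-value of the slope should match the p-value from `cor.test()`, and the slope should equal the correlation rescaled by `sd(y) / sd(x)`.

```r
# Simulated data (hypothetical)
set.seed(2)
x <- runif(40, 0, 10)
y <- 3 + 0.5 * x + rnorm(40)

fit <- lm(y ~ x)
slope <- unname(coef(fit)["x"])

# "Is there an association?"
# The slope's p-value and the Pearson test's p-value answer the same question.
p_slope <- coef(summary(fit))["x", "Pr(>|t|)"]
p_cor   <- cor.test(x, y)$p.value

# "What is the association?"
# The slope is the correlation rescaled to the units of y per unit of x.
slope_from_r <- cor(x, y) * sd(y) / sd(x)

print(c(slope = slope, slope_from_r = slope_from_r))
print(c(p_slope = p_slope, p_cor = p_cor))
```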
6
Assumes
– Linearity
– Constant residual variance (homoscedasticity) / residuals normal
– Errors are independent (i.e. not clustered)
7
Homogeneity of variance
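One way to see what a violation of constant variance looks like is to simulate it. The sketch below is an illustration under stated assumptions: the data are simulated, and `spread_ratio()` is a hypothetical helper written for this example, not a standard diagnostic. It compares the residual spread in the upper and lower halves of the fitted values. A ratio near 1 is consistent with homoscedasticity, and a ratio well above 1 suggests a fan shape.

```r
# Simulated data (hypothetical)
set.seed(5)
n <- 200
x <- sort(runif(n, 1, 10))
y_const <- 2 + 3 * x + rnorm(n, sd = 2)        # constant error variance
y_fan   <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)  # error sd grows with x

fit_const <- lm(y_const ~ x)
fit_fan   <- lm(y_fan ~ x)

# Hypothetical helper: residual sd above the median fitted value
# divided by residual sd at or below it.
spread_ratio <- function(fit) {
  f <- fitted(fit)
  r <- resid(fit)
  upper <- f > median(f)
  sd(r[upper]) / sd(r[!upper])
}

print(c(constant = spread_ratio(fit_const), fan = spread_ratio(fit_fan)))

# The usual visual check is a residuals-vs-fitted plot.
if (interactive()) {
  plot(fitted(fit_fan), resid(fit_fan),
       xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
}
```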
8
Outputs
– “estimates” (intercept, slope)
– standard errors
– t values
– p-values
– residual standard error (SSE – what is this?)
– R²
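SSE is the sum of squared errors, i.e. the sum of the squared residuals. The sketch below uses simulated data (hypothetical) to rebuild the main outputs by hand from a fitted `lm` object and compare them with what `summary()` reports. For simple regression with n observations, the residual standard error is sqrt(SSE / (n − 2)).

```r
# Simulated data (hypothetical)
set.seed(3)
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fit <- lm(y ~ x)
s <- summary(fit)

sse <- sum(resid(fit)^2)        # sum of squared errors (residuals)
sst <- sum((y - mean(y))^2)     # total sum of squares
rse <- sqrt(sse / (n - 2))      # residual standard error on n - 2 df
r2  <- 1 - sse / sst            # R-squared

# t value = estimate / standard error; two-sided p-value from the t distribution
est <- coef(s)[, "Estimate"]
se  <- coef(s)[, "Std. Error"]
t_values <- est / se
p_values <- 2 * pt(-abs(t_values), df = n - 2)

print(c(rse = rse, reported = s$sigma))
print(c(r2 = r2, reported = s$r.squared))
```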
9
Linear regression example: height vs. weight
Extract information:
> summary(lm(HW[,2] ~ HW[,1]))

Call:
lm(formula = HW[, 2] ~ HW[, 1])

Residuals:
    Min      1Q  Median      3Q     Max
-36.490 -10.297   3.426   9.156  37.385

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -2.860     18.304  -0.156    0.876
HW[, 1]       42.090      9.449   4.454 5.02e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.12 on 48 degrees of freedom
Multiple R-squared:  0.2925, Adjusted R-squared:  0.2777
F-statistic: 19.84 on 1 and 48 DF,  p-value: 5.022e-05
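The numbers in this printout are tied together by simple arithmetic, which is a useful way to read it. The sketch below copies the rounded values for the slope row and uses the 48 residual degrees of freedom reported with the residual standard error. The variable names are chosen here for illustration. Because the inputs are rounded, the results should agree with the printout only to about the displayed precision.

```r
# Values copied from the printed summary (rounded as displayed)
estimate  <- 42.090
std_error <- 9.449
r_squared <- 0.2925
df_resid  <- 48

t_value   <- estimate / std_error                    # compare with "t value"
p_value   <- 2 * pt(-abs(t_value), df = df_resid)    # compare with "Pr(>|t|)"
f_from_t  <- t_value^2                               # one predictor: F = t^2
f_from_r2 <- r_squared / (1 - r_squared) * df_resid  # F from R-squared
adj_r2    <- 1 - (1 - r_squared) * (df_resid + 1) / df_resid

print(c(t = t_value, F_from_t = f_from_t, F_from_R2 = f_from_r2,
        adj_R2 = adj_r2, p = p_value))
```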
11
Example
Televisions, Physicians and Life Expectancy (World Almanac Factbook 1993) example
– Residuals & Outliers
– High leverage points & influential observations
– Dummy variable coding
– Transformations
Take home messages
– Regression is a very flexible tool
– correlation ≠ causation
12
Dummy coding
Creates an alternate variable that’s used for analysis
For 2 categories you set values of …
– reference level to 0
– level of interest to 1
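A minimal sketch of 0/1 dummy coding in R follows. The group labels and outcome values are made up for illustration. With a two-level factor, `lm()` builds the same 0/1 column internally, and the coefficient on that column is the difference in group means relative to the reference level.

```r
# Hypothetical two-level variable and outcome (not from the tutorial's data)
group <- factor(c("control", "treated", "treated", "control", "treated"))
group <- relevel(group, ref = "control")   # make "control" the reference level
y <- c(10, 14, 15, 11, 16)

# Manual dummy coding: reference level = 0, level of interest = 1
dummy <- as.integer(group == "treated")

# The design matrix lm() uses contains the same 0/1 column
X <- model.matrix(~ group)
print(X)

# Intercept = mean of the reference group; slope = difference in group means
fit <- lm(y ~ group)
print(coef(fit))
```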
13
Residuals and Outliers
14
High Leverage Points and Influential Observations
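Base R provides functions for these diagnostics: `rstandard()` for standardized residuals, `hatvalues()` for leverage and `cooks.distance()` for influence. The sketch below uses simulated data with one deliberately planted point (hypothetical, not the televisions and physicians data). The cutoff of twice the mean leverage is a common rule of thumb assumed here, not something stated in the slides.

```r
# Simulated data (hypothetical) with one planted point, observation 31,
# that is far from the other x values and far from the trend.
set.seed(4)
x <- c(rnorm(30), 8)
y <- c(1 + 2 * x[1:30] + rnorm(30), 0)

fit <- lm(y ~ x)

rstd <- rstandard(fit)        # standardized residuals (outliers in y)
lev  <- hatvalues(fit)        # leverage (unusual x values)
cook <- cooks.distance(fit)   # influence on the fitted coefficients

# Rule-of-thumb cutoff (an assumption): leverage above twice its mean
high_lev <- which(lev > 2 * mean(lev))

print(high_lev)
print(round(cbind(rstd, lev, cook)[31, ], 3))
```

A point can have high leverage without being influential if it lies close to the line fitted to the other points, which is why both leverage and Cook's distance are worth inspecting.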