Tutorial 4 MBP 1010 Kevin Brown

Correlation Review
Pearson’s correlation coefficient
– Varies between –1 (perfect negative linear correlation) and 1 (perfect positive linear correlation); 0 indicates no linear association.
– Location and scale independent
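In R, both properties can be seen with cor() and cor.test(); the data below are simulated purely for illustration (all numbers are made up).

# Simulated illustration
set.seed(1)
x <- rnorm(50, mean = 170, sd = 10)       # e.g. heights
y <- 0.5 * x + rnorm(50, sd = 5)          # linearly related to x

cor(x, y)                # Pearson's r (the default method)
cor.test(x, y)           # r with a test of H0: true correlation = 0

# Location and scale independence: shifting/rescaling leaves r unchanged
cor(2 * x + 100, y)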

Linear Regression

Requires you to define:
Y – dependent (response) variable
X – independent (explanatory) variable(s)

Allows you to answer which questions?
– Is there an association? (the same question the Pearson correlation coefficient addresses)
– What is the association? (measured as the slope)
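A minimal sketch in R with simulated height/weight values (the numbers are made up): the slope from lm() quantifies the association.

set.seed(2)
height <- rnorm(40, mean = 175, sd = 8)
weight <- -80 + 0.9 * height + rnorm(40, sd = 6)

fit <- lm(weight ~ height)   # response (Y) on the left, predictor (X) on the right
coef(fit)                    # intercept and slope (the estimated association)
summary(fit)                 # adds standard errors, t values and p-values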

Assumes
– Linearity
– Constant residual variance (homoscedasticity); residuals normally distributed
– Errors are independent (i.e. not clustered)
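These assumptions are usually checked graphically. Continuing with the fitted object fit from the sketch above, calling plot() on an lm object gives the standard diagnostic plots.

par(mfrow = c(2, 2))
plot(fit)      # residuals vs fitted (linearity, constant variance),
               # normal Q-Q (normality), scale-location, residuals vs leverage
par(mfrow = c(1, 1))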

Homogeneity of variance

Outputs (“estimates”)
– intercept and slope
– standard errors, t values, p-values
– residual standard error (computed from the SSE – what is the SSE?)
– R²
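Each of these pieces can be pulled out of the fitted object; again using fit from the earlier sketch:

s <- summary(fit)
coef(s)             # estimates, standard errors, t values, p-values
s$sigma             # residual standard error
s$r.squared         # R-squared
s$adj.r.squared     # adjusted R-squared
confint(fit)        # 95% confidence intervals for intercept and slope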

Linear regression example: height vs. weight
Extract information:
> summary(lm(HW[,2] ~ HW[,1]))
The slide shows the full summary() output for this fit: the quartiles of the residuals; the coefficient table (estimate, standard error, t value and p-value for the intercept and for HW[, 1], with the slope term highly significant); the residual standard error and its degrees of freedom; multiple and adjusted R-squared; and the F-statistic with its p-value (5.022e-05).
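The slide’s call indexes the data frame by column position. A sketch with a simulated stand-in for HW (the real data are not reproduced here; column names are assumed) shows that naming the columns gives the same fit with more readable output:

set.seed(3)
HW <- data.frame(height = rnorm(50, 175, 8))
HW$weight <- -80 + 0.9 * HW$height + rnorm(50, sd = 6)

summary(lm(HW[, 2] ~ HW[, 1]))            # as written on the slide
summary(lm(weight ~ height, data = HW))   # same model, labelled coefficients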

Example: Televisions, Physicians and Life Expectancy (World Almanac Factbook 1993)
– Residuals & outliers
– High leverage points & influential observations
– Dummy variable coding
– Transformations
Take-home messages
– Regression is a very flexible tool
– Correlation ≠ causation

Dummy coding
– Creates an alternate (indicator) variable that is used in the analysis
– For 2 categories you set the value of the reference level to 0 and the level of interest to 1
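In R, storing the grouping variable as a factor makes lm() build the 0/1 dummy automatically, and relevel() picks which category is the reference. The group labels and numbers below are hypothetical.

set.seed(4)
group <- factor(rep(c("control", "treated"), each = 20))
y <- c(rnorm(20, mean = 5), rnorm(20, mean = 7))

group <- relevel(group, ref = "control")   # reference level is coded 0
summary(lm(y ~ group))   # the "grouptreated" coefficient is the difference
                         # between the treated level and the reference level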

Residuals and Outliers
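One simple screen for outliers is to look at standardized residuals; a sketch, reusing a fitted lm object fit such as the height/weight fit above:

r <- rstandard(fit)        # residuals scaled to unit variance
which(abs(r) > 2)          # observations with unusually large residuals
plot(fitted(fit), r, xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = c(-2, 2), lty = 2)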

High Leverage Points and Influential Observations
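Leverage and influence can be checked numerically with hatvalues() and cooks.distance(), or graphically with the residuals-vs-leverage plot; the cut-offs below are common rules of thumb rather than fixed thresholds.

h <- hatvalues(fit)            # leverage of each observation
d <- cooks.distance(fit)       # influence: combines leverage and residual size

p <- length(coef(fit)); n <- nobs(fit)
which(h > 2 * p / n)           # rule of thumb for high leverage
which(d > 4 / n)               # rule of thumb for influential observations
plot(fit, which = 5)           # residuals vs leverage with Cook's distance contours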