Correlation & Regression

Correlation
- Measures the strength of the linear relation between two random variables, X and Y.
- ρ = Corr(X, Y) = Cov(X, Y) / (σ_X σ_Y)
      = E[(X − μ_X)(Y − μ_Y)] / {E[(X − μ_X)²] E[(Y − μ_Y)²]}^(1/2)
- ρ is a standardized Cov(X, Y), so −1 ≤ ρ ≤ 1.

Strength of ρ
- ρ = −1: perfect negative linear relation
- ρ = +1: perfect positive linear relation
- ρ = 0: no linear relation
- As |ρ| increases, so does the strength of the relationship.

Sample Covariance and Correlation
- Cov(X, Y) = [1/(n − 1)] Σ (x_i − x̄)(y_i − ȳ)
- Corr(X, Y) = r = Σ (x_i − x̄)(y_i − ȳ) / [Σ (x_i − x̄)² Σ (y_i − ȳ)²]^(1/2)
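As a quick check of these formulas in the software used at the end of these slides, PROC CORR with the COV option prints the sample covariance matrix alongside Pearson's r. A minimal sketch, assuming a placeholder data set XY with numeric variables x and y:

Proc Corr Data=XY Cov Pearson;
   Title 'Sample Covariance and Pearson Correlation';
   Var x y;
Run;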

Hypothesis Test for ρ
- Null: H_0: ρ = 0
- Test statistic: t = r √(n − 2) / √(1 − r²), which has a t distribution with n − 2 degrees of freedom under H_0
- Alternative H_A: ρ ≠ 0; reject H_0 if |t| > t_{n−2, α/2}
- Alternative H_A: ρ > 0; reject H_0 if t > t_{n−2, α}
- Alternative H_A: ρ < 0; reject H_0 if t < −t_{n−2, α}
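PROC CORR already reports the two-sided p-value for H_0: ρ = 0, but the t statistic above is easy to verify in a data step. A minimal sketch, with r = 0.90 and n = 12 chosen purely as hypothetical illustration values:

Data CorrTest;
   r = 0.90;                                * hypothetical sample correlation;
   n = 12;                                  * hypothetical sample size;
   t = r * sqrt(n - 2) / sqrt(1 - r**2);    * test statistic;
   p = 2 * (1 - probt(abs(t), n - 2));      * two-sided p-value;
Run;
Proc Print Data=CorrTest;
Run;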

Rank Correlation (Spearman's)
- The sample correlation r can be affected by extreme observations.
- Spearman's rank correlation:
  - First rank the x_i and the y_i, then calculate the sample correlation of these ranks.
  - r_s = 1 − 6(Σ d_i²) / [n(n² − 1)]
  - where d_i is the difference between the ranks of the i-th pair.
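In SAS the ranking does not have to be done by hand: the SPEARMAN option of PROC CORR computes r_s directly. A minimal sketch, again assuming the placeholder data set XY with variables x and y:

Proc Corr Data=XY Spearman;
   Title 'Spearman Rank Correlation';
   Var x y;
Run;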

Linear Regression
- Find/define the relationship between a dependent variable and an independent variable.
- Use the independent variable to explain the behavior of the dependent variable.
- Separate the variation in the data into explained variation and unexplained variation (noise).
- Predict the value of the dependent variable given a value of the independent variable.

Linear Regression Model
- Predict Y given X.
- E(Y | X = x) = β_0 + β_1 x
- Y_i = β_0 + β_1 x_i + ε_i
- Assumptions:
  - the ε_i are random variables
  - E[ε_i] = 0
  - E[ε_i²] = σ²
  - E[ε_i ε_k] = 0 for i ≠ k; the errors are uncorrelated
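The later slides use b for the fitted slope and ŷ_i for fitted values without defining them, so it may help to record the standard least-squares estimates here (these are textbook results, not stated on the slides):

b_1 = Σ (x_i − x̄)(y_i − ȳ) / Σ (x_i − x̄)²   (the slope estimate, written simply as b on the later slides)
b_0 = ȳ − b_1 x̄   (the intercept estimate)
ŷ_i = b_0 + b_1 x_i   (fitted values),  e_i = y_i − ŷ_i   (residuals)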

Sum of Squares
- Total sum of squares = regression sum of squares + error sum of squares
- SST = SSR + SSE
- Σ (y_i − ȳ)² = Σ (ŷ_i − ȳ)² + Σ e_i²
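A standard companion fact, added here because it explains the n − 2 divisor that appears two slides below: the degrees of freedom decompose the same way as the sums of squares,

df(SST) = df(SSR) + df(SSE),  i.e.  (n − 1) = 1 + (n − 2).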

Coefficient of Determination (R²)
- Measures how well X explains the variation in Y.
- R² = SSR/SST = 1 − SSE/SST = r²  (in simple linear regression, R² equals the squared sample correlation)
- R² is the proportion of the variation in the data explained by the regression.
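A quick worked example with purely hypothetical numbers: if SST = 200 and SSE = 50, then R² = 1 − 50/200 = 0.75, so the regression explains 75% of the variation in Y.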

Confidence Interval for the Slope
- Error variance: S_e² = Σ e_i² / (n − 2) = SSE / (n − 2)
- Unbiased estimate of σ_b², the variance of the slope estimate b: S_b² = S_e² / Σ (x_i − x̄)²
- t = (b − β) / S_b has a t distribution with n − 2 degrees of freedom
- Confidence interval for the regression slope: b − t_{n−2, α/2} S_b < β < b + t_{n−2, α/2} S_b
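PROC REG prints this interval directly: the CLB option on the MODEL statement requests confidence limits for the parameter estimates. A minimal sketch using the Height data set from the inches-centimeter example at the end of these slides, at the 95% level:

Proc Reg Data=Height;
   Title '95% Confidence Limits for the Regression Slope';
   Model inches = centimeter / clb alpha=0.05;
Run;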

Regression Slope Tests
- H_0: β = β_0 or H_0: β ≤ β_0 vs. H_1: β > β_0
  - Reject H_0 if (b − β_0)/S_b > t_{n−2, α}
- H_0: β = β_0 or H_0: β ≥ β_0 vs. H_1: β < β_0
  - Reject H_0 if (b − β_0)/S_b < −t_{n−2, α}
- H_0: β = β_0 vs. H_1: β ≠ β_0
  - Reject H_0 if (b − β_0)/S_b > t_{n−2, α/2} or (b − β_0)/S_b < −t_{n−2, α/2}
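The t test PROC REG prints for each coefficient corresponds to the hypothesized value 0; for a nonzero β_0 the TEST statement can be used, which reports an equivalent F test (F = t² for a single one-degree-of-freedom hypothesis). A minimal sketch on the Height data, with 0.3937 (about 1/2.54 inches per centimeter) chosen only as an illustrative hypothesized slope:

Proc Reg Data=Height;
   Model inches = centimeter;    * the default output includes the t test of H_0: slope = 0;
   Test centimeter = 0.3937;     * F test of H_0: slope = 0.3937 (illustrative value);
Run;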

SAS: Inches-Centimeter

Data Height;
   Input inches centimeter;
   * paired (inches, centimeter) data values omitted;
   Datalines;
;
Proc Plot Data=Height;
   Plot inches*centimeter;
Proc Corr Data=Height;
   Title 'Correlation Matrix of Inches vs. Centimeter';
   Var inches centimeter;
Proc Reg Data=Height;
   Title 'Regression Line for Inches-Centimeter Data';
   Model inches=centimeter;
   Plot Predicted.*centimeter = 'P'
        U95M.*centimeter = '-' L95M.*centimeter = '_'
        inches*centimeter = '*' / overlay;
   Plot Residual.*centimeter = 'o';
Run;
Quit;

SAS: GRE-GPA Data

Data GRE_GPA;
   Input GRE GPA;
   * paired (GRE, GPA) data values omitted;
   Datalines;
;
Proc Plot Data=GRE_GPA;
   Plot GRE*GPA;
Proc Corr Data=GRE_GPA;
   Title 'Correlation Matrix of GRE vs. GPA';
   Var GRE GPA;
Proc Reg Data=GRE_GPA;
   Title 'Regression Line for GRE-GPA Data';
   Model GPA=GRE;
   Plot Predicted.*GRE = 'P'
        U95M.*GRE = '-' L95M.*GRE = '_'
        GPA*GRE = '*' / overlay;
   Plot Residual.*GRE = 'o';
Run;
Quit;