Presentation is loading. Please wait.

Presentation is loading. Please wait.

Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)

Similar presentations


Presentation on theme: "Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)"— Presentation transcript:

1 Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)

2 Goals ⇨ Introduce concepts of  Covariance  Correlation ⇨ Develop computational formulas 2R F Riesenfeld Sp 2010CS5961 Comp Stat

3 Covariance ⇨ Variables may change in relation to each other ⇨ measures how much the movement in one variable predicts the movement in a corresponding variable ⇨ Covariance measures how much the movement in one variable predicts the movement in a corresponding variable 3R F Riesenfeld Sp 2010CS5961 Comp Stat

4 Smoking and Lung Capacity ⇨ Example: investigate relationship between and ⇨ Example: investigate relationship between cigarette smoking and lung capacity ⇨ Data: sample group response data on smoking habits, measured lung capacities, respectively ⇨ Data: sample group response data on smoking habits, and measured lung capacities, respectively R F Riesenfeld Sp 2010CS5961 Comp Stat4

5 Smoking v Lung Capacity Data 5 N Cigarettes ( X )Lung Capacity ( Y ) 1045 2542 31033 41531 52029 R F Riesenfeld Sp 2010CS5961 Comp Stat

6 Smoking and Lung Capacity 6

7 Smoking v Lung Capacity ⇨ Observe that as smoking exposure goes up, corresponding lung capacity goes down ⇨ Variables inversely ⇨ Variables covary inversely ⇨ and quantify relationship ⇨ Covariance and Correlation quantify relationship 7R F Riesenfeld Sp 2010CS5961 Comp Stat

8 Covariance ⇨ Variables that inversely, like smoking and lung capacity, tend to appear on opposite sides of the group means ⇨ Variables that covary inversely, like smoking and lung capacity, tend to appear on opposite sides of the group means  When smoking is above its group mean, lung capacity tends to be below its group mean. ⇨ Average measures extent to which variables covary, the degree of linkage between them ⇨ Average product of deviation measures extent to which variables covary, the degree of linkage between them 8R F Riesenfeld Sp 2010CS5961 Comp Stat

9 The Sample Covariance ⇨ Similar to variance, for theoretical reasons, average is typically computed using ( N -1 ), not N. Thus, 9R F Riesenfeld Sp 2010CS5961 Comp Stat

10 Cigs ( X )Lung Cap ( Y ) 045 542 1033 1531 2029 1036 Calculating Covariance R F Riesenfeld Sp 2010CS5961 Comp Stat10

11 Calculating Covariance Cigs (X ) Cap (Y ) 0-10-90945 5-5-30642 1000-333 155-25-531 2010-70-729 ∑= - 215 R F Riesenfeld Sp 2010CS5961 Comp Stat11

12 Evaluation yields, Evaluation yields, 12 Covariance Calculation (2) R F Riesenfeld Sp 2010CS5961 Comp Stat

13 Covariance under Affine Transformation 13R F Riesenfeld Sp 2010CS5961 Comp Stat

14 14 CovarianceunderAffineTransf Covariance under Affine Transf (2) R F Riesenfeld Sp 2010CS5961 Comp Stat

15 (Pearson) Correlation Coefficient r xy ⇨ Like covariance, but uses Z-values instead of deviations. Hence, invariant under linear transformation of the raw data. 15R F Riesenfeld Sp 2010CS5961 Comp Stat

16 Alternative (common) Expression 16R F Riesenfeld Sp 2010CS5961 Comp Stat

17 Computational Formula 1 17R F Riesenfeld Sp 2010CS5961 Comp Stat

18 Computational Formula 2 18R F Riesenfeld Sp 2010CS5961 Comp Stat

19 Table for Calculating r xy 19 Cigs ( X ) X 2 XYY 2 Cap (Y ) 0 0 0202545 5 25210176442 10100330108933 1522546596131 2040058084129 ∑= 5075015856680180 R F Riesenfeld Sp 2010CS5961 Comp Stat

20 Computing from Table Computing r xy from Table 20R F Riesenfeld Sp 2010CS5961 Comp Stat

21 Computing Correlation 21R F Riesenfeld Sp 2010CS5961 Comp Stat

22 Conclusion ⇨ r xy = -0.96 implies almost certainty smoker will have diminish lung capacity ⇨ Greater smoking exposure implies greater likelihood of lung damage 22R F Riesenfeld Sp 2010CS5961 Comp Stat

23 End Covariance & Correlation 23 Notes R F Riesenfeld Sp 2010CS5961 Comp Stat


Download ppt "Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)"

Similar presentations


Ads by Google