Correlation. The statistic r is called Pearson's correlation coefficient.


1 Correlation

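2 Definition: The statistic $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$ is called Pearson's correlation coefficient.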
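
As an illustration, Pearson's correlation coefficient can be computed directly from the sums of squares and cross-products, r = S_xy / sqrt(S_xx · S_yy). The sketch below assumes that standard form; the function name `pearson_r` and the five data points are made up for illustration and do not come from the presentation.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustrative data (hypothetical, not from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(xs, ys)
print(round(r, 4))  # S_xy = 6, S_xx = 10, S_yy = 6, so r = 6 / sqrt(60) ≈ 0.7746
```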

3 Properties: 1. -1 ≤ r ≤ 1, i.e. |r| ≤ 1, r² ≤ 1. 2. |r| = 1 (r = +1 or -1) if the points (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ) lie along a straight line (positive slope for +1, negative slope for -1).

4 Proof Uses the Cauchy-Schwarz inequality

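5 Cauchy-Schwarz Inequality: Let $u_1, \dots, u_n$ and $v_1, \dots, v_n$ be real numbers. Then $\left(\sum_{i=1}^{n} u_i v_i\right)^2 \le \left(\sum_{i=1}^{n} u_i^2\right)\left(\sum_{i=1}^{n} v_i^2\right)$, and equality holds if $v_i = b u_i$ for some b and i = 1, 2, …, n.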

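6 Proof: Let $h(b) = \sum_{i=1}^{n} (v_i - b u_i)^2 = \sum v_i^2 - 2b \sum u_i v_i + b^2 \sum u_i^2$; then $h(b) \ge 0$ for every b. This is a quadratic function of b and has a minimum when $b = b_{min} = \frac{\sum u_i v_i}{\sum u_i^2}$.

7 $h(b_{min}) = \sum v_i^2 - \frac{\left(\sum u_i v_i\right)^2}{\sum u_i^2} \ge 0$, or $\frac{\left(\sum u_i v_i\right)^2}{\sum u_i^2} \le \sum v_i^2$, hence $\left(\sum u_i v_i\right)^2 \le \left(\sum u_i^2\right)\left(\sum v_i^2\right)$.

8 Thus $\left(\sum u_i v_i\right)^2 \le \left(\sum u_i^2\right)\left(\sum v_i^2\right)$, and $\left(\sum u_i v_i\right)^2 = \left(\sum u_i^2\right)\left(\sum v_i^2\right)$ if $h(b_{min}) = 0$, i.e. $v_i = b_{min} u_i$ for i = 1, 2, …, n.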
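
The Cauchy-Schwarz argument can be checked numerically. The sketch below assumes the usual form of the proof, in which h(b) = Σ(v_i − b·u_i)² is minimised at b_min = Σu_i·v_i / Σu_i²; the helper `quad_min` and the numbers are hypothetical and only illustrate the inequality and its equality case.

```python
def quad_min(us, vs):
    """Return (b_min, h(b_min)) for h(b) = sum((v_i - b*u_i)**2)."""
    s_uu = sum(u * u for u in us)
    s_uv = sum(u * v for u, v in zip(us, vs))
    s_vv = sum(v * v for v in vs)
    b_min = s_uv / s_uu                 # vertex of the quadratic in b
    h_min = s_vv - s_uv ** 2 / s_uu     # value of h at the vertex
    return b_min, h_min

# Made-up vectors (hypothetical data).
us = [1.0, -2.0, 3.0, 0.5]
vs = [2.0, 1.0, -1.0, 4.0]
b_min, h_min = quad_min(us, vs)

lhs = sum(u * v for u, v in zip(us, vs)) ** 2
rhs = sum(u * u for u in us) * sum(v * v for v in vs)
print(lhs <= rhs, h_min >= 0)   # the inequality and h(b_min) >= 0 hold together

# Equality case: v_i = b * u_i makes h(b_min) vanish.
vs_prop = [3.0 * u for u in us]
b_eq, h_eq = quad_min(us, vs_prop)
print(b_eq, h_eq)
```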

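9 Finally, taking $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$ gives $\left(\sum (x_i - \bar{x})(y_i - \bar{y})\right)^2 \le \sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2$, or $r^2 \le 1$, i.e. $-1 \le r \le 1$.

10 Also $r^2 = 1$, i.e. $|r| = 1$, if and only if $y_i - \bar{y} = b (x_i - \bar{x})$ for i = 1, 2, …, n, or $y_i = a + b x_i$ with $a = \bar{y} - b \bar{x}$.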

11 Note: and

12 Properties of Pearson's correlation coefficient r: 1. The value of r is always between -1 and +1. 2. If the relationship between X and Y is positive, then r will be positive. 3. If the relationship between X and Y is negative, then r will be negative. 4. If there is no relationship between X and Y, then r will be close to zero. 5. The value of r will be +1 if the points (x_i, y_i) lie on a straight line with positive slope. 6. The value of r will be -1 if the points (x_i, y_i) lie on a straight line with negative slope.
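
The two extreme cases in the list above (points exactly on a line) are easy to confirm numerically. This is a small sketch with made-up lines; `pearson_r` is a hypothetical helper written for this example, and floating-point rounding means the results are only equal to ±1 up to a tiny tolerance.

```python
import math

def pearson_r(xs, ys):
    """r = S_xy / sqrt(S_xx * S_yy), computed about the sample means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
up = [1.0 + 2.0 * x for x in xs]      # straight line, positive slope
down = [5.0 - 0.5 * x for x in xs]    # straight line, negative slope

r_up = pearson_r(xs, up)
r_down = pearson_r(xs, down)
print(r_up, r_down)                   # approximately +1 and -1
```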

13 r =1

14 r = 0.95

15 r = 0.7

16 r = 0.4

17 r = 0

18 r = -0.4

19 r = -0.7

20 r = -0.8

21 r = -0.95

22 r = -1

23 The test for independence (zero correlation). H_0: X and Y are independent. H_A: X and Y are correlated. The test statistic: $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$. The critical region: reject H_0 if $|t| > t_{\alpha/2}$ (df = n - 2). This is a two-tailed critical region; the critical region could also be one-tailed.

24 Example: In this example we are studying building fires in a city and are interested in the relationship between: 1. X = the distance between the closest fire hall and the building that puts out the alarm, and 2. Y = the cost of the damage (in $1000s). The data were collected on n = 15 fires.

25 The Data

26 Scatter Plot

27 Computations

28 Computations Continued

29

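30 The correlation coefficient: r = 0.961. The test for independence (zero correlation). The test statistic: $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} \approx 12.5$ (using the rounded value r = 0.961 and n = 15). We reject H_0: independence, if $|t| > t_{0.025} = 2.160$ (df = n - 2 = 13). H_0: independence, is rejected.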
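
The decision for the fire example can be reproduced with a few lines of arithmetic. The sketch assumes the usual test statistic t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom; r = 0.961, n = 15 and the critical value 2.160 are taken from the presentation, while the function name `t_statistic` is hypothetical. Because r is rounded to three decimals, the resulting t is approximate.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), referred to t with n - 2 df."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r = 0.961        # correlation reported for the fire data
n = 15           # number of fires
t_crit = 2.160   # t_{0.025} with df = 13, as quoted on the slide

t = t_statistic(r, n)
reject = abs(t) > t_crit
print(round(t, 2), reject)   # t is roughly 12.5, far beyond 2.160
```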

31 Relationship between Regression and Correlation

32 Recall and since

33 The test for independence (zero correlation) Uses the test statistic: H 0 : X and Y are independent H A : X and Y are correlated Note: and

34 The two tests: 1. The test for independence (zero correlation), H_0: X and Y are independent, H_A: X and Y are correlated; 2. The test for zero slope, H_0: β = 0, H_A: β ≠ 0; are equivalent.
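
The equivalence of the two tests can be seen numerically: the t statistic for the least-squares slope and the t statistic built from r come out the same. The sketch below assumes the standard simple-regression formulas (slope b = S_xy / S_xx, standard error s / sqrt(S_xx) with s² = RSS / (n − 2)); the data are made up for illustration.

```python
import math

# Made-up illustrative data (hypothetical, not the fire data).
xs = [2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
ys = [3.1, 4.9, 6.2, 7.8, 9.1, 10.8]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Test for zero slope: t = b / (s / sqrt(S_xx)).
b = s_xy / s_xx
a = y_bar - b * x_bar
rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(rss / (n - 2))
t_slope = b / (s / math.sqrt(s_xx))

# Test for zero correlation: t = r * sqrt(n - 2) / sqrt(1 - r^2).
r = s_xy / math.sqrt(s_xx * s_yy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

print(t_slope, t_corr)   # the two values agree up to rounding error
```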

35 The Coefficient of Determination

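36 The Residual Sum of Squares in Regression. Note: $RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}} = (1 - r^2) S_{yy}$, where $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{yy} = \sum (y_i - \bar{y})^2$ and $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$.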

37 Proof Total Variance in Y = Variance Unexplained +Variance Explained

38 Proportion of Variance Unexplained = 1 - r². Proportion of Variance Explained = 1 - Proportion of Variance Unexplained = r². r² is called the Coefficient of Determination.
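
The split of the total variation in Y into an unexplained and an explained part can be verified on a small data set. The sketch assumes a least-squares line fitted to made-up data; the variable names are hypothetical. It checks that the residual sum of squares is the fraction 1 − r² of the total, and the explained sum of squares the fraction r².

```python
import math

# Made-up illustrative data (hypothetical, not the fire data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)   # total variation in Y
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares line and its fitted values.
b = s_xy / s_xx
a = y_bar - b * x_bar
fitted = [a + b * x for x in xs]

unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # residual SS
explained = sum((f - y_bar) ** 2 for f in fitted)            # regression SS

r = s_xy / math.sqrt(s_xx * s_yy)
print(unexplained / s_yy, 1 - r * r)   # proportion unexplained
print(explained / s_yy, r * r)         # proportion explained = r^2
```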

39 Example: Fire Example. r = 0.961. r² = the Coefficient of Determination = 0.961² = 0.923, so 92.3% = Proportion of Variance in Y (cost of damage) explained by X (distance to closest fire hall). Proportion of Variance Unexplained = 1 - r² = 1 - 0.923 = 0.077 (7.7%).

