Published by Randolph Washington. Modified over 9 years ago.
1
Correlation
2
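Definition
The statistic $r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2}}$ is called Pearson's correlation coefficient.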
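The definition can be checked numerically. The following is a minimal sketch in Python, assuming the standard sum-of-products form of Pearson's r; the function name `pearson_r` and the small data set are illustrative and do not come from the presentation.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient computed from centred sums."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustrative data (not from the presentation).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)
print(round(r, 4))
```

The function raises a division error when either variable is constant, since r is undefined in that case.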
3
Properties
1. -1 ≤ r ≤ 1, |r| ≤ 1, r^2 ≤ 1
2. |r| = 1 (r = +1 or -1) if the points (x_1, y_1), (x_2, y_2), …, (x_n, y_n) lie along a straight line (positive slope for +1, negative slope for -1).
4
Proof: uses the Cauchy-Schwarz inequality.
5
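Cauchy-Schwarz Inequality
Let $u_1, \dots, u_n$ and $v_1, \dots, v_n$ be real numbers; then $\left(\sum_{i=1}^{n} u_i v_i\right)^2 \le \left(\sum_{i=1}^{n} u_i^2\right)\left(\sum_{i=1}^{n} v_i^2\right)$, and equality holds if $v_i = b u_i$ for some b and i = 1, 2, …, n.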
6
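Proof:
Let $Q(b) = \sum_{i=1}^{n} (v_i - b u_i)^2 \ge 0$; then $Q(b) = \sum v_i^2 - 2b \sum u_i v_i + b^2 \sum u_i^2$. This is a quadratic function of b and has a minimum when $b = b_{\min} = \frac{\sum u_i v_i}{\sum u_i^2}$.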
7
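$\sum (v_i - b_{\min} u_i)^2 = \sum v_i^2 - \frac{\left(\sum u_i v_i\right)^2}{\sum u_i^2} \ge 0$, or $\frac{\left(\sum u_i v_i\right)^2}{\sum u_i^2} \le \sum v_i^2$, hence $\left(\sum u_i v_i\right)^2 \le \left(\sum u_i^2\right)\left(\sum v_i^2\right)$.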
8
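Thus the inequality holds, and $\left(\sum u_i v_i\right)^2 = \left(\sum u_i^2\right)\left(\sum v_i^2\right)$ if $\sum (v_i - b_{\min} u_i)^2 = 0$, i.e. $v_i = b_{\min} u_i$ for i = 1, 2, …, n.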
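The argument can be checked numerically. The sketch below assumes the proof minimises the sum of squares of v_i - b u_i over b, the usual route to this inequality; the helper name `cauchy_schwarz_check` and the vectors are illustrative only.

```python
def cauchy_schwarz_check(u, v):
    """Return (lhs, rhs, b_min, q_min) for the Cauchy-Schwarz inequality.

    lhs   = (sum u_i v_i)^2
    rhs   = (sum u_i^2)(sum v_i^2)
    b_min = minimiser of sum (v_i - b u_i)^2
    q_min = value of that sum at b_min (always >= 0)
    """
    s_uu = sum(ui * ui for ui in u)
    s_vv = sum(vi * vi for vi in v)
    s_uv = sum(ui * vi for ui, vi in zip(u, v))
    b_min = s_uv / s_uu
    q_min = sum((vi - b_min * ui) ** 2 for ui, vi in zip(u, v))
    return s_uv ** 2, s_uu * s_vv, b_min, q_min

# Made-up vectors: a general case and a proportional case (v = 2.5 u).
u = [1.0, 2.0, 3.0]
lhs, rhs, b_min, q_min = cauchy_schwarz_check(u, [2.0, 1.0, 4.0])
lhs_eq, rhs_eq, b_eq, q_eq = cauchy_schwarz_check(u, [2.5, 5.0, 7.5])
print(lhs <= rhs, abs(lhs_eq - rhs_eq) < 1e-9)
```

In the proportional case the minimum of the quadratic is zero, which is exactly the condition for equality.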
9
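Finally, taking $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$: $\left[\sum (x_i-\bar{x})(y_i-\bar{y})\right]^2 \le \sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2$, or $r^2 \le 1$, i.e. -1 ≤ r ≤ 1.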
10
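Also $r^2 = 1$, i.e. |r| = 1, if and only if $y_i - \bar{y} = b\,(x_i - \bar{x})$ for i = 1, 2, …, n, or $y_i = a + b x_i$ with $a = \bar{y} - b\bar{x}$.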
11
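Note: $b_{\min} = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2}$ and $r = b_{\min}\sqrt{\frac{\sum (x_i-\bar{x})^2}{\sum (y_i-\bar{y})^2}}$, so r has the same sign as the slope.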
12
Properties of Pearson's correlation coefficient r
1. The value of r is always between -1 and +1.
2. If the relationship between X and Y is positive, then r will be positive.
3. If the relationship between X and Y is negative, then r will be negative.
4. If there is no linear relationship between X and Y, then r will be zero.
5. The value of r will be +1 if the points (x_i, y_i) lie on a straight line with positive slope.
6. The value of r will be -1 if the points (x_i, y_i) lie on a straight line with negative slope.
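These properties can be illustrated with a few made-up point sets. The sketch below assumes the standard formula for r; the function name and data are hypothetical. The last case is an added illustration that r measures linear association only: points on a parabola are perfectly related yet give r = 0.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient computed from centred sums."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Points exactly on a line with positive slope: r should be +1.
r_up = pearson_r(x, [3.0 + 2.0 * xi for xi in x])
# Points exactly on a line with negative slope: r should be -1.
r_down = pearson_r(x, [3.0 - 2.0 * xi for xi in x])
# Points on a symmetric parabola: a strong but non-linear relationship.
r_curve = pearson_r(x, [xi ** 2 for xi in x])

print(r_up, r_down, r_curve)
```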
13
r = 1
14
r = 0.95
15
r = 0.7
16
r = 0.4
17
r = 0
18
r = -0.4
19
r = -0.7
20
r = -0.8
21
r = -0.95
22
r = -1
23
The test for independence (zero correlation)
H_0: X and Y are independent
H_A: X and Y are correlated
The test statistic: $t = r\sqrt{\frac{n-2}{1-r^2}}$
The critical region: Reject H_0 if $|t| > t_{\alpha/2}$ (df = n – 2)
This is a two-tailed critical region; the critical region could also be one-tailed.
24
Example
In this example we are studying building fires in a city and are interested in the relationship between:
1. X = the distance between the closest fire hall and the building that sets off the alarm, and
2. Y = the cost of the damage (in $1000s).
The data were collected on n = 15 fires.
25
The Data
26
Scatter Plot
27
Computations
28
Computations Continued
30
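The correlation coefficient: r = 0.961
The test for independence (zero correlation)
The test statistic: $t = r\sqrt{\frac{n-2}{1-r^2}} \approx 12.5$ (using r = 0.961 and n = 15)
We reject H_0: independence, if $|t| > t_{0.025} = 2.160$ (df = 13).
Since |t| exceeds 2.160, H_0: independence is rejected.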
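The arithmetic of this test can be reproduced with a short sketch. It assumes the usual statistic t = r sqrt(n - 2) / sqrt(1 - r^2), takes r = 0.961 and n = 15 from the fire example, and uses the critical value 2.160 quoted on the slide; the function name `correlation_t` is illustrative.

```python
import math

def correlation_t(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r = 0.961        # correlation coefficient reported in the fire example
n = 15           # number of fires
t_crit = 2.160   # t_0.025 with df = 13, as quoted on the slide

t = correlation_t(r, n)
reject_h0 = abs(t) > t_crit   # two-tailed test
print(f"t = {t:.2f}, reject H0: {reject_h0}")
```

Because r is rounded to three decimals, the computed t is approximate.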
31
Relationship between Regression and Correlation
32
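Recall that the least-squares slope is $\hat{\beta} = \frac{S_{xy}}{S_{xx}}$, and $r = \hat{\beta}\sqrt{\frac{S_{xx}}{S_{yy}}}$ since $r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$, where $S_{xy} = \sum (x_i-\bar{x})(y_i-\bar{y})$, $S_{xx} = \sum (x_i-\bar{x})^2$ and $S_{yy} = \sum (y_i-\bar{y})^2$.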
33
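The test for independence (zero correlation)
H_0: X and Y are independent
H_A: X and Y are correlated
Uses the test statistic: $t = r\sqrt{\frac{n-2}{1-r^2}}$
Note: $r = \hat{\beta}\sqrt{S_{xx}/S_{yy}}$ and $1 - r^2 = RSS/S_{yy}$, so $t = \hat{\beta}\sqrt{S_{xx}}/s$ with $s^2 = RSS/(n-2)$, which is the test statistic for zero slope. Here $\hat{\beta}$ is the least-squares slope, RSS is the residual sum of squares, $S_{xx} = \sum (x_i-\bar{x})^2$ and $S_{yy} = \sum (y_i-\bar{y})^2$.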
34
The two tests
1. The test for independence (zero correlation)
H_0: X and Y are independent
H_A: X and Y are correlated
2. The test for zero slope
H_0: β = 0
H_A: β ≠ 0
are equivalent.
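The equivalence can be verified numerically: the t statistic computed from r and the t statistic computed from the least-squares slope agree. The sketch below uses made-up data and hypothetical helper names, and assumes simple linear regression with an intercept.

```python
import math

def centred_sums(x, y):
    """Return (S_xx, S_yy, S_xy) about the sample means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xx, s_yy, s_xy

def t_from_correlation(x, y):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    s_xx, s_yy, s_xy = centred_sums(x, y)
    r = s_xy / math.sqrt(s_xx * s_yy)
    return r * math.sqrt(len(x) - 2) / math.sqrt(1.0 - r * r)

def t_from_slope(x, y):
    """t = slope / (s / sqrt(S_xx)), with s^2 = RSS / (n - 2)."""
    n = len(x)
    s_xx, _, s_xy = centred_sums(x, y)
    slope = s_xy / s_xx
    intercept = sum(y) / n - slope * sum(x) / n
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    return slope / (s / math.sqrt(s_xx))

# Made-up illustrative data (not from the presentation).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0]
t_corr = t_from_correlation(x, y)
t_slope = t_from_slope(x, y)
print(math.isclose(t_corr, t_slope, rel_tol=1e-9))
```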
35
The Coefficient of Determination
36
The Residual Sum of Squares in Regression: $RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
Note: $RSS = (1 - r^2)\sum_{i=1}^{n} (y_i - \bar{y})^2$
37
Proof
$\sum (y_i-\bar{y})^2 = \sum (y_i-\hat{y}_i)^2 + \sum (\hat{y}_i-\bar{y})^2$
Total Variance in Y = Variance Unexplained + Variance Explained
38
Proportion of Variance Unexplained = $\frac{\sum (y_i-\hat{y}_i)^2}{\sum (y_i-\bar{y})^2} = 1 - r^2$
Proportion of Variance Explained = 1 - Proportion of Variance Unexplained = r^2
r^2 is called the Coefficient of Determination.
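The decomposition can be confirmed on a small data set. The sketch below assumes a least-squares line with an intercept; the function name `variance_proportions` and the data are illustrative, not taken from the presentation.

```python
import math

def variance_proportions(x, y):
    """Return (r_squared, explained, unexplained) for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    # Unexplained variation: residual sum of squares.
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    # Explained variation: fitted values about the mean of y.
    ess = sum((fi - y_bar) ** 2 for fi in fitted)
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return r_squared, ess / s_yy, rss / s_yy

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0]
r_squared, explained, unexplained = variance_proportions(x, y)
print(round(r_squared, 4), round(explained, 4), round(unexplained, 4))
```

The explained and unexplained proportions sum to one, and the explained proportion equals r^2.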
39
Example: Fire Example
r = 0.961
r^2 = the Coefficient of Determination = 0.961^2 ≈ 0.923
92.3% = Proportion of Variance in Y (Cost of Damage) explained by X (distance to closest fire hall).
Proportion of Variance Unexplained = 1 - r^2 = 1 - 0.923 = 0.077 (7.7%)