1
Correlation
2
Definition: The linear correlation coefficient r measures the strength of the linear relationship between paired quantitative x- and y-values in a sample. We can often see a relationship between two variables by constructing a scatterplot.
Page 518 of Elementary Statistics, 10th Edition
3
Scatterplots of Paired Data
Page 519 of Elementary Statistics, 10th Edition
4
Scatterplots of Paired Data
Page 519 of Elementary Statistics, 10th Edition
5
Requirements
1. The sample of paired (x, y) data is a random sample.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line pattern.
3. Any outliers must be removed if they are known to be errors.
Page 520 of Elementary Statistics, 10th Edition
Explain to students the difference between the "paired" data of this chapter and the investigation of two groups of data in Chapter 9.
6
Notation for the Linear Correlation Coefficient
n represents the number of pairs of data present.
$\sum$ denotes the addition of the items indicated.
$\sum x$ denotes the sum of all x-values.
$\sum x^2$ indicates that each x-value should be squared and then those squares added.
$(\sum x)^2$ indicates that the x-values should be added and the total then squared.
$\sum xy$ indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
r represents the linear correlation coefficient for a sample.
$\rho$ represents the linear correlation coefficient for a population.
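The difference between $\sum x^2$ and $(\sum x)^2$ is easy to mix up, so a short Python sketch may help. The data values here are an assumption: they were chosen because they reproduce the sums that appear in the worked example later in these slides ($\sum x = 12$, $\sum y = 23$, $\sum xy = 61$).

```python
# Paired sample (assumed values, chosen to match the sums in the worked example).
x = [3, 1, 3, 5]
y = [5, 8, 6, 4]

n = len(x)                                  # number of pairs
sum_x = sum(x)                              # sum of all x-values
sum_y = sum(y)                              # sum of all y-values
sum_x_sq = sum(v * v for v in x)            # square each x-value, then add
sq_sum_x = sum_x ** 2                       # add the x-values, then square the total
sum_xy = sum(a * b for a, b in zip(x, y))   # multiply each pair, then add

print(n, sum_x, sum_y, sum_x_sq, sq_sum_x, sum_xy)  # 4 12 23 44 144 61
```

Note that squaring before adding gives 44, while adding before squaring gives 144.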
7
Formula
The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample.
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$
8
Example: Calculating r
Using the simple random sample of data below, find the value of r.
Data
x: 3 1 3 5
y: 5 8 6 4
Page 521 of Elementary Statistics, 10th Edition
9
Example: Calculating r - cont
page 522 of Elementary Statistics, 10th Edition.
10
Example: Calculating r - cont
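Data
x: 3 1 3 5
y: 5 8 6 4
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$
$$r = \frac{4(61) - (12)(23)}{\sqrt{4(44) - (12)^2}\,\sqrt{4(141) - (23)^2}} = \frac{-32}{33.466} = -0.956$$
Page 522 of Elementary Statistics, 10th Edition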
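The hand calculation can be checked with a few lines of Python. This is a minimal sketch, not the textbook's procedure: the function name `linear_correlation` is made up for illustration, and the data pairs are the ones consistent with the sums used on this slide (n = 4, $\sum x = 12$, $\sum y = 23$, $\sum xy = 61$, $\sum x^2 = 44$, $\sum y^2 = 141$).

```python
import math

def linear_correlation(x, y):
    """Sample linear correlation coefficient r, using the sums-based formula."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [3, 1, 3, 5]
y = [5, 8, 6, 4]
r = linear_correlation(x, y)
print(round(r, 3))  # -0.956
```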
11
Properties of the Linear Correlation Coefficient r
1. The value of r is always between –1 and 1 inclusive: –1 ≤ r ≤ 1.
2. The value of r does not change if all values of either variable are converted to a different scale.
3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the value of r will not change.
4. r measures the strength of a linear relationship.
Page 524 of Elementary Statistics, 10th Edition
If using a graphics calculator for demonstration, it will be an easy exercise to switch the x and y values to show that the value of r will not change.
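Properties 2 and 3 can also be demonstrated in code rather than on a calculator. The sketch below is illustrative only: `pearson_r` is a helper written for this example, the data are the four pairs consistent with the worked example's sums, and the unit conversions (factor 2.54, and 1.8·y + 32) are arbitrary assumptions. Both conversions use positive factors; a negative factor would flip the sign of r.

```python
import math

def pearson_r(x, y):
    """Sample correlation computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [3, 1, 3, 5]
y = [5, 8, 6, 4]
r_original = pearson_r(x, y)

# Property 2: convert each variable to a different scale (assumed conversions).
x_rescaled = [2.54 * v for v in x]
y_rescaled = [1.8 * v + 32 for v in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)

# Property 3: interchange the roles of x and y.
r_swapped = pearson_r(y, x)

print(math.isclose(r_original, r_rescaled), math.isclose(r_original, r_swapped))
```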
12
Interpreting r : Explained Variation
The value of $r^2$ is the proportion of the variation in y that is explained by the linear relationship between x and y. For example, if r = 0.926, we get $r^2 = 0.857$. We conclude that 0.857 (or about 86%) of the variation in y can be explained by the linear relationship between x and y. This implies that about 14% of the variation in y cannot be explained by the linear relationship between x and y.
Page 524 of Elementary Statistics, 10th Edition
13
Formal Hypothesis Test
We wish to determine whether there is a significant linear correlation between two variables.
$H_0: \rho = 0$ (no significant linear correlation)
$H_1: \rho \neq 0$ (significant linear correlation)
Page 527 of Elementary Statistics, 10th Edition
14
Test Statistic is t
$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$
Critical values: Use Tables with degrees of freedom = n – 2
This is the first example in the text where the degrees of freedom for Table A-3 is different from n – 1. Special note should be made of this.
15
P-value: Use Tables with degrees of freedom = n – 2
Conclusion: If the absolute value of t is greater than the critical value, reject H0 and conclude that there is a linear correlation. If the absolute value of t is less than or equal to the critical value, fail to reject H0; there is not sufficient evidence to conclude that there is a linear correlation.
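The decision rule can be sketched in Python. Several things here are assumptions and not part of the slides: the test statistic is taken in its usual form $t = r / \sqrt{(1 - r^2)/(n - 2)}$, the function names are invented for illustration, the input is the small four-pair sample from the earlier calculation of r (not the textbook's own hypothesis-test example), and the critical value 4.303 is the two-tailed t table entry for a 0.05 significance level with 2 degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 pairs for n - 2 degrees of freedom")
    return r / math.sqrt((1 - r ** 2) / (n - 2))

def decide(t, critical_value):
    """Apply the slide's rule: compare |t| with the critical value."""
    if abs(t) > critical_value:
        return "reject H0"
    return "fail to reject H0"

r = -32 / math.sqrt(1120)   # about -0.956, from the four-pair example
n = 4
t = correlation_t_statistic(r, n)

critical_value = 4.303      # assumed: two-tailed, alpha = 0.05, df = n - 2 = 2
decision = decide(t, critical_value)
print(round(t, 3), decision)  # -4.619 reject H0
```

The critical value is passed in by hand to mirror looking it up in a table; a statistics library could supply it instead.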
16
Test Statistic is t (follows format of earlier chapters)
This is the drawing used to verify the position of the sample data t value in regard to the critical t values for the example which begins on page 527 of Elementary Statistics, 10th Edition. Drawing is at the top of page 528.
17
Covariance: a measure of the linear relationship between variables
If the relationship between the random variables is nonlinear, the covariance might not be sensitive to the relationship.
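A small numeric illustration of this point, offered as a sketch: the helper `sample_covariance` uses the usual n – 1 divisor, and the data are made up. With x-values symmetric about zero, y = x² depends completely on x, yet the covariance comes out as zero.

```python
def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

x = [-2, -1, 0, 1, 2]              # hypothetical values, symmetric about 0
y_linear = [2 * v for v in x]      # exact linear relationship
y_quadratic = [v * v for v in x]   # exact nonlinear relationship

cov_linear = sample_covariance(x, y_linear)
cov_quadratic = sample_covariance(x, y_quadratic)
print(cov_linear, cov_quadratic)   # 5.0 0.0
```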
18
Pearson's Correlation Coefficient
Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}$$
The above formula defines the population correlation coefficient, commonly represented by the Greek letter ρ (rho). Substituting estimates of the covariances and variances based on a sample gives the sample correlation coefficient, commonly denoted r:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
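The definition translates directly into code. The following sketch computes the sample coefficient as covariance divided by the product of standard deviations; the helper names are illustrative, and the data are the four pairs consistent with the sums in the earlier worked example, so the result should agree with the value found there.

```python
import math

def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_std(x):
    """Sample standard deviation, as the square root of cov(x, x)."""
    return math.sqrt(sample_covariance(x, x))

def pearson_from_definition(x, y):
    """r = cov(x, y) / (s_x * s_y)."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))

x = [3, 1, 3, 5]
y = [5, 8, 6, 4]
r_def = pearson_from_definition(x, y)
print(round(r_def, 3))  # -0.956
```

The n – 1 divisors cancel between numerator and denominator, which is why this matches the sums-based formula.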
19
Spearman correlation coefficient
The Spearman correlation coefficient is often thought of as being the Pearson correlation coefficient between the ranked variables. In practice, however, a simpler procedure is normally used to calculate ρ. The n raw scores Xi, Yi are converted to ranks xi, yi, and the differences di = xi − yi between the ranks of each observation on the two variables are calculated.
Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
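The slide stops at the rank differences. One common way to finish the calculation, assuming there are no tied values, is the shortcut $\rho = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)}$; that formula is not stated in the slide and is used here as an assumption. The function names and the data are hypothetical.

```python
def ranks(values):
    """Ranks from 1 (smallest) to n. This sketch assumes no tied values."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(raw_x, raw_y):
    """Spearman's rho from rank differences (no-ties shortcut formula)."""
    n = len(raw_x)
    rank_x, rank_y = ranks(raw_x), ranks(raw_y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

# Hypothetical raw scores for five observations.
raw_x = [86, 97, 99, 100, 101]
raw_y = [2, 20, 28, 27, 50]

rho = spearman_rho(raw_x, raw_y)
print(round(rho, 3))  # 0.9
```

Only the third and fourth observations swap rank order, so the sum of squared differences is 2 and ρ = 1 − 12/120 = 0.9.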
20
A Spearman correlation of 1 results when the two variables being compared are monotonically related, even if their relationship is not linear. In contrast, a monotonic but nonlinear relationship does not give a perfect Pearson correlation.
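A quick check of this contrast, as a sketch with made-up data: y = x³ is strictly increasing but not linear. Here the Spearman coefficient is computed as the Pearson coefficient of the ranks, and the helpers `pearson_r` and `ranks` are illustrative names that assume no tied values.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks from 1 to n; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but not linear

pearson = pearson_r(x, y)
spearman = pearson_r(ranks(x), ranks(y))
print(round(pearson, 3), round(spearman, 3))  # 0.943 1.0
```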
21
When the data are roughly elliptically distributed and there are no prominent outliers, the Spearman correlation and Pearson correlation give similar values.
22
The Spearman correlation is less sensitive than the Pearson correlation to strong outliers that are in the tails of both samples.
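The effect of a single strong outlier can be illustrated with a small, entirely hypothetical data set: six points with a perfect decreasing relationship, plus one extreme point (100, 100) in the upper tail of both variables. The helpers are illustrative and assume no tied values; Spearman is computed as the Pearson coefficient of the ranks.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_r(x, y):
    """Pearson correlation of the ranks; assumes no tied values."""
    def ranks(values):
        ordered = sorted(values)
        return [ordered.index(v) + 1 for v in values]
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: perfectly decreasing relationship.
x = [1, 2, 3, 4, 5, 6]
y = [6, 5, 4, 3, 2, 1]

# Same data with one extreme point added in the tails of both samples.
x_out = x + [100]
y_out = y + [100]

pearson_shift = pearson_r(x_out, y_out) - pearson_r(x, y)
spearman_shift = spearman_r(x_out, y_out) - spearman_r(x, y)
print(round(pearson_r(x_out, y_out), 3), round(spearman_r(x_out, y_out), 3))
print(round(pearson_shift, 3), round(spearman_shift, 3))
```

In this example the outlier pulls the Pearson coefficient from –1 to nearly +1, while the Spearman coefficient moves from –1 to –0.25: still a large change, but a much smaller one.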