Slide 1 © 2002 McGraw-Hill Australia, PPTs t/a Introductory Mathematics & Statistics for Business 4e by John S. Croucher
Correlation, Chapter S8
Learning Objectives
– Understand correlation analysis and relationships between variables
– Draw and interpret a scatter diagram
– Understand and calculate the product-moment correlation coefficient
– Understand and calculate the rank correlation coefficient
– Recognise spurious correlation
– Test a correlation coefficient for significance
Slide 2 Correlation coefficient
• The consideration of whether there is any relationship or association between two variables is called correlation analysis.
• The correlation coefficient is the index that measures the strength of the association between two variables.
Slide 3 Dependent and independent variables
• To determine whether the value of one variable can be predicted from the value of the other:
– take a random sample
– record a measurement for each individual in the sample
– each individual has 2 data points
– the data are said to be bivariate (consisting of two variables)
– these data may be written as ordered pairs
Slide 4 Scatter diagrams
A display in which ordered pairs of measurements are plotted on a coordinate axes system.
Slide 5 Dependent variable
The dependent variable is the one whose value is to be predicted. It is usually denoted by the letter y.
Slide 6 Independent variable
The independent variable is the one whose value is used to make the prediction. It is usually denoted by the letter x.
Slide 7 The Pearson product-moment correlation coefficient
• Gives the numerical measure of the degree of association between two variables.
The value of the correlation coefficient calculated from a sample is denoted by the letter r.
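The slide does not show how r is computed. As a minimal sketch, assuming the standard definition r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²), the calculation might look like the following. The function name `pearson_r` and the data are made up for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample product-moment correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical ordered pairs (x, y), invented for this example
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = pearson_r(x_values, y_values)
print(round(r, 3))  # 0.775
```

A value of r close to +1 or -1 indicates a strong linear association; a value near 0 indicates a weak one.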
Slide 8 Positive and negative correlation
1. If two variables x and y are positively correlated, this means that:
– large values of x are associated with large values of y, and
– small values of x are associated with small values of y
2. If two variables x and y are negatively correlated, this means that:
– large values of x are associated with small values of y, and
– small values of x are associated with large values of y
Slide 9 Positive correlation
Slide 10 Negative correlation
Slide 11 The Spearman rank correlation coefficient
• An alternative measure of the degree of association between two variables.
• Does not strictly measure the degree of association between the actual observations but rather the association between the ranks of the observations.
$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$
Where:
d = difference between corresponding pairs of rankings
n = number of pairs of observations
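A small sketch of the rank calculation, assuming the usual formula r_s = 1 − 6Σd² / (n(n² − 1)) with d and n as defined on the slide. The helper names `ranks` and `spearman_rs` and the two sets of rankings are hypothetical, and the sketch assumes there are no tied values (ties need averaged ranks, which are not handled here).

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch assumes no tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    rank_x = ranks(xs)
    rank_y = ranks(ys)
    # d = difference between corresponding pairs of rankings
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical rankings of five items by two judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
r_s = spearman_rs(judge_a, judge_b)
print(round(r_s, 3))  # 0.8
```

Here the rank differences are -1, 1, -1, 1, 0, so Σd² = 4 and r_s = 1 − 24/120 = 0.8.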
Slide 12 Spurious correlation
• If two variables are significantly correlated, this does not imply that one must be the cause of the other.
• The degree of association is not directly proportional to the magnitude of the correlation coefficient.
• The correlation coefficient is subject to variations in sampling.
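The last point, that r varies from sample to sample, can be illustrated with a short simulation. This is only a sketch: the sample size of 10, the seed, and the use of normally distributed values are arbitrary assumptions, and `pearson_r` is a hypothetical helper using the standard definition of r. The two variables are generated independently, so any non-zero r here arises from sampling alone.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r (standard definition)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(8)  # fixed seed so the run is repeatable
sample_rs = []
for _ in range(5):
    # Draw two unrelated samples of 10 values each
    xs = [random.gauss(0, 1) for _ in range(10)]
    ys = [random.gauss(0, 1) for _ in range(10)]
    sample_rs.append(pearson_r(xs, ys))

# Five different values of r from five samples of unrelated variables
print([round(r, 2) for r in sample_rs])
```

With samples this small, some of the printed values can sit well away from zero, which is why a significance test is needed before reading meaning into r.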
Slide 13 Interpretation of the correlation coefficient
• Method for determining whether an obtained correlation coefficient is significant.
The formula for testing the significance of a value of r is: [equation not preserved in this copy]
Where: n = number of pairs of observations
Slide 14 Testing a value of r
1. Assume that the two variables are uncorrelated.
2. Calculate the correlation coefficient (r).
3. Calculate the value of the test expression, z.
4. If |z| > 1.96 (the two-tailed 5% level of significance), there is strong evidence to suggest that the assumption in Step 1 is incorrect and that a significant degree of correlation does exist.
5. If |z| < 1.96, there is no strong reason to reject the assumption that the two variables are uncorrelated.
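The steps above can be sketched in code. The slide's own test expression is not reproduced in this copy, so the sketch below substitutes the Fisher transformation, z = atanh(r) · √(n − 3), a common large-sample statistic that depends only on r and n and is approximately standard normal when the variables are uncorrelated. This is an assumption, not necessarily the expression the chapter uses; the function names and the example values of r and n are hypothetical.

```python
from math import atanh, sqrt

def z_statistic(r, n):
    """Fisher-transformation test statistic (assumed, not the slide's formula)."""
    if n <= 3:
        raise ValueError("the Fisher transformation needs n > 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return atanh(r) * sqrt(n - 3)

def is_significant(r, n, critical=1.96):
    """Steps 4 and 5: compare |z| with the critical value 1.96."""
    return abs(z_statistic(r, n)) > critical

# Step 1: assume the variables are uncorrelated.
# Step 2: suppose r has already been calculated from n = 28 pairs.
print(round(z_statistic(0.5, 28), 3))  # 2.747
print(is_significant(0.5, 28))         # True: reject the assumption
print(is_significant(0.2, 28))         # False: no strong reason to reject
```

The same r can be significant in a large sample and not in a small one, because the statistic grows with n.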
Slide 15 Testing a value of r_s
The steps for testing a value of r_s for significance are:
1. Assume that the two variables are uncorrelated.
2. Find the critical value of r_s for the given value of n.
3. If |r_s| > critical value, the assumption in Step 1 is rejected and a significant relationship does exist between the two sets of rankings.
4. If |r_s| < critical value, the assumption in Step 1 is accepted and a significant relationship does not exist between the two sets of rankings.
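The decision rule in Steps 3 and 4 might be sketched as follows. The critical value must be looked up in a table of critical values of r_s for the given n; no such table is included here, so the function takes it as an argument. The name `rs_is_significant` is hypothetical, and the critical value 0.65 in the usage lines is an illustrative placeholder only, not a value taken from any table.

```python
def rs_is_significant(r_s, critical_value):
    """Return True if |r_s| exceeds the tabulated critical value."""
    if not -1 <= r_s <= 1:
        raise ValueError("r_s must lie between -1 and 1")
    if not 0 < critical_value <= 1:
        raise ValueError("critical value must lie in (0, 1]")
    # Step 3: reject the 'uncorrelated' assumption when |r_s| > critical value
    # Step 4: otherwise the assumption stands
    return abs(r_s) > critical_value

placeholder_critical = 0.65  # illustrative placeholder, NOT from a real table
print(rs_is_significant(0.8, placeholder_critical))   # True
print(rs_is_significant(-0.3, placeholder_critical))  # False
```

Taking the absolute value means a strong negative rank correlation is treated as significant in the same way as a strong positive one.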
Slide 16 Summary
• Correlation is a statistical technique that is often misused and misinterpreted.
• The correlation coefficient gives an indication of the extent to which values of one variable are associated with values of the other.
• Statistically independent variables are always uncorrelated (the converse need not hold).
• The Pearson product-moment correlation coefficient is essentially a measure of linear relationship.
• The Spearman rank correlation coefficient measures the extent to which the variables have the same ordering.
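Two of the summary points can be checked with a short sketch: that r measures linear relationship only, and that the rank coefficient measures agreement in ordering. The helpers `pearson_r` and `ranks` are hypothetical, the data are invented, and the rank coefficient is computed here as the Pearson coefficient of the ranks, which agrees with the d² formula when there are no ties.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r (standard definition)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Case 1: y is completely determined by x (y = x squared), yet r = 0,
# because the relationship is not linear.
x1 = [-2, -1, 0, 1, 2]
y1 = [x ** 2 for x in x1]
r_quadratic = pearson_r(x1, y1)

# Case 2: y = x cubed keeps the same ordering as x, so the rank
# coefficient is 1 even though r is below 1.
x2 = [1, 2, 3, 4, 5]
y2 = [x ** 3 for x in x2]
r_cubic = pearson_r(x2, y2)
rs_cubic = pearson_r(ranks(x2), ranks(y2))

print(r_quadratic)        # 0.0
print(round(r_cubic, 3))  # 0.943
print(rs_cubic)           # 1.0
```

Case 1 is also an example of uncorrelated variables that are clearly not independent.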