Correlation
This chapter is on correlation.
We will look at patterns in data on a scatter graph.
We will look at how to calculate the variance and covariance of variables.
We will see how to measure numerically the strength of correlation between two variables.
Scatter Graphs (6A)
Scatter graphs are a way of representing two sets of data, making it possible to see whether they are related.
Positive correlation: as one variable increases, so does the other.
Negative correlation: as one variable increases, the other decreases.
No correlation: there seems to be no pattern linking the two variables.
[Figure: three example scatter graphs showing positive, negative and no correlation.]
Scatter Graphs
In a study of a city, the population density, in people/hectare, and the distance from the city centre, in km, were investigated by choosing sample areas. The results are as follows:
[Table: areas A to J, each with a distance from the centre and a population density.]
Plot a scatter graph and describe the correlation. Interpret what the correlation means.
[Figure: scatter graph of population density (people/hectare) against distance from centre (km).]
The correlation is negative, which means that population density decreases as we get further from the city centre.
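Variability of Bivariate Data (6B/C)
We learnt in chapter 3 that:
$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$$
(although remember that this formula was rearranged to make it easier to use).
In correlation, 'how x varies' is measured by:
$$S_{xx} = \sum (x - \bar{x})^2$$
Similarly for y, 'how y varies' is measured by:
$$S_{yy} = \sum (y - \bar{y})^2$$
And you can also calculate the co-variance of both variables, 'how x and y vary together':
$$S_{xy} = \sum (x - \bar{x})(y - \bar{y})$$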
Variability of Bivariate Data
As in chapter 3, we can use a formula which makes calculations easier, BUT:
$$\frac{S_{xx}}{n} = \text{Variance}$$
Start from the easier formula for variance from chapter 3:
$$\frac{S_{xx}}{n} = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$$
For the second fraction, square the top and bottom separately:
$$\frac{S_{xx}}{n} = \frac{\sum x^2}{n} - \frac{(\sum x)^2}{n^2}$$
Multiply both sides by n. Multiplying both fractions by n cancels a 'divide by n' from each of them:
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$
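The link between S_xx and the chapter 3 variance can be checked numerically. The sketch below is only an illustration: the data values are hypothetical, and it assumes the chapter 3 variance is the one that divides by n (Python's `statistics.pvariance`).

```python
from statistics import pvariance

# Hypothetical sample values, invented purely for illustration.
x = [2, 4, 4, 5, 7, 8]
n = len(x)

# Definition: sum of squared deviations from the mean.
mean_x = sum(x) / n
s_xx_definition = sum((xi - mean_x) ** 2 for xi in x)

# Shortcut form: sum of x^2 minus (sum of x)^2 / n.
s_xx_shortcut = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

# n times the variance (the variance that divides by n).
s_xx_from_variance = n * pvariance(x)

print(s_xx_definition, s_xx_shortcut, s_xx_from_variance)
```

All three routes should agree, up to floating-point rounding.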
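Variability of Bivariate Data
These are the formulae for $S_{xx}$, $S_{yy}$ and $S_{xy}$:
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$
$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$
$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$$
You are given these in the formula booklet. You do not need to know how to derive them (as we just did).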
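When a question supplies only summary totals, the three formulae can be applied directly. This is a minimal sketch; the function names and the summary values are hypothetical and are not taken from any question in this chapter.

```python
def s_xx(n, sum_x, sum_x2):
    """S_xx = sum(x^2) - (sum x)^2 / n. Use the y totals to get S_yy."""
    return sum_x2 - sum_x ** 2 / n


def s_xy(n, sum_x, sum_y, sum_xy):
    """S_xy = sum(xy) - (sum x)(sum y) / n."""
    return sum_xy - sum_x * sum_y / n


# Hypothetical summary statistics (illustration only).
n = 5
sum_x, sum_y = 15, 25
sum_x2, sum_y2 = 55, 135
sum_xy = 84

sxx = s_xx(n, sum_x, sum_x2)
syy = s_xx(n, sum_y, sum_y2)   # same formula, applied to the y totals
sxy = s_xy(n, sum_x, sum_y, sum_xy)

print(f"Sxx = {sxx}, Syy = {syy}, Sxy = {sxy}")
```

Reusing one function for both S_xx and S_yy reflects the fact that the two formulae differ only in which variable's totals are substituted.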
Variability of Bivariate Data
Calculate $S_{xx}$, $S_{yy}$ and $S_{xy}$ based on the following information.
Variability of Bivariate Data
The following table shows the head circumferences (cm) and gestation periods (weeks) of 6 newborn babies. Calculate $S_{xx}$, $S_{yy}$ and $S_{xy}$.
[Table: babies A to F, with rows for head size (x), gestation period (y), $x^2$, $y^2$ and $xy$.]
We need $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$ and $\sum xy$.
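The table method for this kind of question can be sketched in code: build the extra rows, total every row, then substitute into the formulae. The measurements below are hypothetical stand-ins, invented for illustration; they are not the values from the original table.

```python
# Hypothetical measurements for six babies (NOT the original data).
babies = ["A", "B", "C", "D", "E", "F"]
head_size = [30, 31, 33, 34, 35, 35]   # x, in cm
gestation = [36, 37, 38, 39, 40, 40]   # y, in weeks

n = len(babies)

# Extra rows of the table: x^2, y^2 and xy for each baby.
x_squared = [x ** 2 for x in head_size]
y_squared = [y ** 2 for y in gestation]
xy = [x * y for x, y in zip(head_size, gestation)]

# Totals of each row.
sum_x, sum_y = sum(head_size), sum(gestation)
sum_x2, sum_y2, sum_xy = sum(x_squared), sum(y_squared), sum(xy)

# Substitute the totals into the formulae.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

print(f"Sxx = {s_xx:.2f}, Syy = {s_yy:.2f}, Sxy = {s_xy:.2f}")
```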
Product Moment Correlation Coefficient
We can test the correlation of data by calculating the Product Moment Correlation Coefficient (PMCC), r. This uses $S_{xx}$, $S_{yy}$ and $S_{xy}$:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
The value of r tells you what type of correlation there is and how strong it is. The closer r is to 1, the stronger the positive correlation; the closer r is to -1, the stronger the negative correlation. A value close to 0 implies no linear correlation.
[Figure: number line from -1 (negative correlation) through 0 (no linear correlation) to 1 (positive correlation).]
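As a rough sketch, the PMCC can be computed from raw data by finding S_xx, S_yy and S_xy and then dividing S_xy by the square root of S_xx times S_yy. The function name `pmcc` and both data sets are hypothetical, chosen only to show one positive and one negative case.

```python
from math import sqrt


def pmcc(x, y):
    """Product moment correlation coefficient from two equal-length lists."""
    n = len(x)
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data (illustration only).
x = [1, 2, 3, 4, 5]
y_rising = [3, 5, 4, 6, 7]      # tends to increase with x
y_falling = [10, 8, 6, 4, 2]    # decreases exactly linearly with x

r_rising = pmcc(x, y_rising)
r_falling = pmcc(x, y_falling)

print(f"rising: r = {r_rising:.3f}, falling: r = {r_falling:.3f}")
```

The first value is close to 1 (strong positive correlation), while the second is -1 because the points lie exactly on a straight line with negative gradient.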
Product Moment Correlation Coefficient
Given the following data, calculate the Product Moment Correlation Coefficient.
There is positive correlation: as x increases, y does as well.
Limitations of the Product Moment Correlation Coefficient
Sometimes it may indicate correlation between unrelated variables.
The number of cars on a particular street has increased, as have the sales of DVDs in town. The PMCC would indicate positive correlation, but the two are most likely not linked.
The speed of computers has increased, as has life expectancy. These are not directly linked, but both are due to scientific developments.
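The first example can be illustrated with numbers. In the sketch below, both series are invented yearly figures that simply rise over time; they are hypothetical and describe no real street or town. The PMCC comes out close to 1 even though neither quantity causes the other.

```python
from math import sqrt


def pmcc(x, y):
    """Product moment correlation coefficient from two equal-length lists."""
    n = len(x)
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical yearly figures over six years (illustration only).
cars_on_street = [120, 135, 150, 170, 185, 200]
dvd_sales = [300, 340, 360, 410, 430, 480]

r = pmcc(cars_on_street, dvd_sales)
print(f"r = {r:.3f}")
```

A high value of r here only reflects that both quantities increased over the same period; it says nothing about a link between them.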
Using Coding with the PMCC (6D)
Calculate the PMCC from this table.
[Table: columns x, y, $x^2$, $y^2$ and $xy$.]
Using Coding with the PMCC
Calculate the PMCC from this table, using coding.
[Table: columns x, y, the coded values p and q, and $p^2$, $q^2$ and $pq$.]
So coding will not affect the PMCC!
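The claim that coding leaves the PMCC unchanged can be checked with a short sketch. It assumes a linear coding of the form p = (x - a)/b, q = (y - c)/d with b and d positive. The data and the particular coding used here (p = x - 1000, q = (y - 300)/5) are hypothetical, not the ones from the original table.

```python
from math import sqrt


def pmcc(x, y):
    """Product moment correlation coefficient from two equal-length lists."""
    n = len(x)
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical raw data (illustration only).
x = [1020, 1032, 1028, 1034, 1023, 1038]
y = [320, 335, 345, 355, 360, 380]

# Hypothetical coding: subtract a constant, then divide by a positive constant.
p = [xi - 1000 for xi in x]
q = [(yi - 300) / 5 for yi in y]

r_raw = pmcc(x, y)
r_coded = pmcc(p, q)

print(f"raw: r = {r_raw:.6f}, coded: r = {r_coded:.6f}")
```

The two values agree up to rounding, and the coded numbers are much smaller, which is the point of coding when working by hand.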
Summary
We have looked at plotting scatter graphs.
We have looked at calculating measures of variability: $S_{xx}$, $S_{yy}$ and $S_{xy}$.
We have also seen types of correlation and how to recognise them on a graph.
We have calculated the Product Moment Correlation Coefficient and interpreted it. It is a numerical measure of correlation.