Product moment correlation Starter:
Product moment correlation
Learning objectives:
- Understand the purpose of a scatter graph and the type of data it is used to represent, and be able to describe what it shows using both mathematical and context-based vocabulary.
- Know what the product moment correlation coefficient, 𝒓, represents and know how to calculate it from raw data.
- Appreciate the limitations of 𝑟 when interpreting data.
Scatter graphs
Scatter graphs enable us to examine the relationship between two variables, x and y. Scatter graphs are used with ‘bivariate data’: data where two variables are recorded for each individual or object, hence ‘paired’ data.
Average Temperature (°C): 13 16 18 21 14 25 11 24 15 27 20 19
Rainfall (mm): 40 36 43 44 28 50 17 39 7
(a) What kind of graph is this?
(b) Which point has been incorrectly plotted?
(c) Why do we draw such a graph?
(d) Using mathematical vocabulary, explain what the graph shows you.
(e) Relate your answer to part (d) to the context of the situation.
What does correlation mean? Correlation means there is a linear relationship between two variables – i.e. we can draw a line of best fit.
What does this scatter graph show about the relationship between the height and weight of twenty Year 10 boys?
[Scatter graph: axes Height (cm) and Weight (kg)]
As height increases, weight increases. This is called a positive correlation.
What does this scatter graph show?
[Scatter graph: axes Number of cigarettes smoked in a week and Life expectancy]
It shows that life expectancy decreases as the number of cigarettes smoked increases. This is called a negative correlation.
Note: this data is fictional. However, some research does suggest links between smoking and a number of fatal diseases such as cancer. For further details, see the ASH website (www.ash.co.uk).
What types of correlation exist?
- Positive correlation: as one variable increases, so does the other variable.
- Negative correlation: as one variable increases, the other variable decreases.
- Zero correlation: no linear relationship between the variables.
Comment on the two examples of negative correlation shown here.
[Two scatter graphs; recoverable axis labels: Number of cigarettes smoked in a week and Life expectancy]
Correlation can be strong or weak.
Correlation: issues to consider
- What kind of correlation is there?
- How strong is the correlation?
Product moment correlation coefficient
This is a way to measure the strength of the correlation numerically. It is denoted by $r$, where $-1 \le r \le 1$.
- $r = 1$: perfect positive correlation
- $r = 0$: zero correlation
- $r = -1$: perfect negative correlation
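Product moment correlation coefficient
The product moment correlation coefficient is calculated using the following formula:
$r = \dfrac{S_{xy}}{\sqrt{S_{xx} \times S_{yy}}}$
where:
$S_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2$
$S_{yy} = \sum (y - \bar{y})^2 = \sum y^2 - n\bar{y}^2$
$S_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - n\bar{x}\bar{y}$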
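The formula can be checked with a short Python sketch. The function name `pmcc` and the height and weight values are made up for illustration; they are not data from these slides.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r from paired raw data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Summary statistics, using the 'sum minus n times mean' forms
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    s_yy = sum(y * y for y in ys) - n * y_bar ** 2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: heights (cm) and weights (kg) of five pupils
heights = [150, 155, 160, 165, 170]
weights = [45, 48, 50, 55, 57]
r = pmcc(heights, weights)
print(round(r, 3))  # about 0.99: strong positive correlation
```

For this made-up data, S_xx = 250, S_yy = 98 and S_xy = 155, so r = 155 / sqrt(250 × 98), which is close to 1.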
Using your calculator
In the stat menu there is a very useful mode called ‘reg’. It can be used to calculate the values 𝒂 and 𝒃 for the equation of the least squares regression line. It can also calculate 𝒓 for us!
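As a rough check on calculator output, the regression values can also be sketched in Python. This sketch assumes the line is written as y = a + bx, with b = S_xy / S_xx and a = ȳ − b x̄; calculators differ in how they label the two values, so check which form yours uses. The function name and data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x (assumed form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept
    return a, b

# Hypothetical data: heights (cm) and weights (kg) of five pupils
heights = [150, 155, 160, 165, 170]
weights = [45, 48, 50, 55, 57]
a, b = regression_line(heights, weights)
print(f"y = {a:.2f} + {b:.2f}x")  # prints: y = -48.20 + 0.62x
```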
Task Exercise A – Page 141 Questions 1 & 3
Limits of correlation: non-linear relationships
Here, 𝑟 = 0.155. 𝑟 measures linear relationships only! It is of no use for analysing non-linear relationships. Clear non-linear relationships identified on scatter diagrams should always be commented upon, but you should also note that evaluating 𝑟 is not appropriate for them.
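A small Python sketch illustrates the point with made-up data (not the data behind the graph on this slide): every point lies exactly on the curve y = x², yet r comes out as 0 because the relationship is not linear. The helper name `pmcc` is hypothetical.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r from paired raw data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: a perfect quadratic relationship, symmetric about x = 0
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r = pmcc(xs, ys)
print(r)  # 0.0, even though y is completely determined by x
```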
Limits of correlation: cause and effect
Here, 𝑟 = 0.914. Does this mean stretching a child’s foot will make them better at maths? The correlation found between foot length and score in maths is often called spurious and should be treated with caution. Any suggestion that correlation indicates cause and effect in the relationship between two variables should be considered very carefully!
Limits of correlation: ‘freak’ results
[Two scatter graphs: one with 𝒓 = 𝟎, one with 𝒓 = 𝟎.𝟕𝟏]
An unusual result can drastically alter the value of r. Unexpected results (outliers) should be commented on, and it may be best to exclude them from the analysis.
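The effect of a single unusual result can be sketched in Python. The data below is made up for illustration and is not the data behind the graphs on this slide; the helper name `pmcc` is hypothetical.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r from paired raw data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    s_yy = sum(y * y for y in ys) - n * y_bar ** 2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: four points at the corners of a square (no linear trend)
xs = [1, 1, 3, 3]
ys = [1, 3, 1, 3]
r_without = pmcc(xs, ys)

# The same data with one extra point far from the rest (an outlier)
r_with = pmcc(xs + [10], ys + [10])

print(r_without)         # 0.0
print(round(r_with, 2))  # about 0.93
```

One extra point moves r from zero correlation to what looks like a strong positive correlation, which is why outliers should always be commented on.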
Task Exercise C – Page 144 All questions