1 Correlation – Pearson’s

2 What does it do?
Measures straight-line correlation: how close plotted points are to a straight line.
Takes values between -1 and +1:
  -1  Perfect negative correlation
   0  No correlation
  +1  Perfect positive correlation

3 Planning to use it?
Make sure that…
• You have continuous data (e.g. lengths, weights…); it isn’t valid otherwise
• You have at least 5 data pairs (more is better)
• You want to use Pearson’s rather than rank correlation: does the scatter diagram look close to a straight line?

4 How does it work?
• You assume (null hypothesis) there is no correlation
• The test involves calculating totals from your data and substituting them into a formula. This works out how far off a straight line your points are
• The calculation can be done automatically on a spreadsheet, and on many graphic calculators

5 Doing the test
These are the stages in doing the test:
1. Write down your hypotheses
2. Work out the totals needed for the formula
3. Use the formula to get a value for the correlation
4. Look at the tables
5. Make a decision
For an example, see slides 11 to 15. To find out how to calculate a best-fit line, see slides 16 and 17.

6 Hypotheses
H0: r = 0 (there is no correlation)
For H1, you have a choice, depending on what alternative you were looking for:
  H1: r > 0 (positive correlation)
  or H1: r < 0 (negative correlation)
  or H1: r ≠ 0 (some correlation)
If you have a good scientific reason for expecting a particular kind of correlation, use one of the first two. If not, use r ≠ 0.

7 Totals
Get your data in table form like this, and complete the extra columns shown:
   x     y    x²    y²    xy
   1     5     1    25     5
   2     7     4    49    14
   4     6    16    36    24
   6    11    36   121    66
Total each column. This gives you Σx, Σy, Σx², Σy², and Σxy.
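The same table can be built in a few lines of Python. This is only a sketch of the bookkeeping; the variable names are illustrative choices, not part of the slides.

```python
# Data pairs from the table on slide 7
x = [1, 2, 4, 6]
y = [5, 7, 6, 11]

# The three extra columns
x_sq = [xi ** 2 for xi in x]
y_sq = [yi ** 2 for yi in y]
xy = [xi * yi for xi, yi in zip(x, y)]

# Column totals, plus the number of data pairs
n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_x_sq, sum_y_sq, sum_xy = sum(x_sq), sum(y_sq), sum(xy)

print(n, sum_x, sum_y, sum_x_sq, sum_y_sq, sum_xy)  # 4 13 29 57 231 109
```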

8 Formula
r = (Σxy - (Σx)(Σy)/n) / √[(Σx² - (Σx)²/n)(Σy² - (Σy)²/n)]
n = number of data pairs
Σx = sum of x-values, Σy = sum of y-values, etc.
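A minimal sketch of the formula as a Python function. It assumes the usual "totals" form of Pearson's r, which matches the calculator steps on slide 14; the function and parameter names are hypothetical.

```python
from math import sqrt


def pearson_r(n, sum_x, sum_y, sum_x_sq, sum_y_sq, sum_xy):
    """Pearson's r from the column totals (assumed standard totals form)."""
    top = sum_xy - sum_x * sum_y / n
    bottom_x = sum_x_sq - sum_x ** 2 / n
    bottom_y = sum_y_sq - sum_y ** 2 / n
    return top / sqrt(bottom_x * bottom_y)


# Totals of the four data pairs tabulated on slide 7
r = pearson_r(n=4, sum_x=13, sum_y=29, sum_x_sq=57, sum_y_sq=231, sum_xy=109)
print(round(r, 3))  # 0.843
```

Note that the function divides by zero if all the x-values (or all the y-values) are equal, because one bottom part is then 0.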

9 Tables
This is a Pearson’s correlation coefficient table. Look up the entry for your number of pairs and your significance level (e.g. 0.05 = 5%).

10 Make a decision
If your value is bigger than the tables value (ignoring signs), then you can reject the null hypothesis. Otherwise you cannot reject it.
Make sure you choose the right tables value; it depends whether your test is 1- or 2-tailed:
• If you are using H1: r > 0 or H1: r < 0, you are doing a 1-tailed test
• If you are using H1: r ≠ 0, you are doing a 2-tailed test
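The decision rule can be sketched as two small helpers. Both function names are hypothetical, and the tables value still has to be looked up by hand; the sample numbers are the soil-salinity value (rounded to three decimal places) and the 5% 2-tailed tables value quoted on slide 15.

```python
def number_of_tails(h1: str) -> int:
    """Return 1 for a directional H1 ('r > 0' or 'r < 0'), 2 for 'r != 0'."""
    if h1 in ("r > 0", "r < 0"):
        return 1
    if h1 == "r != 0":
        return 2
    raise ValueError(f"unrecognised alternative hypothesis: {h1!r}")


def reject_h0(r: float, tables_value: float) -> bool:
    """Reject H0 when r, ignoring its sign, is bigger than the tables value."""
    return abs(r) > tables_value


print(number_of_tails("r != 0"))   # 2
print(reject_h0(-0.888, 0.8114))   # True
```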

11 Soil Salinity & Plant Height
The data below were collected on soil salinity and plant height.
Hypotheses:
H0: r = 0 (no correlation)
H1: r ≠ 0 (some correlation)

12 Totals
Soil Salinity (x)     28    12    15    16     2     5    Σx = 78
Plant Height (y)      10    40    40    52    75    48    Σy = 265
x²                   784   144   225   256     4    25    Σx² = 1438
y²                   100  1600  1600  2704  5625  2304    Σy² = 13933
xy                   280   480   600   832   150   240    Σxy = 2582
NB: You HAVE to work out Σy² by squaring all the values and adding up. You CAN’T work out the sum of y, then square.

13 Formula
We now put all the totals into the formula (n = 6):
r = (2582 - (78)(265)/6) / √[(1438 - 78²/6)(13933 - 265²/6)]
For some hints on working this out on a calculator, see slide 14.

14 Pearson’s on the Calculator
First check if the calculator is “scientific”, that is, it automatically does multiplication before addition. Try 2 + 4 × 3:
• If you get 14, it does multiplication first
• If you get 18, it doesn’t
Work out the top of the fraction (-863):
• For a scientific calculator, put it in exactly as shown ((78)(265) means 78 × 265)
• For a non-scientific calculator, put in brackets: 2582 - (1/6 × 78 × 265)
Work out each part of the bottom of the fraction (424 and 2228.833):
• Non-scientific calculator: 1438 - (1/6 × (78²)), and likewise 13933 - (1/6 × (265²))
Multiply the two parts from the bottom together (945025.333)
Take the square root of the previous answer and keep the answer in memory (972.124)
Divide the top of the fraction by the previous answer
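The same calculator steps, written out as a Python sketch so each intermediate value can be checked. The totals are those from slide 12; the variable names are illustrative.

```python
from math import sqrt

# Totals from slide 12 (6 data pairs)
n = 6
sum_x, sum_y = 78, 265
sum_x_sq, sum_y_sq, sum_xy = 1438, 13933, 2582

# Top of the fraction
top = sum_xy - sum_x * sum_y / n

# Each part of the bottom of the fraction
bottom_x = sum_x_sq - sum_x ** 2 / n
bottom_y = sum_y_sq - sum_y ** 2 / n

# Multiply the two parts together, then take the square root
product = bottom_x * bottom_y
root = sqrt(product)

# Divide the top by the square root
r = top / root

print(top, bottom_x, round(bottom_y, 3))   # -863.0 424.0 2228.833
print(round(product, 3), round(root, 3))   # 945025.333 972.124
print(round(r, 5))                         # -0.88775
```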

15 The test
We have used H1: r ≠ 0, so it is a 2-tailed test.
Tables value (5% level, 6 pairs): 0.8114
Our value: -0.8877
Ignoring the sign, 0.8877 is bigger than 0.8114, so we can reject H0: there is some correlation.

16 Calculating a Best-Fit Line
• If Pearson’s is significant, then it’s valid to calculate a best-fit (regression) line
• The line has equation y = a + bx, where a and b can be calculated
• This lets you make predictions of the height of a plant given the soil salinity, by putting values of x into the equation

17 Finding the Line
The line has equation y = a + bx.
So for the soil salinity data, the equation is: y = 70.622 - 2.035x
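The transcript does not show how a and b were calculated. The sketch below assumes the standard least-squares formulas, b = (Σxy - (Σx)(Σy)/n) / (Σx² - (Σx)²/n) and a = (mean of y) - b × (mean of x), applied to the totals from slide 12. The helper name `predict_height` is hypothetical.

```python
# Totals from slide 12 (6 data pairs)
n = 6
sum_x, sum_y = 78, 265
sum_x_sq, sum_xy = 1438, 2582

# Slope and intercept (assumed standard least-squares formulas)
b = (sum_xy - sum_x * sum_y / n) / (sum_x_sq - sum_x ** 2 / n)
a = sum_y / n - b * sum_x / n


def predict_height(salinity):
    """Predicted plant height for a given soil salinity, using y = a + bx."""
    return a + b * salinity


print(round(b, 3), round(a, 3))       # -2.035 70.627
print(round(predict_height(10), 2))   # 50.27
```

With the unrounded slope the intercept comes out at about 70.627. The 70.622 quoted on the slide is what you get if b is rounded to -2.035 before a is calculated.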

