Spearman’s Rank For relationship data.


Using Spearman’s Rank
- Non-parametric, i.e. no assumptions are made about the data fitting a normal distribution
- You must have more than 5 pairs of data (10+ is better)
- Measures the strength and direction of the relationship between two variables

The value for rs (Spearman’s rank) will be between +1 and -1
- rs = +1 indicates a perfect positive correlation
- rs = -1 indicates a perfect negative correlation
- rs = 0 indicates no correlation at all

[Figures: three example scatter graphs. Positive correlation (rs = +1): velocity (m/s) against distance downstream (km). No correlation (rs = 0): discharge (cumecs) against number of passing dog walkers. Negative correlation (rs = -1): bedload particle size (cm) against distance downstream (km).]

The Equation

r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}

Where:
rs = Spearman rank correlation coefficient
Σd² = sum of the squared differences between ranks
n = number of pairs of observations in the sample
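
As a rough illustration, the equation can be written as a small Python function. The function name spearman_from_sum_d2 and the example numbers are made up for this sketch; they do not come from the slides.

```python
def spearman_from_sum_d2(sum_d2, n):
    """Return rs = 1 - (6 * sum of d squared) / (n * (n squared - 1))."""
    if n < 2:
        raise ValueError("need at least 2 pairs of observations")
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Illustrative numbers only: 5 pairs whose squared rank differences add up to 10.
print(spearman_from_sum_d2(10, 5))  # 1 - 60/120 = 0.5
```

A sum of zero (every pair of ranks identical) gives rs = +1, which matches the description of a perfect positive correlation.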

Method
1. Establish the null hypothesis H0 (this is always the negative form, i.e. there is no significant correlation between the variables) and the alternative hypothesis H1.
H0: There is no significant correlation between variable X and variable Y
H1: There is a significant correlation between variable X and variable Y

2. Copy your data into the table below as variable x and variable y, and label the data sets.
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2, d (R1 - R2), d². Values shown: 50 2 4 40 6 20 8 10]

3. Rank each data set in increasing order, as separate sets of data (i.e. give the lowest data value the lowest rank).
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2. Values shown: 1 50 2 4 3 40 6 20 8 5 10]
- Take each variable in turn
- The lowest value gets a rank of 1
- When you have data values that are the same, they must have the same rank
- If the next two data values were not the same, we would be assigning ranks 5 and 6. 5 + 6 = 11, so we divide this rank equally between the data values (there are 2 data values, so we divide 11 by 2). 11 / 2 = 5.5, so both data values are assigned a rank of 5.5
- The same thing is done for all data values that are the same

The assigned ranks should be recorded in the table.
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2, d (R1 - R2), d². Values shown: 1 50 5.5 2 4 3 40 6 20 8 5 10]
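
The ranking rule, including shared ranks for tied values, could be sketched in Python as follows. The function name average_ranks and the sample values are invented for illustration and are not the data from the slides.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upwards; tied values share the mean rank."""
    # Indices of the values, ordered from smallest value to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' to cover every value tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end would have taken ranks pos+1..end+1; share their mean.
        shared = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Made-up values: the two largest are equal, so ranks 5 and 6 become 5.5 each.
print(average_ranks([12, 15, 18, 21, 30, 30]))  # [1.0, 2.0, 3.0, 4.0, 5.5, 5.5]
```

The ranks are returned in the same order as the input, so they can be written straight into the rank column next to each data value.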

4. Calculate the difference between each pair of ranks, R1 - R2 (if done correctly the differences should add up to zero).
Take each pair of ranks in turn and record the difference in column d.
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2, d (R1 - R2), d². Values shown: 1 50 5.5 2 4 3 40 6 20 8 5 10]

5. Square the differences in column d.
Record the results in column d².
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2, d (R1 - R2), d². Values shown: 1 50 5.5 -4.5 2 -3.5 4 3 40 -1 6 20 8 5 10]
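
A minimal Python sketch of the difference and squaring steps is given here, together with the total of the d² column that the equation needs. The function name rank_differences and the two short rank lists are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def rank_differences(r1, r2):
    """Return the d column, the d squared column, and the sum of d squared."""
    if len(r1) != len(r2):
        raise ValueError("both rank columns must have the same length")
    d = [a - b for a, b in zip(r1, r2)]   # d = R1 - R2 for each pair
    d2 = [x ** 2 for x in d]              # squaring removes the minus signs
    return d, d2, sum(d2)


# Hypothetical ranks for four pairs of observations.
r1 = [1, 2, 3, 4]
r2 = [2, 1, 4, 3]
d, d2, total = rank_differences(r1, r2)
print(d)       # [-1, 1, -1, 1]
print(sum(d))  # 0, a quick check that the ranking was done consistently
print(d2)      # [1, 1, 1, 1]
print(total)   # 4
```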

6. Calculate the sum of d² (Σd²).
Add up all the values in the d² column.
[Table columns: Distance from source (km) (variable x), Rank R1, PO4 (ppm) (variable y), Rank R2, d (R1 - R2), d². Values shown: 1 50 5.5 -4.5 20.25 2 -3.5 12.25 4 3 40 -1 6 20 8 5 10 9 25]

7. Calculate the rs value.
Substitute the numbers calculated for the symbols in the equation: Σd² = 68.5, n = 6

r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}

Work out each part in turn, e.g.
1. Work out 6 x Σd²: 6 x 68.5 = 411
2. Work out n²: 6² = 36
3. Work out n² - 1: 36 - 1 = 35
4. Work out n x the answer to step 3: 6 x 35 = 210
5. Work out the answer to step 1 divided by the answer to step 4: 411 / 210 = 1.957
6. Work out 1 - the answer to step 5: 1 - 1.957 = -0.957
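
For checking a hand calculation, the whole method from raw data to rs might be sketched in Python like this. The function names rank and spearman_rs, and the two short data lists, are assumptions made for this example and are not taken from the slides.

```python
def rank(values):
    """Average ranks: (number of smaller values) + (number of equal values + 1) / 2."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def spearman_rs(x, y):
    """Rank both variables, then apply rs = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of values")
    n = len(x)
    if n < 2:
        raise ValueError("need at least 2 pairs of observations")
    r1, r2 = rank(x), rank(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Invented data: the second variable falls every time the first one rises.
distance = [1, 2, 3, 4, 5]
particle_size = [9, 7, 6, 4, 2]
print(spearman_rs(distance, particle_size))  # -1.0
```

The counting rule in rank gives the same shared ranks as splitting the ranks equally between tied values; for example, the values 10, 20, 20, 30 are ranked 1, 2.5, 2.5, 4.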

Is your rs value positive or negative?
If it is a positive number then you have a positive correlation.
If it is a negative number then you have a negative correlation.
(You do not need to worry about + and - for the next bit!)

Compare your rs value against the table of critical values
If rs is greater than or equal to the critical value, then there is a significant correlation and the null hypothesis can be rejected

Critical values for Spearman’s Rank Correlation Coefficient
Number of pairs of measurements (n) against significance level (+ or -)

n     p = 0.05 (95%)   p = 0.01 (99%)
5     1.000
6     0.886
7     0.786            0.929
8     0.738            0.881
9     0.683            0.833
10    0.648            0.818
11    0.623            0.794
12    0.591            0.780
13    0.566            0.745
14    0.545            0.716
15    0.525            0.689
16    0.507            0.666
17    0.490            0.645
18    0.476            0.625
19    0.462            0.608
20    0.450

Check the p = 0.05 (95%) significance level first
This means we are 95% confident our results were not due to chance. If we have a significant correlation at 95%, we can go back and check whether we also have a significant correlation at 99% (so we can be 99% confident our results were not due to chance).

Is our rs value smaller or larger than our critical value from the critical value table?
If the rs value (ignoring its + or - sign) is greater than or equal to the critical value, then the null hypothesis can be rejected: there is a significant correlation.
If the rs value is NOT greater than or equal to the critical value, then the null hypothesis cannot be rejected: there is no significant correlation.
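
The comparison against the critical values can be sketched as a simple lookup. The dictionary below copies only the entries that appear in the table of critical values; combinations with no listed value are left out rather than guessed. The names CRITICAL_VALUES and is_significant, and the rs values in the examples, are made up for this illustration.

```python
# Critical values keyed by significance level, then by number of pairs (n).
CRITICAL_VALUES = {
    0.05: {5: 1.000, 6: 0.886, 7: 0.786, 8: 0.738, 9: 0.683, 10: 0.648,
           11: 0.623, 12: 0.591, 13: 0.566, 14: 0.545, 15: 0.525, 16: 0.507,
           17: 0.490, 18: 0.476, 19: 0.462, 20: 0.450},
    0.01: {7: 0.929, 8: 0.881, 9: 0.833, 10: 0.818, 11: 0.794, 12: 0.780,
           13: 0.745, 14: 0.716, 15: 0.689, 16: 0.666, 17: 0.645, 18: 0.625,
           19: 0.608},
}


def is_significant(rs, n, level=0.05):
    """True if the size of rs (sign ignored) reaches the critical value."""
    try:
        critical = CRITICAL_VALUES[level][n]
    except KeyError:
        raise ValueError(f"no critical value listed for n={n} at p={level}") from None
    return abs(rs) >= critical


print(is_significant(-0.90, 6))        # True: 0.90 >= 0.886
print(is_significant(0.60, 10))        # False: 0.60 < 0.648
print(is_significant(0.80, 10, 0.01))  # False: 0.80 < 0.818
```

Using abs(rs) reflects the point that the + or - sign only gives the direction of the correlation and plays no part in the significance check.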

Use the following data to calculate rs independently
[Table columns: Light (lux) (variable x), Rank R1, Hedera helix leaf area (cm²) (variable y), Rank R2, d (R1 - R2), d². Values shown: 1165 22.8 980 24.7 26.8 700 37.5 760 500 495 41.3 366 44.6 348 78.3 298 58.0]

Key questions
- Is there a significant correlation?
- Which data value/s would you consider to be anomalous and why?
- Which graph would you use to present this data?
