Calculating Spearman's rank correlation coefficient (Edexcel GCSE Statistics): Revision Notes
What is Spearman's rank correlation coefficient?
Spearman's rank correlation coefficient (written as rs) measures the strength and direction of the relationship between two sets of paired data using their rankings rather than their actual values. It tells us how closely the rank order of one variable agrees with the rank order of the other.
The coefficient always produces a value between -1 and +1, where:
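- +1 indicates a perfect positive correlation (the two sets of ranks agree exactly, so as one variable increases, the other always increases)
- -1 indicates a perfect negative correlation (the two sets of ranks are in exactly opposite order, so as one variable increases, the other always decreases)
- 0 indicates no correlation between the ranks of the two variables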
Spearman's formula
The formula for calculating Spearman's rank correlation coefficient is:
rs = 1 - (6Σd²)/(n(n² - 1))
Where:
- rs = Spearman's rank correlation coefficient
- Σd² = the sum of all the squared differences between ranks
- n = the number of pairs of data
- d = the difference between each pair of ranks
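The formula can be sketched as a short Python function. The function name spearman_rs and its parameter names are illustrative choices, not part of the notes. The two checks use five pairs of data to show the extreme values of the coefficient.

```python
def spearman_rs(sum_d_squared, n):
    """Return rs = 1 - (6 * sum of d squared) / (n * (n^2 - 1))."""
    if n < 2:
        raise ValueError("n must be at least 2 pairs of data")
    return 1 - (6 * sum_d_squared) / (n * (n**2 - 1))


# Five pairs with identical rankings: every d is 0, so the sum of d squared is 0.
perfect_agreement = spearman_rs(0, 5)

# Five pairs ranked in exactly opposite order: d = -4, -2, 0, 2, 4,
# so the sum of d squared is 16 + 4 + 0 + 4 + 16 = 40.
perfect_reversal = spearman_rs(40, 5)

print(perfect_agreement, perfect_reversal)  # 1.0 -1.0
```

Note that the 6 in the formula is a fixed constant; only Σd² and n change from question to question.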
Step-by-step calculation process
Step 1: Rank your data
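First, you need to convert your raw data into rankings. Arrange each set of data from smallest to largest and assign ranks (1st, 2nd, 3rd, etc.), making sure both variables are ranked in the same direction. If you have tied values, give them the average of the ranks they would have occupied. For example, two values tied for 2nd and 3rd place each get a rank of 2.5.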
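The ranking rule, including the treatment of tied values, could be sketched in Python as follows. The function name average_ranks and the sample values are made up for illustration.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.

    Tied values share the average of the ranks they would have occupied.
    """
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first position this value occupies
        count = ordered.count(v)       # how many times the value appears
        last = first + count - 1       # last position this value occupies
        ranks.append((first + last) / 2)
    return ranks


# The two 15s would have taken 2nd and 3rd place, so each gets (2 + 3) / 2 = 2.5.
print(average_ranks([12, 15, 15, 20]))  # [1.0, 2.5, 2.5, 4.0]
```

The ranks are returned in the same order as the original values, which matches writing each rank beside its value in a table.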
Step 2: Find the differences (d)
Calculate the difference between the two ranks for each pair of data, always subtracting in the same order. This gives you the 'd' values. As a quick check, the d values should add up to 0.
Step 3: Square the differences (d²)
Square each of the differences you calculated in step 2.
Step 4: Sum the squared differences (Σd²)
Add up all the squared differences to get your Σd² value.
Step 5: Apply the formula
Substitute your values of Σd² and n into Spearman's formula and calculate the result.
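The five steps can be followed end to end in a short Python sketch. The data here (revision hours and test marks for five students) is hypothetical and chosen only to illustrate the method; the function names are also illustrative, and the simple ranking helper assumes there are no tied values.

```python
def rank_no_ties(values):
    """Rank values from smallest (1) to largest (assumes no tied values)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_steps(x, y):
    """Work through Steps 1 to 5 and return every intermediate result."""
    ranks_x = rank_no_ties(x)                       # Step 1: rank the data
    ranks_y = rank_no_ties(y)
    d = [a - b for a, b in zip(ranks_x, ranks_y)]   # Step 2: differences
    d_squared = [v**2 for v in d]                   # Step 3: square them
    sum_d_squared = sum(d_squared)                  # Step 4: add them up
    n = len(x)
    rs = 1 - (6 * sum_d_squared) / (n * (n**2 - 1)) # Step 5: apply the formula
    return {"ranks_x": ranks_x, "ranks_y": ranks_y, "d": d,
            "d_squared": d_squared, "sum_d_squared": sum_d_squared, "rs": rs}


# Hypothetical data, not taken from the notes.
hours = [2, 5, 1, 4, 3]
marks = [50, 80, 40, 60, 70]

working = spearman_steps(hours, marks)
for name, value in working.items():
    print(name, value)
```

Keeping every intermediate list mirrors the columns of an exam working table, so each step can be checked on its own.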
Worked example: Car tyre age and stopping distance
Let's work through a complete example using data about car tyre age and minimum stopping distance.
Data provided:
- 8 cars with tyre ages ranging from 10 to 32 months
- Stopping distances ranging from 42 to 60 metres
Step 1: Ranking the data
For tyre age (months), in order from smallest to largest: 10, 11, 15, 20, 24, 28, 30, 32
Rankings become: 1, 2, 3, 4, 5, 6, 7, 8
For stopping distance (metres), in order from smallest to largest: 42, 47, 49, 50, 51, 53, 58, 60
Rankings become: 1, 2, 3, 4, 5, 6, 7, 8
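Step 2: Calculate differences (d)
For each car, d = tyre age rank - stopping distance rank. Cars A to H are listed in order of tyre age.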
- Car A: 1 - 1 = 0
- Car B: 2 - 3 = -1
- Car C: 3 - 2 = 1
- Car D: 4 - 6 = -2
- Car E: 5 - 5 = 0
- Car F: 6 - 7 = -1
- Car G: 7 - 4 = 3
- Car H: 8 - 8 = 0
Step 3: Square the differences (d²)
0, 1, 1, 4, 0, 1, 9, 0
Step 4: Sum the squared differences
Σd² = 0 + 1 + 1 + 4 + 0 + 1 + 9 + 0 = 16
Step 5: Apply the formula
rs = 1 - (6 × 16)/(8 × (8² - 1))
rs = 1 - 96/(8 × 63)
rs = 1 - 96/504
rs = 1 - 0.19
rs = 0.81
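As a check on the arithmetic in this worked example, the ranks from Step 2 can be run through the formula in Python. The variable names are illustrative. Using the standard library's Fraction type keeps the value exact until the final rounding, which shows that rounding 96/504 to 0.19 part-way through does not change the 2 decimal place answer here.

```python
from fractions import Fraction

# Ranks for cars A to H, read off the Step 2 subtractions.
tyre_age_ranks = [1, 2, 3, 4, 5, 6, 7, 8]
stopping_distance_ranks = [1, 3, 2, 6, 5, 7, 4, 8]

d = [a - b for a, b in zip(tyre_age_ranks, stopping_distance_ranks)]
sum_d_squared = sum(v**2 for v in d)
n = len(d)

# Exact value of 1 - 96/504 as a fraction, then rounded to 2 decimal places.
rs_exact = 1 - Fraction(6 * sum_d_squared, n * (n**2 - 1))
rs_rounded = round(float(rs_exact), 2)

print(d)              # [0, -1, 1, -2, 0, -1, 3, 0]
print(sum_d_squared)  # 16
print(rs_exact)       # 17/21
print(rs_rounded)     # 0.81
```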
Interpreting the results
A Spearman's coefficient of 0.81 indicates a strong positive correlation between tyre age and stopping distance (it sits just inside the top band of the guide below). In context, cars with older tyres tend to have longer minimum stopping distances.
Correlation strength guide (a rough guide only; there are no universally fixed cut-off values):
- 0.8 to 1.0 (or -0.8 to -1.0): Very strong correlation
- 0.6 to 0.8 (or -0.6 to -0.8): Strong correlation
- 0.4 to 0.6 (or -0.4 to -0.6): Moderate correlation
- 0.2 to 0.4 (or -0.2 to -0.4): Weak correlation
- 0.0 to 0.2 (or 0.0 to -0.2): Very weak or no correlation
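The guide above could be turned into a small lookup, sketched here in Python. The function name describe_correlation is illustrative. Because the bands in the guide share their boundary values, this sketch assumes that a boundary value such as 0.8 belongs to the higher band; that choice is an assumption, not something the guide specifies.

```python
def describe_correlation(rs):
    """Describe rs using the strength guide (boundary values go to the higher band)."""
    if not -1 <= rs <= 1:
        raise ValueError("rs must be between -1 and +1")
    size = abs(rs)
    if size < 0.2:
        return "very weak or no correlation"
    if size >= 0.8:
        strength = "very strong"
    elif size >= 0.6:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if rs > 0 else "negative"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.81))   # very strong positive correlation
print(describe_correlation(-0.45))  # moderate negative correlation
print(describe_correlation(0.1))    # very weak or no correlation
```

In an exam answer, the description should still be given in context, naming both variables.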
Common exam tips and mistakes to avoid
Essential exam techniques:
- Always show your working clearly - marks are awarded for method even if your final answer is incorrect
- You can write the ranks directly beside the values in your table to keep organised
- Sometimes exam questions provide separate columns for d and d² calculations
- Double-check your ranking - this is where most errors occur
Common mistakes to watch out for:
- Forgetting to rank tied values correctly (use the average of the ranks they would occupy)
- Mixing up which variable gets which rank
- Arithmetic errors when squaring differences or adding them up
- Not showing sufficient working in exam conditions
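Two quick arithmetic checks can catch many of the ranking and subtraction mistakes listed above: each set of n ranks should add up to n(n + 1)/2 (this still holds when tied values share an average rank), and the d values should add up to 0. A minimal Python sketch of these checks follows; the function name check_working and its messages are illustrative.

```python
def check_working(ranks_x, ranks_y, differences):
    """Return a list of problems found in the ranks and d values (empty if none)."""
    n = len(ranks_x)
    if len(ranks_y) != n or len(differences) != n:
        return ["the three columns do not all have the same number of entries"]
    problems = []
    expected_total = n * (n + 1) / 2   # 1 + 2 + ... + n
    if sum(ranks_x) != expected_total:
        problems.append("first set of ranks does not add up to n(n + 1)/2")
    if sum(ranks_y) != expected_total:
        problems.append("second set of ranks does not add up to n(n + 1)/2")
    if sum(differences) != 0:
        problems.append("d values do not add up to 0")
    return problems


# Ranks and d values from the tyre age and stopping distance example.
tyre = [1, 2, 3, 4, 5, 6, 7, 8]
stopping = [1, 3, 2, 6, 5, 7, 4, 8]
d_values = [0, -1, 1, -2, 0, -1, 3, 0]
print(check_working(tyre, stopping, d_values))  # []

# A slip in the subtraction for one car (d written as 2 instead of -2) is caught.
print(check_working(tyre, stopping, [0, -1, 1, 2, 0, -1, 3, 0]))
```

Passing both checks does not prove the working is right (two errors can cancel out), but failing either one means there is definitely a mistake to find.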
Memory aid for the formula: Remember "Six Sigma over N times N-squared minus one" - this helps you recall that it's 6Σd² divided by n(n²-1), then subtract the result from 1.
Remember!
- Spearman's coefficient measures correlation using rankings, not the actual data values
- The formula is rs = 1 - (6Σd²)/(n(n² - 1)) where d is the difference between ranks
- Results range from -1 to +1, with values closer to ±1 indicating stronger correlations
- Always rank your data first, then calculate differences, square them, and sum them up
- Show all working in exams - you get method marks even if the final answer is wrong