Measures of dispersion for discrete data (Edexcel GCSE Statistics): Revision Notes
Measures of dispersion for discrete data
What are measures of dispersion?
Measures of dispersion help us understand how spread out our data values are. Unlike measures of central tendency (like the mean or median) that tell us about the "typical" value, dispersion measures tell us how much the data values vary from each other.
The two main measures of dispersion for discrete data are:
- Range - shows the total spread of the data
- Interquartile range (IQR) - shows the spread of the middle 50% of the data
While both measures tell us about spread, they focus on different aspects of the data distribution. Understanding when to use each one is crucial for proper data analysis.
Understanding quartiles
Before we can calculate the interquartile range, we need to understand quartiles. Think of quartiles as values that split your data set into four equal groups, just like how quarters split something into four parts.
The three quartiles are:
- Q₁ (Lower quartile) - 25% of values lie below this point
- Q₂ (Median) - 50% of values lie below this point
- Q₃ (Upper quartile) - 75% of values lie below this point
Finding quartile positions
When your data is arranged in ascending order, you can find the positions using these formulas:
- Q₁ position = (n + 1)/4
- Q₂ position = (n + 1)/2
- Q₃ position = 3(n + 1)/4
Where n = the total number of data values.
These formulas only work when your data is arranged in ascending order first. Never attempt to find quartiles from unsorted data.
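As a minimal sketch, the position formulas can be written in Python. This assumes the (n + 1) convention implied by the worked example in these notes; the function name `quartile_positions` is made up for illustration.

```python
def quartile_positions(n):
    """Return the 1-based positions of Q1, Q2 and Q3 in n ordered values."""
    q1_pos = (n + 1) / 4
    q2_pos = (n + 1) / 2
    q3_pos = 3 * (n + 1) / 4
    return q1_pos, q2_pos, q3_pos

# 12 values, as in the netball worked example
print(quartile_positions(12))  # (3.25, 6.5, 9.75)
```

A position such as 3.25 is not a whole number, so the quartile lies between the 3rd and 4th ordered values.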
Calculating the range
The range is the simplest measure of dispersion to calculate:
Range = largest value − smallest value
This gives you the total spread of your data, but it can be affected by extreme values (outliers).
A single outlier can make the range much larger than it would otherwise be, which can give a misleading impression of how spread out most of your data actually is.
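The range calculation can be sketched in a few lines of Python, using the netball goals from the worked example in these notes. The function name `data_range` is an illustrative choice.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]
print(data_range(goals))  # 10, because 14 - 4 = 10
```

The range only uses the two end values, so the data does not need to be sorted first.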
Calculating the interquartile range (IQR)
The interquartile range focuses on the middle 50% of your data, making it less affected by outliers:
IQR = Q₃ − Q₁
This tells you how spread out the central portion of your data is.
Worked Example: Finding Quartiles from Raw Data
Let's work through Kim's netball goals data: 13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9
Step 1: Arrange the data in ascending order: 4, 6, 7, 8, 9, 10, 10, 11, 12, 13, 14, 14
Step 2: Count the number of values: n = 12
Step 3: Find the Q₁ position: (12 + 1)/4 = 3.25
This means Q₁ is 1/4 of the way between the 3rd and 4th values.
- 3rd value = 7
- 4th value = 8
Q₁ = 7 + 0.25 × (8 − 7) = 7.25
Step 4: Find the Q₃ position: 3(12 + 1)/4 = 9.75
This means Q₃ is 3/4 of the way between the 9th and 10th values.
- 9th value = 12
- 10th value = 13
Q₃ = 12 + 0.75 × (13 − 12) = 12.75
Step 5: Calculate the IQR: IQR = Q₃ − Q₁ = 12.75 − 7.25 = 5.5
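The steps above can be checked with a short Python sketch. It assumes the (n + 1)/4 and 3(n + 1)/4 position formulas and linear interpolation between neighbouring values; the helper names `value_at_position` and `quartiles` are hypothetical.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position, interpolating between neighbours."""
    lower = int(position)         # whole-number part, e.g. 3 for 3.25
    fraction = position - lower   # fractional part, e.g. 0.25
    below = ordered[lower - 1]    # lists are 0-based, positions are 1-based
    if fraction == 0:
        return below
    above = ordered[lower]
    return below + fraction * (above - below)

def quartiles(values):
    """Return (Q1, Q3); this sketch assumes at least 3 data values."""
    ordered = sorted(values)      # Step 1: ascending order
    n = len(ordered)              # Step 2: count the values
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3

goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]
q1, q3 = quartiles(goals)
print(q1, q3, q3 - q1)  # 7.25 12.75 5.5
```

Sorting happens inside `quartiles`, so the raw, unsorted data can be passed in directly.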
Using cumulative frequency tables
When working with frequency tables, you must use cumulative frequency to find quartiles. This is because you need to know how many values come before each data point.
Never use the ordinary frequency column when finding quartiles from frequency tables. You must always work with cumulative frequency to determine the correct positions in your ordered data set.
Example with cumulative frequency
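For a data set with 53 total values:
- Q₁ position = (53 + 1)/4 = 13.5
- Q₂ position = (53 + 1)/2 = 27
- Q₃ position = 3(53 + 1)/4 = 40.5
Look at your cumulative frequency column to find which data values correspond to these positions. A position of 13.5 lies halfway between the 13th and 14th values.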
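A rough Python sketch of the cumulative frequency lookup follows. The frequency table is entirely made up for illustration (it is not from these notes) and was chosen only so that the frequencies total 53; the function names are also hypothetical, and the (n + 1) position formulas are assumed.

```python
from itertools import accumulate

def value_at_rank(values, cumulative, rank):
    """Return the data value occupying a whole-number 1-based rank."""
    for value, cum in zip(values, cumulative):
        if rank <= cum:  # first row whose cumulative frequency reaches the rank
            return value
    raise ValueError("rank exceeds the number of data values")

def quartile_from_table(values, frequencies, position):
    """Find the value at a (possibly fractional) position using cumulative frequency."""
    cumulative = list(accumulate(frequencies))
    lower_rank = int(position)
    fraction = position - lower_rank
    below = value_at_rank(values, cumulative, lower_rank)
    if fraction == 0:
        return below
    above = value_at_rank(values, cumulative, lower_rank + 1)
    return below + fraction * (above - below)

# Hypothetical table: data values 0..5 with frequencies totalling 53
values = [0, 1, 2, 3, 4, 5]
frequencies = [5, 9, 14, 12, 8, 5]  # cumulative: 5, 14, 28, 40, 48, 53
n = sum(frequencies)

q1 = quartile_from_table(values, frequencies, (n + 1) / 4)      # position 13.5
q2 = quartile_from_table(values, frequencies, (n + 1) / 2)      # position 27
q3 = quartile_from_table(values, frequencies, 3 * (n + 1) / 4)  # position 40.5
print(q1, q2, q3)  # 1.0 2 3.5
```

In this made-up table the 40th value is 3 and the 41st is 4, so position 40.5 gives Q₃ = 3.5, while the 13th and 14th values are both 1, so Q₁ = 1.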
Key exam tips
Essential Exam Strategies:
- Always arrange data in ascending order first - this is essential for finding quartiles correctly.
- When the position isn't a whole number, you need to interpolate (find a value between two data points).
- Use cumulative frequency for frequency tables - never use the ordinary frequency column when finding quartiles.
- Check your working - Q₁ should not be larger than Q₃, and both should be reasonable values within your data range.
- Show all steps clearly - examiners want to see your method, even if you make a small calculation error.
Advantages and disadvantages
Range:
- Advantage: Quick and easy to calculate
- Disadvantage: Affected by extreme values (outliers)
Interquartile Range:
- Advantage: Largely unaffected by outliers, gives information about the middle 50% of data
- Disadvantage: Ignores the extreme values completely, so doesn't show the full picture
The choice between range and IQR depends on your data and what you want to emphasise. If you have outliers, IQR gives a better sense of the typical spread. If you want to show the full extent of variation, range is more appropriate.
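To see the difference in practice, here is a small sketch comparing the two measures on the netball data with and without one extreme value. The extra value of 40 is invented purely for illustration. The sketch uses Python's standard-library `statistics.quantiles`, whose `"exclusive"` method places the quartiles at the same (n + 1)-based positions used in these notes.

```python
from statistics import quantiles

def spread_summary(values):
    """Return (range, IQR) for a list of numbers."""
    q1, _median, q3 = quantiles(values, n=4, method="exclusive")
    return max(values) - min(values), q3 - q1

goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]
with_outlier = goals + [40]  # 40 is a made-up extreme value, not from the notes

print(spread_summary(goals))         # (10, 5.5)
print(spread_summary(with_outlier))  # (36, 6.0)
```

One extra value more than triples the range (10 to 36) but moves the IQR only from 5.5 to 6.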
Key Points to Remember:
- Range and IQR both measure how spread out your data is
- Quartiles divide your data into four equal parts
- Always arrange data in ascending order before finding quartiles
- Q₁ position = (n + 1)/4, Q₃ position = 3(n + 1)/4
- IQR = Q₃ − Q₁ and focuses on the middle 50% of your data
- Use cumulative frequency when working with frequency tables