Measures of dispersion for discrete data (AQA GCSE Statistics): Revision Notes
Measures of dispersion for discrete data
What are measures of dispersion?
Measures of dispersion help us understand how spread out our data is. While averages tell us about the centre of our data, measures of dispersion reveal whether the values are clustered tightly together or scattered widely apart. The two main measures you need to know are range and interquartile range.
These measures are particularly useful because they give us a fuller picture of our dataset than an average alone. For example, two sets of data might have the same mean, but one could be much more spread out than the other. Understanding dispersion helps us make better comparisons and draw more accurate conclusions from our data.
Understanding both the centre (mean, median, mode) and the spread (dispersion) of data is essential for making accurate statistical interpretations. A dataset's average alone can be misleading without knowing how scattered the values are.
Understanding quartiles
Before we can calculate the interquartile range, we need to understand what quartiles are. Quartiles are special values that divide your ordered dataset into four equal parts, just like how the median divides data into two equal halves.
There are three quartiles to remember:
- Lower quartile (Q₁): The value that has 25% of the data below it
- Median (Q₂): The middle value with 50% of data on each side
- Upper quartile (Q₃): The value that has 75% of the data below it
An important fact to remember is that about half of all values in your dataset will lie between the lower quartile and upper quartile. This makes the interquartile range particularly useful for understanding where the "middle bulk" of your data sits.
Think of quartiles like cutting a cake into four equal slices. Each quartile represents a boundary between these slices, helping us understand how our data is distributed across different sections.
Calculating range
The range is the simplest measure of dispersion to calculate. It shows us the total spread of our data from the smallest to the largest value.
Formula: $\text{Range} = \text{largest value} - \text{smallest value}$
The range is easy to work out, but it has a significant limitation: it only considers the two extreme values and ignores everything in between. This means that outliers (unusually high or low values) can make the range misleading about how spread out most of the data actually is.
The range can be heavily influenced by outliers. A single extremely high or low value can make your data appear much more spread out than it actually is, which is why the interquartile range is often more reliable.
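To see how a single outlier can distort the range, here is a minimal Python sketch. The function name `data_range` and both lists of scores are invented for illustration and do not come from these notes.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

# Hypothetical test scores, made up for this example
scores = [12, 14, 15, 15, 16, 17]
scores_with_outlier = scores + [40]  # add one unusually high value

print(data_range(scores))               # 5
print(data_range(scores_with_outlier))  # 28
```

Most of the scores still sit between 12 and 17, yet one extra value pushes the range from 5 to 28.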
Finding quartiles for discrete data
When working with discrete data that isn't in a frequency table, follow these systematic steps:
- Order your data from smallest to largest - this is absolutely essential
- Count the total number of values (call this n)
- Find Q₁ position: Calculate $\frac{n+1}{4}$ - this gives you the position of the lower quartile
- Find Q₃ position: Calculate $\frac{3(n+1)}{4}$ - this gives you the position of the upper quartile
If your calculated position is a whole number, simply take that value from your ordered list. If it's a decimal (like 3.25), you'll need to find the value that lies a quarter of the way between the 3rd and 4th values in your list.
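The steps above can be sketched in Python as follows. This is one possible way to write it, assuming the quartile positions are (n + 1)/4 and 3(n + 1)/4 and that the list has at least three values; the function name `quartile` and the sample list are hypothetical.

```python
def quartile(values, k):
    """Return Q_k (k = 1, 2 or 3) using the position k(n + 1)/4.

    Assumes at least 3 values, so the position stays inside the list.
    """
    ordered = sorted(values)           # always order the data first
    n = len(ordered)
    position = k * (n + 1) / 4         # 1-based position in the ordered list
    lower = int(position)              # whole-number part of the position
    fraction = position - lower        # decimal part: 0, 0.25, 0.5 or 0.75
    if fraction == 0:
        return ordered[lower - 1]      # whole number: take that value
    below = ordered[lower - 1]
    above = ordered[lower]
    # Move the required fraction of the way from one value to the next
    return below + fraction * (above - below)

# Hypothetical data: n = 5, so Q1 is at position 1.5 and Q3 at position 4.5
sample = [10, 2, 8, 4, 6]
print(quartile(sample, 1))  # 3.0
print(quartile(sample, 3))  # 9.0
```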
Worked Example: Finding Quartiles for Goals Scored
Let's work through finding quartiles step by step using this data: Kim's goals in 12 netball games: 13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9
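Step 1: Order the data: 4, 6, 7, 8, 9, 10, 10, 11, 12, 13, 14, 14
Step 2: Count the values: $n = 12$
Step 3: Find Q₁ position: $\frac{n+1}{4} = \frac{12+1}{4} = 3.25$
This means Q₁ is a quarter of the way between the 3rd and 4th values.
3rd value = 7, 4th value = 8
$Q_1 = 7 + 0.25 \times (8 - 7) = 7.25$
Step 4: Find Q₃ position: $\frac{3(n+1)}{4} = \frac{3(12+1)}{4} = 9.75$
This means Q₃ is three-quarters of the way between the 9th and 10th values.
9th value = 12, 10th value = 13
$Q_3 = 12 + 0.75 \times (13 - 12) = 12.75$
Step 5: Calculate IQR: $\text{IQR} = Q_3 - Q_1 = 12.75 - 7.25 = 5.5$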
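If you want to check this example on a computer, Python's standard-library `statistics.quantiles` function can be used. With `method="exclusive"` it places the quartiles at (n + 1)-based positions and interpolates between neighbouring values, which matches the method in these notes for this data. Treat this as a cross-check rather than part of the exam method.

```python
from statistics import quantiles

# Kim's goals in 12 netball games (quantiles sorts the data itself)
goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]

# n=4 asks for the three cut points that split the data into quarters
q1, median, q3 = quantiles(goals, n=4, method="exclusive")

print(q1, q3, q3 - q1)  # 7.25 12.75 5.5
```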
Working with frequency tables
When your data is presented in a frequency table, you must use cumulative frequency to find quartiles. This is because you need to locate the position of quartiles within the total dataset.
Steps for frequency tables:
- Create a cumulative frequency column by adding up frequencies as you go down
- Find the total frequency (this is your n value)
- Calculate quartile positions using the same formulas: $\frac{n+1}{4}$ and $\frac{3(n+1)}{4}$
- Use cumulative frequency to identify which data value corresponds to each position
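The steps for frequency tables can be sketched in Python like this. It is a rough illustration, assuming the data values are listed in ascending order and the total frequency is at least 3; the function names `value_at_position` and `quartile_from_table` and the small table are hypothetical.

```python
from itertools import accumulate

def value_at_position(values, cumulative, position):
    """Return the data value at a 1-based whole-number position."""
    for value, running_total in zip(values, cumulative):
        if position <= running_total:
            return value
    raise ValueError("position is beyond the total frequency")

def quartile_from_table(values, frequencies, k):
    """Return Q_k (k = 1, 2 or 3) from a frequency table."""
    cumulative = list(accumulate(frequencies))  # running totals
    n = cumulative[-1]                          # total frequency
    position = k * (n + 1) / 4
    lower = int(position)
    fraction = position - lower
    below = value_at_position(values, cumulative, lower)
    if fraction == 0:
        return below
    above = value_at_position(values, cumulative, lower + 1)
    return below + fraction * (above - below)

# Hypothetical table: cumulative frequencies are 2, 7, 10, 11, so n = 11
values = [1, 2, 3, 4]
frequencies = [2, 5, 3, 1]
print(quartile_from_table(values, frequencies, 1))  # position 3 -> 2
print(quartile_from_table(values, frequencies, 3))  # position 9 -> 3
```

When a position falls between two values in the same group, both neighbours are equal, so the quartile is simply that group's value.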
Worked Example: Finding Quartiles from Frequency Table
For this frequency table showing number of parked cars:
| Number of cars | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Frequency | 5 | 16 | 12 | 10 | 7 | 3 |
| Cumulative frequency | 5 | 21 | 33 | 43 | 50 | 53 |
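Step 1: Total frequency: $n = 53$
Step 2: Find Q₁ position: $\frac{n+1}{4} = \frac{53+1}{4} = 13.5$
Looking at cumulative frequency, the 13.5th value falls in the group where cumulative frequency = 21 (the 6th to 21st values are all 1)
Therefore $Q_1 = 1$
Step 3: Find Q₃ position: $\frac{3(n+1)}{4} = \frac{3(53+1)}{4} = 40.5$
Looking at cumulative frequency, the 40.5th value falls in the group where cumulative frequency = 43 (the 34th to 43rd values are all 3)
Therefore $Q_3 = 3$
Step 4: Calculate IQR: $\text{IQR} = Q_3 - Q_1 = 3 - 1 = 2$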
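One way to double-check a frequency-table answer is to expand the table back into a raw list and find the quartiles of that list. The sketch below does this for the parked cars table using Python's standard-library `statistics.quantiles`; the variable names are chosen here for illustration.

```python
from statistics import quantiles

cars = [0, 1, 2, 3, 4, 5]
frequency = [5, 16, 12, 10, 7, 3]

# Rebuild the raw data: five 0s, sixteen 1s, twelve 2s, and so on
raw = [value for value, count in zip(cars, frequency) for _ in range(count)]

# method="exclusive" uses (n + 1)-based positions, as in these notes
q1, median, q3 = quantiles(raw, n=4, method="exclusive")

print(len(raw), q1, q3, q3 - q1)  # 53 1.0 3.0 2.0
```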
Key exam tips and common mistakes
Always remember: You must use cumulative frequency when finding quartiles from any frequency table. This is a crucial rule that many students forget.
Common mistake: Not ordering data first when working with discrete lists. Always arrange your values from smallest to largest before attempting to find quartiles.
Exam trap: Be careful when calculating positions like 3.25. This means you need to find a value that's one quarter of the way between two consecutive data points, not just take the 3rd or 4th value.
Formula memory: Both quartile position formulas share the same denominator, 4 - Q₁ uses $\frac{n+1}{4}$, while Q₃ uses $\frac{3(n+1)}{4}$. Notice how the numerator is multiplied by 3 for the upper quartile.
Key Points to Remember:
- Range = largest value - smallest value (simple but affected by outliers)
- IQR = Q₃ - Q₁ (more reliable as it focuses on the middle 50% of data)
- Always order data first before finding quartiles in discrete lists
- Use cumulative frequency when working with frequency tables - this is essential
- Quartiles divide data into four equal parts, with Q₂ being the median