Measures of dispersion for discrete data Revision Notes for Edexcel GCSE Statistics

Measures of dispersion for discrete data

What are measures of dispersion?

Measures of dispersion help us understand how spread out our data values are. Unlike measures of central tendency (like the mean or median) that tell us about the "typical" value, dispersion measures tell us how much the data values vary from each other.

The two main measures of dispersion for discrete data are:

Range - shows the total spread of the data
Interquartile range (IQR) - shows the spread of the middle 50% of the data

Note

While both measures tell us about spread, they focus on different aspects of the data distribution. Understanding when to use each one is crucial for proper data analysis.

Understanding quartiles

Before we can calculate the interquartile range, we need to understand quartiles. Think of quartiles as values that split your data set into four equal groups, just like how quarters split something into four parts.

The three quartiles are:

Q₁ (Lower quartile) - 25% of values lie below this point
Q₂ (Median) - 50% of values lie below this point
Q₃ (Upper quartile) - 75% of values lie below this point

Finding quartile positions

When your data is arranged in ascending order, you can find the positions using these formulas:

$Q_1 \text{ position} = \frac{n + 1}{4}$

$Q_3 \text{ position} = \frac{3(n + 1)}{4}$

Where $n$ = the total number of data values.

Important

These formulas only work when your data is arranged in ascending order first. Never attempt to find quartiles from unsorted data.
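The position formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `quartile_positions` is made up for this example, and positions are counted from 1 (the 1st value is the smallest), as in the rest of these notes.

```python
def quartile_positions(n):
    """Return the (Q1, Q3) positions in an ordered list of n values.

    Positions are 1-based, matching the formulas (n + 1) / 4 and 3(n + 1) / 4.
    """
    q1_position = (n + 1) / 4
    q3_position = 3 * (n + 1) / 4
    return q1_position, q3_position


print(quartile_positions(11))  # (3.0, 9.0)   -> whole numbers: read the values off directly
print(quartile_positions(12))  # (3.25, 9.75) -> not whole numbers: interpolation needed
```

When the position is a whole number, the quartile is simply the data value in that position. When it is not, the quartile lies between two neighbouring values.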

Calculating the range

The range is the simplest measure of dispersion to calculate:

$\text{Range} = \text{Largest value} - \text{Smallest value}$

This gives you the total spread of your data, but it can be affected by extreme values (outliers).

Note

The range gives you a quick sense of the total spread, but be aware that a single outlier can make the range much larger than it would be otherwise, potentially giving a misleading impression of how spread out most of your data actually is.
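To see how strongly one outlier can affect the range, here is a minimal Python sketch. The scores are made-up numbers chosen purely for illustration, and `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


scores = [4, 6, 7, 8, 9, 10]         # made-up illustrative data
scores_with_outlier = scores + [40]  # the same data plus one extreme value

print(data_range(scores))               # 6
print(data_range(scores_with_outlier))  # 36
```

Only one value changed, yet the range is six times larger, even though most of the data is no more spread out than before.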

Calculating the interquartile range (IQR)

The interquartile range focuses on the middle 50% of your data, making it less affected by outliers:

$\text{IQR} = Q_3 - Q_1$

This tells you how spread out the central portion of your data is.

Example

Worked Example: Finding Quartiles from Raw Data

Let's work through Kim's netball goals data: 13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9

Step 1: Arrange the data in ascending order: 4, 6, 7, 8, 9, 10, 10, 11, 12, 13, 14, 14

Step 2: Count the number of values: $n = 12$

Step 3: Find the Q₁ position: $Q_1 \text{ position} = \frac{12 + 1}{4} = \frac{13}{4} = 3.25$

This means Q₁ is 1/4 of the way between the 3rd and 4th values.

3rd value = 7
4th value = 8
$Q_1 = 7 + 0.25 \times (8 - 7) = 7.25$

Step 4: Find the Q₃ position: $Q_3 \text{ position} = \frac{3(12 + 1)}{4} = \frac{39}{4} = 9.75$

This means Q₃ is 3/4 of the way between the 9th and 10th values.

9th value = 12
10th value = 13
$Q_3 = 12 + 0.75 \times (13 - 12) = 12.75$

Step 5: Calculate the IQR: $\text{IQR} = Q_3 - Q_1 = 12.75 - 7.25 = 5.5$
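The worked example can be checked with a short Python sketch. The helper names `value_at_position` and `quartiles` are invented for this illustration, and the code assumes at least three data values so that both positions fall inside the list. The last line uses `statistics.quantiles` from Python's standard library, whose default "exclusive" method is based on the same $(n + 1)$ positions.

```python
import math
import statistics


def value_at_position(ordered, position):
    """Value at a 1-based position in an ordered list.

    If the position is not a whole number, interpolate linearly
    between the two neighbouring values.
    """
    lower = math.floor(position)
    fraction = position - lower
    below = ordered[lower - 1]  # convert 1-based position to 0-based index
    if fraction == 0:
        return below
    above = ordered[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Return (Q1, Q3) using the (n + 1)/4 and 3(n + 1)/4 positions."""
    ordered = sorted(values)  # quartiles need ordered data
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]  # Kim's netball goals
q1, q3 = quartiles(goals)
print(q1, q3, q3 - q1)                   # 7.25 12.75 5.5
print(statistics.quantiles(goals, n=4))  # [7.25, 10.0, 12.75]
```

Both approaches agree with the hand calculation: Q₁ = 7.25, Q₃ = 12.75 and IQR = 5.5.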

Using cumulative frequency tables

When working with frequency tables, you must use cumulative frequency to find quartiles. This is because you need to know how many values come before each data point.

Important

Never use the ordinary frequency column when finding quartiles from frequency tables. You must always work with cumulative frequency to determine the correct positions in your ordered data set.

Example with cumulative frequency

For a data set with 53 total values:

$Q_1 \text{ position} = \frac{53 + 1}{4} = 13.5\text{th value}$
$Q_3 \text{ position} = \frac{3(53 + 1)}{4} = 40.5\text{th value}$

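Look at your cumulative frequency column to find which data values correspond to these positions. The 13.5th value lies halfway between the 13th and 14th values, and the 40.5th value lies halfway between the 40th and 41st values. Each of those values belongs to the first row of the table whose cumulative frequency is at least as large as its position.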
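As a sketch of how the cumulative frequency column is used, the Python below works through a frequency table with 53 values in total. The table itself is invented for illustration (it is not from an exam question), and the helper names are hypothetical.

```python
import math
from bisect import bisect_left
from itertools import accumulate

# Made-up frequency table with 53 values in total (illustration only)
values = [0, 1, 2, 3, 4, 5]
frequencies = [5, 9, 14, 12, 8, 5]
cumulative = list(accumulate(frequencies))  # [5, 14, 28, 40, 48, 53]


def value_at_rank(rank):
    """Data value in a whole-number 1-based position.

    It is the first row whose cumulative frequency is at least the rank.
    """
    return values[bisect_left(cumulative, rank)]


def value_at_position(position):
    """Data value at a position, interpolating if it is not a whole number."""
    lower = math.floor(position)
    fraction = position - lower
    below = value_at_rank(lower)
    if fraction == 0:
        return below
    above = value_at_rank(lower + 1)
    return below + fraction * (above - below)


n = cumulative[-1]                          # 53
q1 = value_at_position((n + 1) / 4)         # 13.5th value
q3 = value_at_position(3 * (n + 1) / 4)     # 40.5th value
print(q1, q3, q3 - q1)  # 1.0 3.5 2.5
```

In this made-up table the 13th and 14th values are both 1, so Q₁ = 1. The 40th value is 3 and the 41st value is 4, so Q₃ is halfway between them at 3.5.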

Key exam tips

Important

Essential Exam Strategies:

Always arrange data in ascending order first - this is essential for finding quartiles correctly.
When the position isn't a whole number, you need to interpolate (find a value between two data points).
Use cumulative frequency for frequency tables - never use the ordinary frequency column when finding quartiles.
Check your working - $Q_1$ should never be larger than $Q_3$, and both should lie between the smallest and largest values in your data.
Show all steps clearly - examiners want to see your method, even if you make a small calculation error.

Advantages and disadvantages

Range:

Advantage: Quick and easy to calculate
Disadvantage: Affected by extreme values (outliers)

Interquartile Range:

Advantage: Not affected by outliers, gives information about the middle 50% of data
Disadvantage: Ignores the extreme values completely, so doesn't show the full picture

Note

The choice between range and IQR depends on your data and what you want to emphasise. If you have outliers, IQR gives a better sense of the typical spread. If you want to show the full extent of variation, range is more appropriate.
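The difference between the two measures can be seen by adding one extreme value to Kim's netball data. In the sketch below, the extra value of 40 is hypothetical and added only for illustration, and `spread` is a made-up helper name. It relies on `statistics.quantiles` from Python's standard library, whose default method is based on the $(n + 1)$ positions used in these notes.

```python
import statistics


def spread(values):
    """Return (range, IQR) for a list of values."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # Q1, median, Q3
    return max(values) - min(values), q3 - q1


goals = [13, 10, 4, 10, 7, 12, 11, 14, 14, 8, 6, 9]  # Kim's netball goals
goals_with_outlier = goals + [40]                    # hypothetical extreme value

print(spread(goals))               # (10, 5.5)
print(spread(goals_with_outlier))  # (36, 6.0)
```

The single extreme value more than triples the range (10 to 36), while the IQR barely moves (5.5 to 6).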

Summary

Key Points to Remember:

Range and IQR both measure how spread out your data is
Quartiles divide your data into four equal parts
Always arrange data in ascending order before finding quartiles
$Q_1 \text{ position} = \frac{n + 1}{4}$, $Q_3 \text{ position} = \frac{3(n + 1)}{4}$
$\text{IQR} = Q_3 - Q_1$ and focuses on the middle 50% of your data
Use cumulative frequency when working with frequency tables
