Estimating the median (Edexcel GCSE Statistics): Revision Notes
Estimating the median
When data is grouped and displayed in a histogram, we often need to find an estimate of the median. A histogram shows class intervals rather than individual values, so the exact middle value cannot be identified directly. Instead, we use a special method to estimate where the median lies within one of these intervals.
Understanding the method
The median is the middle value when data is arranged in order. For grouped data, we can't identify the exact middle value, but we can estimate where it would fall within a particular class interval by using the information shown in the histogram.
The estimation process uses linear interpolation within the median class interval, assuming that values are evenly distributed throughout that interval.
The four-step method
To estimate the median from a histogram, follow these four essential steps:
Step 1: Find the total frequency (n)
Add up all the frequencies from each bar in the histogram. Remember that for histograms, the frequency of each bar is calculated by multiplying the frequency density by the class width.
Example Calculation:
For a histogram with bars having different frequency densities and widths:
- Bar 1: frequency = 2 × 1 = 2
- Bar 2: frequency = 3 × 2 = 6
- Bar 3: frequency = 5 × 3 = 15
- Bar 4: frequency = 10 × 1.4 = 14
Total frequency: $n = 2 + 6 + 15 + 14 = 37$
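The Step 1 arithmetic can be checked with a short Python sketch. It assumes each product in the example is written as class width × frequency density (so the third bar has width 5 and frequency density 3, which agrees with the values used in Steps 3 and 4). The variable names are illustrative only.

```python
# Each bar is (class_width, frequency_density).
# Assumption: the products in the example are written as width x density.
bars = [(2, 1), (3, 2), (5, 3), (10, 1.4)]

# Frequency of a bar = frequency density x class width (the area of the bar).
frequencies = [width * density for width, density in bars]

# Total frequency n is the sum of the bar areas.
n = sum(frequencies)
print(frequencies, n)
```

Because one of the densities is a decimal, the printed total is a floating-point number equal to 37 up to rounding.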
Step 2: Find the median position
The median position is at $\frac{n}{2}$ (half of the total frequency). Once you've calculated this value, identify which class interval contains this position by counting through the cumulative frequencies.
Finding the Median Position:
With $n = 37$:
- Median position = $\frac{37}{2} = 18.5$
- Count through cumulative frequencies: 2, then 8, then 23, then 37
- The 18.5th value lies between the cumulative frequencies 8 and 23, so it falls in the third class interval, whose lower class boundary is 5
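The counting in Step 2 can be sketched in Python as follows. This is only an illustration of the idea, using the bar frequencies from Step 1. The names `median_index` and `cf_before` are made up for this example.

```python
from itertools import accumulate

frequencies = [2, 6, 15, 14]   # bar frequencies from Step 1
n = sum(frequencies)           # total frequency, 37
position = n / 2               # median position, 18.5

# Running totals of the frequencies: 2, 8, 23, 37
cumulative = list(accumulate(frequencies))

# The median bar is the first bar whose running total reaches the position.
median_index = next(i for i, c in enumerate(cumulative) if c >= position)

# Cumulative frequency before the median bar (0 if it is the first bar).
cf_before = cumulative[median_index - 1] if median_index > 0 else 0

print(position, cumulative, median_index, cf_before)
```

Python counts from 0, so `median_index` 2 means the third bar, and `cf_before` is the 8 values that come before it.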
Step 3: Calculate the width needed within the median bar
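Use the frequency density formula to work out how far into the median bar you need to go:
$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$
Find the area of the bar up to the median value, then divide that area by the frequency density to get the corresponding width.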
The key insight here is that area represents frequency in a histogram. The area you need within the median bar equals the number of values you need to "count into" that bar to reach the median position.
Calculating Width in Median Bar:
- Area needed = median position - cumulative frequency before median bar
- Area needed = $18.5 - 8 = 10.5$
- If frequency density = 3, then: width = $\frac{10.5}{3} = 3.5$
Step 4: Add the width to the lower class boundary
Take the lower boundary of the median class interval and add the width you calculated in step 3.
Final Calculation:
- Lower class boundary = 5
- Estimated median = $5 + 3.5 = 8.5$ hours
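Steps 3 and 4 are just two lines of arithmetic, which can be checked with this small Python sketch using the numbers from the example. The variable names are illustrative.

```python
median_position = 18.5   # n / 2 with n = 37
cf_before = 8            # cumulative frequency before the median bar
frequency_density = 3    # height of the median bar
lower_boundary = 5       # lower class boundary of the median bar

# Step 3: area still needed inside the median bar, then the matching width.
area_needed = median_position - cf_before        # 10.5
width_needed = area_needed / frequency_density   # 3.5

# Step 4: add the width to the lower class boundary.
estimated_median = lower_boundary + width_needed

print(estimated_median)  # 8.5
```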
Worked example: Employee ages
Worked Example: Estimating Median Employee Age
Given a histogram showing ages of employees in a company with the following data:
Step 1: Find total frequency
Step 2: Find median position
Median position = $\frac{n}{2} = 95$
Counting cumulative frequencies, the 95th value falls in the class interval
Step 3: Calculate area needed in median bar
- Cumulative frequency before median bar =
- Area needed = 95 - cumulative frequency before median bar
- Frequency density of median bar = 4
- Width needed = area needed ÷ 4
Step 4: Find the estimate
Estimated median = lower class boundary + width needed, in years
Key formulas to remember
Essential Formulas:
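- Median position: $\frac{n}{2}$ (where $n$ is the total frequency)
- Frequency density: $\frac{\text{frequency}}{\text{class width}}$
- Frequency: $\text{frequency density} \times \text{class width}$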
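- Estimated median: $\text{lower class boundary} + \frac{\frac{n}{2} - \text{cumulative frequency before median bar}}{\text{frequency density of median bar}}$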
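Putting the formulas together, the whole method might be sketched as one Python function. This is a minimal sketch and not an official method statement: the function name `estimate_median` and its inputs are hypothetical, and the class boundaries in the usage line assume the first bar of the earlier example starts at 0.

```python
def estimate_median(boundaries, densities):
    """Estimate the median of a histogram by linear interpolation.

    boundaries: class boundaries in ascending order, e.g. [0, 2, 5, 10, 20]
    densities:  frequency density of each bar, e.g. [1, 2, 3, 1.4]
    """
    if len(boundaries) != len(densities) + 1:
        raise ValueError("need exactly one more boundary than bars")

    # Frequency = frequency density x class width for each bar.
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    frequencies = [w * d for w, d in zip(widths, densities)]

    # Median position = n / 2.
    position = sum(frequencies) / 2

    cf_before = 0  # cumulative frequency before the current bar
    for lower, density, freq in zip(boundaries, densities, frequencies):
        if freq > 0 and cf_before + freq >= position:
            # Width needed = area needed / frequency density.
            return lower + (position - cf_before) / density
        cf_before += freq
    raise ValueError("histogram has no data")


# Assumed boundaries for the earlier example (first bar starting at 0).
print(estimate_median([0, 2, 5, 10, 20], [1, 2, 3, 1.4]))
```

For the earlier example this reproduces the estimate of 8.5 hours, up to floating-point rounding.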
Common exam tips
Exam Success Tips:
- Always check that your frequency calculations use frequency density × width for each bar
- Make sure you identify the correct class interval for the median position
- Remember that the median position might be a decimal - this is normal and expected
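- The frequency density formula rearranges to: $\text{frequency} = \text{frequency density} \times \text{class width}$, or $\text{class width} = \frac{\text{frequency}}{\text{frequency density}}$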
- Double-check your cumulative frequency counting - this is where most errors occur
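One way to double-check an answer is to confirm that the area of the histogram to the left of your estimate equals half the total frequency. The Python sketch below does this for the earlier example. The helper `area_left_of` is hypothetical, and the class boundaries assume the first bar starts at 0.

```python
def area_left_of(x, boundaries, densities):
    """Total histogram area (frequency) to the left of the value x."""
    area = 0
    for lo, hi, density in zip(boundaries, boundaries[1:], densities):
        # Only the part of the bar that lies below x counts.
        covered = max(0, min(x, hi) - lo)
        area += covered * density
    return area


boundaries = [0, 2, 5, 10, 20]   # assumed class boundaries
densities = [1, 2, 3, 1.4]       # frequency density of each bar

n = area_left_of(boundaries[-1], boundaries, densities)  # total frequency
left = area_left_of(8.5, boundaries, densities)          # area up to estimate

print(left, n / 2)
```

If the two printed values agree (up to rounding), the estimate splits the data in half as a median should.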
Key Points to Remember:
- The median estimate uses linear interpolation within the median class interval
- You must find the total frequency first by adding all individual bar frequencies
- The median position is always $\frac{n}{2}$, where $n$ is the total frequency
- $\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$, so you can rearrange this formula as needed
- Your final answer should include appropriate units from the original data