Box plots (Edexcel GCSE Statistics): Revision Notes
Box plots
What are box plots?
Box plots are a brilliant way to display statistical information about a dataset. They show us five key pieces of information all in one neat diagram: the minimum value, maximum value, lower quartile, upper quartile, and the median. This is called the five-number summary.
When you see a box plot, you're getting a quick visual snapshot of how the data is spread out and where most of the values lie. They're particularly useful for comparing different datasets side by side.
Box plots summarise a dataset in a single diagram, which makes them useful tools for data analysis and comparison. They condense the data into five values that describe its centre and spread, although they do not show the individual data points.
Understanding the five-number summary
Every box plot displays exactly five important values:
- Minimum value: The smallest number in your dataset
- Lower quartile (Q₁): The value that 25% of your data falls below
- Median (Q₂): The middle value when all data is arranged in order
- Upper quartile (Q₃): The value that 75% of your data falls below
- Maximum value: The largest number in your dataset
The key thing to remember is that quartiles divide your data into four equal parts. The lower quartile (Q₁) marks the point where the bottom 25% of values end, whilst the upper quartile (Q₃) marks where the top 25% of values begin.
Understanding Quartiles
Think of quartiles as dividing lines that split your data into four equal groups:
- Bottom 25% (below Q₁)
- Second 25% (between Q₁ and median)
- Third 25% (between median and Q₃)
- Top 25% (above Q₃)
How to read a box plot
In the exam, box plots are drawn on graph paper against a scale. The "box" runs from Q₁ to Q₃, so it contains the middle 50% of your data, and its width is the interquartile range (Q₃ - Q₁). A line inside the box marks the median. The "whiskers" (the lines extending from the box) reach out to the minimum and maximum values.
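To make the parts of the diagram concrete, here is a rough text sketch of a box plot in Python. It is only an illustration, not how you would draw one in the exam: the function name `text_box_plot` is hypothetical, the five values passed in are made up, and it assumes whole-number values on a scale starting at 0.

```python
def text_box_plot(minimum, q1, median, q3, maximum):
    """Return a rough one-line box plot, using one character per unit of scale.

    Assumes whole numbers with 0 <= minimum <= q1 <= median <= q3 <= maximum.
    """
    row = [" "] * (maximum + 1)
    for x in range(minimum, maximum + 1):
        # '=' fills the box (Q1 to Q3); '-' draws the whiskers either side
        row[x] = "=" if q1 <= x <= q3 else "-"
    row[minimum] = "|"  # end of the left whisker: minimum value
    row[maximum] = "|"  # end of the right whisker: maximum value
    row[q1] = "["       # left edge of the box: lower quartile
    row[q3] = "]"       # right edge of the box: upper quartile
    row[median] = "M"   # line inside the box: median
    return "".join(row)


# Made-up summary values, purely for illustration
print(text_box_plot(minimum=2, q1=5, median=8, q3=12, maximum=18))
```

Reading the printed line from left to right gives the minimum, the left whisker, the box with the median marked inside it, the right whisker, and the maximum.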
Practical Example: Bus Passenger Numbers
If you have a box plot showing bus passenger numbers, and you can see that Q₁ = 140 and Q₃ = 220, this tells you that:
- 25% of bus routes had 140 passengers or fewer
- 25% of routes had 220 passengers or more
- The interquartile range = 220 - 140 = 80 passengers
This means the middle 50% of bus routes had between 140 and 220 passengers.
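The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the two quartile values are the ones read off the bus passenger box plot above.

```python
# Quartiles read from the bus passenger box plot in the example above
q1 = 140  # lower quartile: 25% of routes had this many passengers or fewer
q3 = 220  # upper quartile: 25% of routes had this many passengers or more

# Interquartile range: the width of the box, covering the middle 50% of routes
iqr = q3 - q1
print(f"Interquartile range = {q3} - {q1} = {iqr} passengers")
```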
Step-by-step worked example
Worked Example: Creating a Box Plot
Data: Ages of people in a club: 10, 11, 12, 17, 18, 18, 19, 23, 29, 35, 36, 41, 41, 48
Step 1: Check if data is in order (it already is in this case - if not, arrange from smallest to largest)
Step 2: Identify the minimum and maximum values
- Minimum = 10 years
- Maximum = 48 years
Step 3: Count the total number of values
- $n = 14$ values
Step 4: Find the median position using the formula $\frac{n+1}{2}$
- Position = $\frac{14+1}{2} = 7.5$
- This means the median is halfway between the 7th and 8th values
- 7th value = 19, 8th value = 23
- Median = $\frac{19+23}{2} = 21$ years
Step 5: Find the lower quartile (Q₁) position using $\frac{n+1}{4}$
- Position = $\frac{14+1}{4} = 3.75$, which is rounded up, so we use the 4th value
- Q₁ = 17 years
Step 6: Find the upper quartile (Q₃) position using $\frac{3(n+1)}{4}$
- Position = $\frac{3(14+1)}{4} = 11.25$, which is rounded up, so we use the 12th value
- Q₃ = 41 years
Step 7: Draw your box plot on graph paper with an appropriate scale, marking all five values
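Steps 1 to 6 can be sketched in Python as a way of checking your working. Treat this as a sketch under stated assumptions rather than an official method: the function name `five_number_summary` is hypothetical, the position formulas (n + 1)/2, (n + 1)/4 and 3(n + 1)/4 are assumed, and quartile positions that are not whole numbers are rounded up to the next position because that is what the worked example does when it picks the 4th and 12th values. It also assumes the list is not empty.

```python
import math


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the position method.

    Assumption: quartile positions that are not whole numbers are rounded up
    to the next whole position, as in the worked example (3.75 -> 4th value).
    """
    ordered = sorted(values)  # Step 1: put the data in order
    n = len(ordered)          # Step 3: count the values

    def value_at(position):
        # Positions count from 1, but Python lists count from 0
        return ordered[position - 1]

    # Step 4: median position (n + 1) / 2; a .5 means halfway between neighbours
    median_position = (n + 1) / 2
    below = math.floor(median_position)
    if median_position == below:
        median = value_at(below)
    else:
        median = (value_at(below) + value_at(below + 1)) / 2

    # Steps 5 and 6: quartile positions, rounded up and kept inside the list
    q1_position = min(math.ceil((n + 1) / 4), n)
    q3_position = min(math.ceil(3 * (n + 1) / 4), n)

    # Step 2: minimum and maximum are the first and last ordered values
    return ordered[0], value_at(q1_position), median, value_at(q3_position), ordered[-1]


ages = [10, 11, 12, 17, 18, 18, 19, 23, 29, 35, 36, 41, 41, 48]
print(five_number_summary(ages))
```

For the club ages this prints `(10, 17, 21.0, 41, 48)`. Statistical software often interpolates between values instead of rounding the position, so its quartiles may differ slightly from ones found by hand.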
Key formulas to remember
For finding positions in your ordered dataset:
Essential Position Formulas
- Median position: $\frac{n+1}{2}$
- Lower quartile position: $\frac{n+1}{4}$
- Upper quartile position: $\frac{3(n+1)}{4}$
Where $n$ = the total number of values in your dataset.
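As a quick check of the position formulas, the short Python sketch below works them out for any number of values. The function name `quartile_positions` is hypothetical, and the formulas used are the (n + 1) versions, which is an assumption based on the median position of 7.5 in the worked example.

```python
def quartile_positions(n):
    """Return the (Q1, median, Q3) positions in an ordered list of n values."""
    lower_quartile = (n + 1) / 4
    median = (n + 1) / 2
    upper_quartile = 3 * (n + 1) / 4
    return lower_quartile, median, upper_quartile


# The worked example has 14 ages
print(quartile_positions(14))  # (3.75, 7.5, 11.25)
```

These are positions in the ordered list, not the quartile values themselves: you still have to look up the values at (or either side of) each position.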
Exam tips and common pitfalls
Critical Points for Success
Always remember: Box plots must be drawn on graph paper with a clear scale. The scale should start from a sensible number (often 0) and go up to cover your maximum value with some room to spare.
Common mistake: Students often forget to put their data in order first - this is essential before finding quartiles!
Exam tip: When the position calculation ends in .5 (like 7.5), you need to find the value halfway between the two adjacent whole number positions (for 7.5, halfway between the 7th and 8th values).
Watch out: The interquartile range is Q₃ - Q₁, not Q₃ + Q₁. This shows the spread of the middle 50% of your data.
Key Points to Remember:
- Box plots show the five-number summary: minimum, Q₁, median, Q₃, and maximum
- Quartiles divide your data into four equal parts of 25% each
- Always arrange data in order before calculating quartiles
- Use the formulas $\frac{n+1}{2}$ for the median, $\frac{n+1}{4}$ for Q₁, and $\frac{3(n+1)}{4}$ for Q₃
- Box plots must always include a scale and be drawn on graph paper