Deciding Which Average to Use (Leaving Cert Mathematics): Revision Notes
Understanding when each average is most appropriate
In statistics, we have three main measures of central tendency: the mode, median, and mean. While all three can describe the centre of a dataset, choosing the right one depends on the nature of your data and what you want to communicate.
Each average tells a different story about a dataset, so different types of data and different research questions call for different measures of the "typical" value.
When to use the mode
The mode represents the most frequently occurring value in a dataset. This average is particularly useful when you need to identify the most common or popular choice.
Use the mode when:
- You want to know which value appears most often
- You're dealing with categorical data (like colours, sizes, or preferences)
- You need to identify the most popular option
- You're looking at data where the most common value is more important than the numerical average
For example, if you're a shoe retailer wanting to know which size to stock most of, the mode would tell you the most commonly sold shoe size.
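As a minimal sketch in Python, the mode can be found with the standard library's statistics.multimode. The shoe sizes and colours below are invented purely for illustration; they do not come from any real sales record.

```python
from statistics import multimode

# Hypothetical shoe sizes sold in one day (made-up data for illustration).
sizes_sold = [6, 7, 7, 8, 8, 8, 9, 9, 10]

# multimode returns a list of every value tied for most frequent,
# so it also copes with data that has more than one mode.
size_modes = multimode(sizes_sold)
print(size_modes)    # [8]

# The same call works for categorical (non-numerical) data.
colours = ["red", "blue", "blue", "green"]
colour_modes = multimode(colours)
print(colour_modes)  # ['blue']
```

Returning a list rather than a single value makes ties visible, which matters when two sizes sell equally well.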
When to use the mean
The mean is calculated by adding all values together and dividing by the number of values. It's the most familiar type of average and uses every single piece of data in your dataset.
The mean works best when:
- Your data is closely grouped together without extreme values
- You need a measure that considers all data points
- The data is roughly normally distributed
- You plan to use the result for further statistical calculations
Be cautious when your data includes outliers (extremely high or low values that differ significantly from the rest). These can distort the mean, making it unrepresentative of the typical value.
Consider this example: In a small company, the chief executive earns €12,100 per month while eleven other employees each earn €2,500 per month. The mean salary would be (12,100 + 11 × 2,500) ÷ 12 = 39,600 ÷ 12 = €3,300, but this doesn't represent what a typical employee earns: eleven of the twelve people earn less than the mean.
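The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative only; the figures are the ones given in the example.

```python
# Monthly salaries from the example: one chief executive and eleven employees.
salaries = [12_100] + [2_500] * 11

# Mean = sum of all values divided by the number of values.
total = sum(salaries)
mean_salary = total / len(salaries)
print(total)        # 39600
print(mean_salary)  # 3300.0

# Count how many people earn less than the mean.
below_mean = sum(1 for s in salaries if s < mean_salary)
print(below_mean)   # 11
```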
When to use the median
The median is the middle value when all data points are arranged in numerical order. If there's an even number of values, it's the average of the two middle numbers.
The median is most suitable when:
- Your data contains outliers or extreme values
- The data is skewed (lopsided, with a long tail on one side)
- You want a measure that represents the "typical" middle value
- You need a measure unaffected by extremely high or low values
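Using the company example above, the twelve salaries in order are eleven values of €2,500 followed by €12,100, so the two middle values are both €2,500. The median salary is therefore €2,500, which better represents what most employees actually earn.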
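The median of the salary data can be worked out by hand in Python and compared with the standard library's statistics.median. This is a sketch of the "order, then take the middle" rule for an even number of values, not a general-purpose routine.

```python
from statistics import median

# Same figures as the company example.
salaries = [12_100] + [2_500] * 11

ordered = sorted(salaries)
n = len(ordered)  # 12 values, an even count

# With an even count, average the two middle values
# (the 6th and 7th here, which are indexes 5 and 6).
manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
print(manual_median)     # 2500.0

# The library function applies the same rule.
print(median(salaries))  # 2500.0
```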
Comparing advantages and disadvantages of each average
Understanding the strengths and limitations of each average helps you make informed decisions:
Mode advantages:
- Simple to identify and understand
- Not influenced by extreme values or outliers
- Can be used with non-numerical data
Mode disadvantages:
- May not exist if no value repeats
- Not useful for further mathematical analysis
- Can be misleading if multiple modes exist
Median advantages:
- Unaffected by extreme values or outliers
- Easy to calculate once data is ordered
- Splits the ordered data into two equal halves
Median disadvantages:
- May not be an actual data value when there is an even number of values
- Not always useful for further statistical analysis
- Doesn't use all available information
Mean advantages:
- Uses every piece of data in the dataset
- Easy to calculate and understand
- Very useful for further statistical analysis
- Most commonly expected measure
Mean disadvantages:
- Heavily influenced by outliers and extreme values
- May not represent a value that actually exists in the data
- Can be misleading with skewed data
Impact of outliers on statistical measures
Outliers are values that are significantly different from the majority of your data. They can dramatically affect your choice of average.
When outliers are present:
- The mean gets "pulled" towards the extreme values
- The median usually changes very little, if at all
- The mode is typically unaffected unless the outlier becomes the most frequent value
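One way to see these effects is to compute all three averages with and without the outlier. The sketch below reuses the salary figures from the company example and the standard library's statistics module; the labels and variable names are illustrative.

```python
from statistics import mean, median, mode

# Salaries from the company example above.
employees = [2_500] * 11         # eleven employees
with_ceo = employees + [12_100]  # same staff plus the chief executive

# Compute (mean, median, mode) for each dataset.
summary = {
    label: (mean(data), median(data), mode(data))
    for label, data in [("without outlier", employees), ("with outlier", with_ceo)]
}

for label, (avg, mid, most_common) in summary.items():
    print(f"{label}: mean={avg}, median={mid}, mode={most_common}")
```

Adding the single large salary raises the mean from 2,500 to 3,300, while the median and the mode both stay at 2,500.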
Worked example: letters delivered to apartments
Problem: The number of letters delivered to 10 apartments in a block on a particular day was: 2, 0, 5, 3, 4, 0, 1, 0, 3, 15
Step 1: Calculate all three averages
Mean = (2 + 0 + 5 + 3 + 4 + 0 + 1 + 0 + 3 + 15) ÷ 10 = 33 ÷ 10 = 3.3
Mode = 0 (appears three times, more than any other value)
Median: First, order the data: 0, 0, 0, 1, 2, 3, 3, 4, 5, 15
With 10 values, the median is the average of the 5th and 6th values: Median = (2 + 3) ÷ 2 = 2.5
Step 2: Evaluate which average is most suitable
- The mean (3.3) has been distorted by the large number 15. Seven of the 10 apartments received fewer than 3.3 letters
- The mode (0) suggests no letters were delivered, but 7 out of 10 apartments did receive some letters
- The median (2.5) represents the typical experience better, as half the apartments received more and half received less
Conclusion: The median is the most suitable average because it's unaffected by the outlier (15) and gives a better representation of the typical number of letters delivered.
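The figures in this worked example can be verified with Python's statistics module. This is only a check of the arithmetic, using the letter counts given in the problem.

```python
from statistics import mean, median, multimode

# Letters delivered to the 10 apartments.
letters = [2, 0, 5, 3, 4, 0, 1, 0, 3, 15]

letters_mean = mean(letters)        # 33 ÷ 10
letters_median = median(letters)    # average of the 5th and 6th ordered values
letters_modes = multimode(letters)  # every value tied for most frequent

print(letters_mean)    # 3.3
print(letters_median)  # 2.5
print(letters_modes)   # [0]

# How many apartments received fewer letters than the mean?
below_mean = sum(1 for x in letters if x < letters_mean)
print(below_mean)      # 7

# How many apartments received at least one letter?
received_some = sum(1 for x in letters if x > 0)
print(received_some)   # 7
```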
Practical applications
Temperature data example:
When recording daily temperatures over a week, the mean is usually appropriate because temperature data typically doesn't contain extreme outliers, and we want to consider all recorded values.
Consumer preference surveys:
In survey data about preferred can sizes, the mode tells us the most popular choice, the median shows the middle preference, and the mean gives an overall average. For business decisions about which size to produce most, the mode would be most valuable.
Key Points to Remember:
- Choose the mode when you need to identify the most common or popular value, especially with categorical data
- Choose the mean when your data is roughly symmetric without extreme outliers and you need to use all data points
- Choose the median when your data contains outliers or is skewed, and you want a measure that represents the typical middle value
- Outliers significantly affect the mean but have little impact on the median and mode
- Consider your purpose: different averages answer different questions about your data