Summarising Data (Grade 10 NSC Matric Mathematical Literacy): Revision Notes
Introduction to summarising data
When we collect and organise data, it is often impractical to mention every single piece of information in a report. Instead, we summarise data by describing the entire dataset using just a few key numbers. This process makes data analysis much more manageable and helps us identify important patterns and trends.
Data summarisation is essential in statistics because it allows us to understand large datasets quickly by focusing on the most important characteristics. Rather than dealing with hundreds or thousands of individual data points, we can use summary statistics to tell the story of our data.
Data can be summarised using two main types of measures:
- Measures of central tendency - these show the central or typical position of data
- Measures of spread - these describe how the data is distributed or dispersed
Measures of central tendency and measures of spread
Measures of central tendency are single values that attempt to show the central position of a dataset. Measures of spread describe how the data values are spread out or dispersed around the centre.
There are three main types of measures of central tendency: mean, mode, and median.
Think of measures of central tendency as finding the "typical" value in your dataset - like finding the typical height of students in your class or the typical test score. Measures of spread tell you whether your data points are clustered close together or scattered widely apart.
Mean
The mean is the most commonly used measure of central tendency. It is also known as the average. The mean is calculated by adding all the values together and dividing by the number of values in the dataset.
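Formula: Mean = Sum of all values ÷ Number of values
For example, if you have the numbers 2, 6, 8, 10, 12, 14, 18, the mean calculation would be:
Mean = (2 + 6 + 8 + 10 + 12 + 14 + 18) ÷ 7 = 70 ÷ 7 = 10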
Worked Example: Finding the Mean with Simple Data
Question: Find the mean of the numbers 4, 6, 7, 3, 4, 8, 4, 2, 9.
Solution: To find the mean, we add all the values and divide by how many values there are:
- Sum = 4 + 6 + 7 + 3 + 4 + 8 + 4 + 2 + 9 = 47
- Number of values = 9
- Mean = 47 ÷ 9 ≈ 5.2 (rounded to one decimal place)
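The same calculation can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are chosen here for the example and are not part of the notes.

```python
# Values from the worked example above
values = [4, 6, 7, 3, 4, 8, 4, 2, 9]

total = sum(values)      # add all the values together
count = len(values)      # count how many values there are
mean = total / count     # divide the sum by the number of values

# Round to one decimal place, as in the worked example
print(total, count, round(mean, 1))
```

The unrounded result is 5.222..., which is why the answer is given to one decimal place.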
Worked Example: Finding the Mean with Frequency Table Data
Question: The frequency table below shows test marks achieved by 20 learners. Calculate the mean mark.
| Mark | 4 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|
| Frequency | 2 | 4 | 3 | 6 | 3 | 2 |
Solution: When working with frequency tables, we can use multiplication as a shortcut since some marks are repeated.
Total of marks = (4 × 2) + (6 × 4) + (7 × 3) + (8 × 6) + (9 × 3) + (10 × 2) = 8 + 24 + 21 + 48 + 27 + 20 = 148
Total number of learners = 2 + 4 + 3 + 6 + 3 + 2 = 20
Mean = 148 ÷ 20 = 7.4
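The frequency-table shortcut can be sketched in Python as follows. The function name `frequency_mean` is hypothetical and used only for this illustration; it multiplies each mark by its frequency, adds the products, and divides by the total frequency.

```python
# Marks and their frequencies from the table above
marks = [4, 6, 7, 8, 9, 10]
frequencies = [2, 4, 3, 6, 3, 2]


def frequency_mean(values, freqs):
    """Mean of data given as (value, frequency) pairs."""
    # Each value is counted as many times as its frequency
    total = sum(v * f for v, f in zip(values, freqs))
    # The number of data values is the sum of the frequencies
    count = sum(freqs)
    return total / count


print(frequency_mean(marks, frequencies))
```

Dividing by the number of different marks (6) instead of the number of learners (20) is a common slip; the function avoids it by summing the frequencies.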
Median
When data is arranged in ascending order (smallest to largest), the middle value in the set is called the median. Descending order (largest to smallest) would give the same middle value, but these notes always use ascending order.
The median helps us find the central position of our data and is particularly useful when dealing with extreme values that might affect the mean.
Critical Step: You MUST arrange the data in ascending order before finding the median. This is the most common mistake students make when calculating the median. Never try to find the median from unordered data.
We need to consider two cases when finding the median:
- When there is an odd number of data values
- When there is an even number of data values
Worked Example: Finding the Median with an Odd Number of Values
Question: Find the median of the numbers: 4, 6, 7, 4, 3, 4, 8, 2, 9, 7, 2.
Solution: First, we must arrange the numbers in ascending order: 2, 2, 3, 4, 4, 4, 6, 7, 7, 8, 9
There are 11 values, so the middle position is the 6th value, which is 4. There are 5 numbers on each side of it.
Therefore, 4 is the median of the set of numbers.
Worked Example: Finding the Median with an Even Number of Values
Question: Find the median of the numbers: 4, 6, 4, 7, 2, 3, 8, 9, 7, 4.
Solution: First, arrange the numbers in ascending order: 2, 3, 4, 4, 4, 6, 7, 7, 8, 9
There are 10 values, so the middle position falls between the 5th and 6th values. There is no single number in this exact position, so we take the average of the two middle numbers (4 and 6).
Therefore, the median = (4 + 6) ÷ 2 = 5
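Both cases (odd and even) can be sketched in one short Python function. The name `median` is chosen here for illustration; the function sorts the data first, which is the critical step described above.

```python
def median(values):
    """Middle value of the data once it is in ascending order."""
    ordered = sorted(values)   # always order the data first
    n = len(ordered)
    middle = n // 2            # index of the upper middle value
    if n % 2 == 1:
        # Odd number of values: one value sits exactly in the middle
        return ordered[middle]
    # Even number of values: average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


# Odd number of values (first worked example)
print(median([4, 6, 7, 4, 3, 4, 8, 2, 9, 7, 2]))
# Even number of values (second worked example)
print(median([4, 6, 4, 7, 2, 3, 8, 9, 7, 4]))
```

The first call returns 4 and the second returns 5.0, matching the two worked examples.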
Mode
The mode is the data value that appears most often in a set of data. No calculation is needed to find the mode - you simply identify the value that appears most frequently.
Key Points About the Mode:
- If no number is repeated, then there is no mode for the dataset
- There can be more than one mode if multiple values appear with the same highest frequency
- For grouped data, we use the modal class - this is the group or class interval that has the highest frequency
Worked Example: Finding the Mode
Question 1: Find the mode of the set of numbers: 4, 6, 7, 6, 3, 4, 8, 4, 2, 9.
Solution: Looking at each number:
- 4 appears 3 times
- 6 appears 2 times
- 7 appears 1 time
- 3 appears 1 time
- 8 appears 1 time
- 2 appears 1 time
- 9 appears 1 time
The value 4 appears most often (3 times), therefore the mode is 4.
Question 2: Funeka records the colours of schoolbags of everyone arriving at school. She writes down: 27 blue, 16 red, 43 white, 7 black, 16 green. What is the modal colour?
Solution: The modal colour is white (because 43 is the highest frequency).
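Counting how often each value appears can be sketched in Python with `collections.Counter` from the standard library. The function name `modes` and the dictionary `bag_counts` are hypothetical names used only for this illustration. The function follows the convention in these notes that a dataset with no repeated value has no mode.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent value(s); empty if there is no mode."""
    if not values:
        return []
    counts = Counter(values)           # value -> number of times it appears
    highest = max(counts.values())     # the highest frequency
    if highest == 1:
        return []                      # no value is repeated, so no mode
    # Keep every value that reaches the highest frequency
    return [v for v, c in counts.items() if c == highest]


# Question 1: the numbers from the worked example
print(modes([4, 6, 7, 6, 3, 4, 8, 4, 2, 9]))

# Question 2: the colours are already counted, so pick the highest frequency
bag_counts = {"blue": 27, "red": 16, "white": 43, "black": 7, "green": 16}
modal_colour = max(bag_counts, key=bag_counts.get)
print(modal_colour)
```

Returning a list makes it possible to report more than one mode when several values share the highest frequency.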
Range
The range is a measure of spread because it tells you how spread out the data values are. It shows the difference between the largest value and the smallest value in a dataset.
Formula: Range = Highest value - Lowest value
The range gives us a simple way to understand the variability in our data. A small range indicates that data values are close together, while a large range shows that data values are spread far apart.
Worked Example: Finding the Range
Question: Find the range of the numbers 3, 7, 8, 5, 4, 10.
Solution:
- The lowest value is 3
- The highest value is 10
- Range = 10 - 3 = 7
Important note: The range is a single number, so always complete the subtraction. Write "Range = 7", not just "10 - 3", and never subtract the wrong way round as "3 - 10".
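As a quick check, the range calculation can be sketched in Python. The function name `data_range` is a hypothetical name used only for this illustration.

```python
def data_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


# Values from the worked example above
print(data_range([3, 7, 8, 5, 4, 10]))
```

The data does not need to be sorted first, because `max` and `min` find the highest and lowest values directly.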
Exam Tips for Avoiding Common Mistakes:
- Always arrange data in ascending order before finding the median
- For the mean with frequency tables, multiply each value by its frequency, then divide the total by the sum of all frequencies
- The mode requires no calculation - just identify the most frequent value
- For grouped data, find the modal class (the interval with highest frequency)
- The range must be calculated as a subtraction, giving a single number
- Show your working clearly in exam questions
Key Points to Remember:
- Mean = Add all values ÷ Number of values (most common measure of central tendency)
- Median = Middle value when data is arranged in order (useful when there are extreme values)
- Mode = Most frequently occurring value (no calculation needed)
- Range = Highest value - Lowest value (simple measure of spread)
- Always arrange data in order before finding the median, and show all calculation steps clearly