The shape of a distribution (Edexcel GCSE Statistics): Revision Notes
What is the shape of a distribution?
When we collect and display data using graphs like histograms, stem and leaf diagrams, or frequency polygons, the overall pattern or outline that emerges is called the shape of the distribution. Understanding this shape helps us describe important characteristics of our data and make sense of what it tells us.
The shape gives an immediate visual summary of how the data behaves, and it helps you choose suitable ways to describe and compare data sets.
There are three main shapes that distributions can take: symmetrical, positively skewed, and negatively skewed. Each shape tells us something different about how the data values are spread out.
The three types of distribution shapes
Symmetrical distribution
A symmetrical distribution is perfectly balanced around its centre. If you were to fold the graph in half along its middle line, both sides would match exactly. This means:
- The data is evenly spread on both sides of the centre
- There's no particular lean towards higher or lower values
- Most data points cluster around the middle value
- Both ends (or "tails") of the distribution are equal in length
Positive skew (right skew)
In a positively skewed distribution, there's an extended tail stretching towards the higher (positive) values. This creates several key characteristics:
- Most of the data points are concentrated at the lower end of the range
- The distribution is "pulled out" or stretched in the positive direction
- The longer tail points towards the right side of the graph
- There are fewer data points at the higher values, but they extend further out
Negative skew (left skew)
A negatively skewed distribution has the opposite pattern to positive skew, with a longer tail extending towards the lower (negative) values:
- Most data points are bunched up at the higher end of the range
- The distribution stretches out in the negative direction
- The longer tail points towards the left side of the graph
- There are fewer data points at the lower values, but they spread further down
Remember the direction rule: The skew is always named after the direction of the longer tail, not where most of the data sits. Positive skew = longer tail pointing right, negative skew = longer tail pointing left.
Understanding tails in distributions
Every distribution has two tails - these are the parts of the distribution that extend furthest away from the centre (mean). Think of them as the "ends" of your data:
- The positive tail is on the right side of the distribution
- The negative tail is on the left side of the distribution
- The length of these tails determines whether your distribution is skewed or symmetrical
If the two tails are about the same length, the distribution is symmetrical. If one tail is clearly longer than the other, the distribution is skewed in the direction of that longer tail.
Worked example: pulse rate analysis
Let's work through a practical example using pulse rate data from 10 people measured before and after exercise.
The data:
- Before exercise: 69, 68, 74, 76, 75, 90, 82, 81, 95, 87
- After exercise: 88, 89, 97, 95, 92, 102, 104, 86, 111, 96
Step 1: Creating back-to-back stem and leaf diagrams
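To compare the two sets of data effectively, we can use stem and leaf diagrams. A back-to-back diagram shares a single column of stems, with the "before" leaves written to the left and the "after" leaves written to the right. The two halves are listed separately here: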
Before exercise:
- Key: 6|7 = 67 beats per minute
- 6: 8, 9
- 7: 4, 5, 6
- 8: 1, 2, 7
- 9: 0, 5
After exercise:
- Key: 8|6 = 86 beats per minute
- 8: 6, 8, 9
- 9: 2, 5, 6, 7
- 10: 2, 4
- 11: 1
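As a quick way to check the diagrams above, here is a minimal Python sketch that splits each pulse rate into a stem (the tens) and a leaf (the units). The function name stem_and_leaf is just an illustrative choice, and the sketch assumes whole-number data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens (stem)."""
    rows = defaultdict(list)
    # Sorting first means the stems and the leaves both come out in order.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)


before = [69, 68, 74, 76, 75, 90, 82, 81, 95, 87]
after = [88, 89, 97, 95, 92, 102, 104, 86, 111, 96]

for label, data in (("Before exercise", before), ("After exercise", after)):
    print(label)
    for stem, leaves in stem_and_leaf(data).items():
        print(f"{stem}: {', '.join(map(str, leaves))}")
```

Running it should reproduce the rows listed above, which is a useful check that no value has been missed or placed under the wrong stem.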
Step 2: Commenting on the shape
Looking at our stem and leaf diagrams:
- Before exercise: The distribution appears roughly symmetrical, with data spread fairly evenly around the middle values
- After exercise: This shows positive skew, with most values clustered in the 80s and 90s, but with a tail extending up to 111
Step 3: Creating frequency polygons
When we plot frequency polygons for grouped data (using class intervals such as 60 ≤ x < 70, 70 ≤ x < 80, and so on), we plot each frequency against the midpoint of its interval (65, 75, and so on) and join the points with straight lines. For our pulse rate data:
- The "before exercise" polygon would show a more balanced, symmetrical shape
- The "after exercise" polygon would clearly show positive skew with its tail extending towards higher values
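The points for each frequency polygon can be worked out with a short sketch like the one below. It assumes equal-width classes that include their lower boundary and exclude their upper one (so 90 falls in the 90 to 100 class), and the function name polygon_points and its parameters are illustrative only.

```python
def polygon_points(values, start, width, n_classes):
    """Return (midpoint, frequency) pairs for equal-width classes.

    Assumption: each class includes its lower boundary and excludes
    its upper boundary, e.g. 60 <= x < 70.
    """
    counts = [0] * n_classes
    for value in values:
        index = (value - start) // width
        if 0 <= index < n_classes:  # ignore values outside the classes
            counts[index] += 1
    return [(start + width * i + width / 2, counts[i]) for i in range(n_classes)]


before = [69, 68, 74, 76, 75, 90, 82, 81, 95, 87]
after = [88, 89, 97, 95, 92, 102, 104, 86, 111, 96]

# Six classes of width 10, covering 60 up to 120 beats per minute.
print(polygon_points(before, 60, 10, 6))
print(polygon_points(after, 60, 10, 6))
```

The "before" frequencies come out as 2, 3, 3, 2 across the 60s to 90s, while the "after" frequencies are 3, 4, 2, 1 across the 80s to 110s, which matches the shapes described above.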
Identifying skew from graphs
When examining any graph showing data distribution, look for these telltale signs:
- Symmetrical: Both sides are (at least roughly) mirror images of each other, with tails of similar length
- Positive skew: The right side has a longer, thinner tail
- Negative skew: The left side has a longer, thinner tail
Key Identification Strategy
Always ask yourself: "Which tail is longer?" The answer immediately tells you the type of skew. Don't get confused by looking at where most of the data clusters - focus on the tail length!
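The "which tail is longer?" question can be turned into a rough check on grouped frequencies. The sketch below is only a heuristic, not an official method: it counts how many classes lie on each side of the modal class (or classes), and it assumes the list starts at the first occupied class and ends at the last one. The function name describe_shape is hypothetical.

```python
def describe_shape(frequencies):
    """Compare tail lengths, counted in classes, either side of the peak.

    Assumption: the list runs from the first occupied class to the last,
    with no empty classes padding either end.
    """
    peak = max(frequencies)
    first_peak = frequencies.index(peak)
    last_peak = len(frequencies) - 1 - frequencies[::-1].index(peak)

    left_tail = first_peak                          # classes below the peak
    right_tail = len(frequencies) - 1 - last_peak   # classes above the peak

    if right_tail > left_tail:
        return "positive skew"
    if left_tail > right_tail:
        return "negative skew"
    return "roughly symmetrical"


print(describe_shape([2, 3, 3, 2]))  # "before exercise" class frequencies
print(describe_shape([3, 4, 2, 1]))  # "after exercise" class frequencies
```

Equal tail lengths do not guarantee a symmetrical distribution, so treat the result as a prompt to look at the graph rather than as a final answer.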
Common exam tips
Understanding distribution shapes is a frequent exam topic. Here are the most important strategies:
- Always identify the direction of the longer tail - this determines the type of skew
- Use where most of the data clusters as a check, not as the name of the skew - in positive skew, most data is at the lower end; in negative skew, most data is at the upper end
- Remember the tail direction rule - positive skew has a tail pointing right (positive direction)
- Practise describing what the skew means - don't just identify it, explain what it tells us about the data
Being able to identify a distribution's shape quickly and explain it in context will help you in questions that ask you to describe or compare data sets.
Key Points to Remember:
- The shape of a distribution describes the overall pattern formed by your data when displayed graphically
- Symmetrical distributions are perfectly balanced with equal tails on both sides
- Positive skew has a longer tail extending towards higher values, with most data at the lower end
- Negative skew has a longer tail extending towards lower values, with most data at the upper end
- The tails are the parts of the distribution furthest from the centre - identifying which tail is longer helps you determine the type of skew