Regression line (Edexcel GCSE Statistics): Revision Notes
What is a regression line?
A regression line is another name for the line of best fit. When you're working with scatter diagrams, you might be given the equation of this line and asked to draw it on the graph. The regression line helps us understand the relationship between two variables and make predictions about one variable based on the other.
The equation of a regression line follows the standard format y = mx + c, where:
- m is the gradient (slope) of the line
- c is the y-intercept (where the line crosses the y-axis)
Drawing a regression line from an equation
When you have the equation of a regression line, you can plot it by finding coordinates of points that lie on the line. The best approach is to:
- Choose two different x-values that are easy to work with
- Substitute these values into the equation to find the corresponding y-values
- Plot these two points on your scatter diagram
- Draw a straight line through both points
Top tip: Choose x-values towards the lower and upper ends of your data range. When the two plotted points are far apart, a small plotting error has less effect on the direction of the line, so the line is more accurate across the entire diagram.
Step-by-step worked example
Let's work through an example where the regression line equation is y = 1.05x - 3, representing the relationship between maths test scores (x-axis) and science test scores (y-axis).
Finding points on the line
When x = 30:
y = 1.05 × 30 - 3
y = 31.5 - 3
y = 28.5
So our first point is (30, 28.5)
When x = 80:
y = 1.05 × 80 - 3
y = 84 - 3
y = 81
So our second point is (80, 81)
Now you can plot these points and draw a straight line through them.
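The substitution above can also be checked with a short script. This is only a sketch for checking your arithmetic, not part of the exam method; the function name regression_y is made up for illustration.

```python
def regression_y(x, m=1.05, c=-3):
    """Return the y-value on the line y = mx + c for a given x.

    The defaults m = 1.05 and c = -3 come from the worked example;
    the function name itself is just an illustrative choice.
    """
    return m * x + c


# The two x-values used in the worked example
x_values = [30, 80]

# Substitute each x-value into the equation to get a point to plot.
# Rounding to 2 decimal places guards against tiny floating-point errors.
points = [(x, round(regression_y(x), 2)) for x in x_values]

for x, y in points:
    print(f"When x = {x}, y = {y}")
```

The two printed pairs are the coordinates to plot before drawing the line through them with a ruler.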
Interpreting the y-intercept
The y-intercept is the value where the regression line crosses the y-axis (when x = 0). In our equation y = 1.05x - 3, the y-intercept is -3.
In context, this means that a student who scored 0% on the maths test would be predicted to score -3% on the science test. However, this doesn't make sense in real life - you can't score negative marks! This shows us that the regression line may not be reliable for predictions outside the range of the original data.
Key point: Always consider whether your y-intercept makes sense in the real-world context of your problem.
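The sense check on the y-intercept can be sketched in code as well. The data range of 30 to 80 used here is an assumption made for illustration (the notes do not state the actual range of maths scores), as is treating scores as percentages between 0 and 100. All names are hypothetical.

```python
def predict_science(maths_score, m=1.05, c=-3):
    """Predicted science score from the regression line y = 1.05x - 3."""
    return m * maths_score + c


# ASSUMPTION: the original data had maths scores between 30 and 80.
# The notes do not give the real range; these values are illustrative only.
DATA_X_MIN, DATA_X_MAX = 30, 80


def check_prediction(maths_score):
    """Return the prediction plus two sense checks.

    inside_range: is the x-value within the (assumed) data range?
    sensible:     is the prediction a possible percentage score (0 to 100)?
    """
    predicted = predict_science(maths_score)
    inside_range = DATA_X_MIN <= maths_score <= DATA_X_MAX
    sensible = 0 <= predicted <= 100
    return predicted, inside_range, sensible


# x = 0 gives the y-intercept
predicted, inside_range, sensible = check_prediction(0)
print(f"Prediction: {predicted}, inside data range: {inside_range}, sensible: {sensible}")
```

For x = 0 both checks fail: the x-value lies outside the assumed data range and the predicted score is negative, which matches the interpretation of the y-intercept given above.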
Interpreting the gradient
The gradient tells us the rate of change between the two variables. In our equation y = 1.05x - 3, the gradient is 1.05.
This means that for each additional percentage point scored on the maths test, the predicted science score rises by 1.05 percentage points (not by 1.05% of the current score). In general, the gradient shows how much y changes when x increases by 1 unit.
Remember:
- A positive gradient means both variables increase together
- A negative gradient means as one variable increases, the other decreases
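As a rough check on this interpretation, the gradient can be recovered from the two points found in the worked example, (30, 28.5) and (80, 81), and compared with the change in y for a 1-unit increase in x. This is a sketch with made-up helper names, not a required exam technique.

```python
def predict_science(x, m=1.05, c=-3):
    """y-value on the regression line y = 1.05x - 3."""
    return m * x + c


def gradient(p1, p2):
    """Gradient of the straight line through two points: change in y / change in x."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


def describe_gradient(m):
    """Describe the direction of the relationship from the sign of the gradient."""
    if m > 0:
        return "positive: as x increases, y increases"
    if m < 0:
        return "negative: as x increases, y decreases"
    return "zero: y does not change as x increases"


# Gradient from the two points plotted in the worked example
m_from_points = gradient((30, 28.5), (80, 81))

# Change in predicted y when x goes up by exactly 1 unit (50 to 51)
step_change = round(predict_science(51) - predict_science(50), 2)

print(m_from_points, step_change, describe_gradient(m_from_points))
```

Both calculations should agree with the gradient m in the equation, which is what "rate of change" means for a straight line.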
Key formulas and equations
Standard form of regression line: y = mx + c
Where:
- y = dependent variable (usually on vertical axis)
- x = independent variable (usually on horizontal axis)
- m = gradient (rate of change)
- c = y-intercept (value when x = 0)
Exam tips and common traps
- Always show your working when substituting values into the equation
- Check your arithmetic carefully - small calculation errors can lose you marks
- Consider the context when interpreting the y-intercept - does it make sense?
- Don't extrapolate too far beyond your data range - predictions become less reliable
- Read the axes labels carefully to understand what each variable represents
- Plot points accurately and use a ruler to draw your regression line
Remember!
- A regression line is the same as a line of best fit, with equation y = mx + c
- Find points on the line by substituting x-values into the equation
- The y-intercept (c) shows where the line crosses the y-axis, but may not be meaningful in context
- The gradient (m) shows the rate of change - how much y changes for each unit increase in x
- Always consider whether your interpretations make sense in the real-world context of the problem