PMCC & Non-linear Regression (AQA A-Level Mathematics): Revision Notes
2.5.1 PMCC & Non-linear Regression
Pearson's Product-Moment Correlation Coefficient (PMCC)
PMCC is a measure of the strength and direction of the linear relationship between two continuous variables. It ranges from -1 to 1:
- +1: Perfect positive linear correlation.
- -1: Perfect negative linear correlation.
- 0: No linear correlation.
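Formula
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Where
- $r$ is the PMCC.
- $x$ and $y$ are the variables.
- $n$ is the number of data points.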
Interpretation
- Positive $r$: As $x$ increases, $y$ also tends to increase.
- Negative $r$: As $x$ increases, $y$ tends to decrease.
- Magnitude: The closer $|r|$ is to 1, the stronger the linear relationship.
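As a rough illustration of how $r$ is built from summary sums, here is a minimal Python sketch. The function name `pmcc` and the data values are hypothetical, chosen only so that the points lie exactly on a straight line; they are not taken from these notes.

```python
from math import sqrt

def pmcc(xs, ys):
    """Return Pearson's product-moment correlation coefficient r."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    # Summary sums used by the PMCC formula.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator

# Hypothetical data (an assumption for illustration): scores rise by
# exactly 2 marks per extra hour, so the points are perfectly linear.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

r = pmcc(hours, scores)
print(f"r = {r:.3f}")  # r = 1.000
```

Because the hypothetical points lie exactly on a line with positive gradient, the sketch returns $r = 1$; reversing the order of `scores` would give $r = -1$.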
Example: Suppose you have the following data on the number of hours studied ($x$) and exam scores ($y$):
| Hours Studied ($x$) | Exam Score ($y$) |
|---|---|
To calculate $r$, first compute the necessary sums: $n$, $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$ and $\sum y^2$.
Substitute into the PMCC formula:
Explanation: An $r$ value of 0.5 indicates a moderate positive linear relationship between hours studied and exam scores.
Non-linear Regression
Non-linear regression is used when the relationship between the variables is not linear. Unlike linear regression, which fits a straight line to the data, non-linear regression fits a curve.
Common Types of Non-linear Relationships
- Quadratic: $y = ax^2 + bx + c$
- Exponential: $y = ab^x$
- Logarithmic: $y = a + b\ln x$
- Power Law: $y = ax^b$
Example: Non-linear Regression (Quadratic Relationship)
Question: Consider a dataset where the relationship between $x$ (number of hours studied) and $y$ (exam score) is better described by a quadratic relationship:
| Hours Studied ($x$) | Exam Score ($y$) |
|---|---|
Step 1: Fit a Quadratic Model: The quadratic model takes the form:
$$y = ax^2 + bx + c$$
Using statistical software or manual methods (like least squares fitting), you determine the coefficients $a$, $b$, and $c$. For simplicity, assume the fitted model is:
Step 2: Evaluate the Fit:
- Plot the data and the curve: Plot $y$ against $x$ and overlay the fitted quadratic curve. Visually check how well the curve fits the data points.
- Calculate the residuals: The residuals (observed value minus predicted value) should be small and scattered randomly about zero, with no systematic pattern.
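Steps 1 and 2 can be sketched in plain Python as follows. This is only one possible approach, assuming a least squares fit obtained by solving the normal equations directly; statistical software would normally do this for you. The helper names (`solve_linear_system`, `fit_quadratic`, `residuals`) and the data values are hypothetical and are not taken from these notes.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix so that row operations update both sides together.
    aug = [list(map(float, row)) + [float(value)] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least 3 distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

def fit_quadratic(xs, ys):
    """Least squares coefficients (a, b, c) for y = a*x**2 + b*x + c."""
    sx = [sum(x ** p for x in xs) for p in range(5)]   # sx[p] = sum of x^p, sx[0] = n
    sxy = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(3)]
    # Normal equations for the unknowns a, b, c.
    matrix = [
        [sx[4], sx[3], sx[2]],
        [sx[3], sx[2], sx[1]],
        [sx[2], sx[1], sx[0]],
    ]
    rhs = [sxy[2], sxy[1], sxy[0]]
    a, b, c = solve_linear_system(matrix, rhs)
    return a, b, c

def residuals(xs, ys, a, b, c):
    """Observed value minus the value predicted by the fitted quadratic."""
    return [y - (a * x * x + b * x + c) for x, y in zip(xs, ys)]

# Hypothetical data (an assumption for illustration): scores rise, peak, then fall.
hours = [1, 3, 5, 7, 9, 11]
scores = [53, 75, 85, 91, 85, 75]

a, b, c = fit_quadratic(hours, scores)
res = residuals(hours, scores, a, b, c)
print(f"fitted model: y = {a:.3f}x^2 + {b:.3f}x + {c:.3f}")
for x, e in zip(hours, res):
    print(f"x = {x:2d}  residual = {e:+.2f}")
```

With a least squares fit that includes a constant term, the residuals always sum to (approximately) zero, so the thing to look for is a pattern in their signs rather than their total.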
Step 3: Interpretation:
- Turning Point: The quadratic model suggests that exam scores increase with study time up to a certain point (7 hours), after which more studying results in lower scores. This might indicate over-studying or fatigue.
- Better Fit: The quadratic curve likely fits the data better than a straight line, as it captures the peak and subsequent decline in scores.
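The turning point of a quadratic $y = ax^2 + bx + c$ lies at $x = -\frac{b}{2a}$. The short sketch below applies this to hypothetical coefficients, chosen purely so that the peak falls at 7 hours as in the interpretation above; they are an assumption and not the fitted model from these notes.

```python
def turning_point(a, b, c):
    """Return (x, y) at the turning point of y = a*x**2 + b*x + c."""
    if a == 0:
        raise ValueError("a = 0 gives a straight line, which has no turning point")
    x = -b / (2 * a)
    y = a * x * x + b * x + c
    return x, y

# Hypothetical coefficients (an assumption for illustration only).
a, b, c = -1, 14, 41

x_peak, y_peak = turning_point(a, b, c)
kind = "maximum" if a < 0 else "minimum"
print(f"{kind} at x = {x_peak}, y = {y_peak}")  # maximum at x = 7.0, y = 90.0
```

A negative $a$ means the curve opens downwards, so the turning point is a maximum: predicted scores rise up to that number of hours and fall afterwards.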
Summary
- PMCC: Measures the strength and direction of the linear relationship between two variables. An $r$ value close to 1 or -1 indicates a strong linear relationship, while an $r$ value close to 0 indicates no linear relationship.
- Non-linear Regression: Used when the relationship between variables isn't linear. Different models (quadratic, exponential, etc.) can be fitted to data to better capture the underlying relationship.
Understanding when to use PMCC versus non-linear regression is crucial for accurately analysing relationships between variables and making informed predictions based on data.
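To see why the choice matters, the sketch below computes $r$ for a hypothetical data set that follows a perfect quadratic curve. It uses the equivalent form $r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$; the function name and the data values are assumptions made for illustration.

```python
from math import sqrt

def pmcc_from_deviations(xs, ys):
    """PMCC computed as S_xy / sqrt(S_xx * S_yy)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical scores (out of 10) that follow y = 9 - (x - 3)^2 exactly.
hours = [1, 2, 3, 4, 5]
scores = [5, 8, 9, 8, 5]

r = pmcc_from_deviations(hours, scores)
print(r)  # 0.0
```

Here $y$ is completely determined by $x$, yet $r = 0$ because the rise and the fall cancel out. A value of $r$ near 0 rules out a linear relationship only; a non-linear model may still fit well.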