Probability & Significance (AQA A-Level Psychology): Revision Notes
Introduction to statistical testing
Statistical tests are essential tools in psychological research that help researchers determine whether their findings are meaningful. Every statistical test produces a calculated value - a numerical result that helps researchers decide if they have discovered something statistically important. This calculated value determines whether researchers should accept their alternative hypothesis or stick with the null hypothesis.
Understanding how these tests work requires grasping two related concepts: probability and significance. These concepts work together to help researchers make informed decisions about their data.
The relationship between probability and significance is fundamental to statistical testing. While probability tells us how likely something is to occur, significance helps us decide whether that probability is low enough to conclude we've found a meaningful effect.
Key definitions
Probability refers to how likely it is that a particular event will happen. This is measured on a scale where 0 means something is statistically impossible and 1 means it is statistically certain to occur.
Significance is a statistical concept that tells researchers how confident they can be that a difference or relationship actually exists in their data. When researchers find a significant result, it means they can confidently reject the null hypothesis.
Critical value represents the numerical cut-off point that separates acceptance from rejection of the null hypothesis. This boundary helps researchers decide whether their calculated value is significant enough to draw conclusions.
Type I error occurs when researchers incorrectly reject a true null hypothesis. This is also called a false positive because researchers claim to have found an effect when none actually exists.
Type II error happens when researchers fail to reject a false null hypothesis. This is known as a false negative because researchers miss a real effect that does exist.
Mastering these definitions is crucial for understanding statistical testing. Each term builds on the others, so ensure you understand the relationship between calculated values, critical values, and the different types of errors before proceeding.
Understanding hypotheses
Researchers start investigations by forming a hypothesis, which can be directional or non-directional depending on how confident they are about the expected outcome. For example, a directional hypothesis might predict that "participants drinking 300ml of SpeedUpp will say more words in five minutes than participants drinking 300ml of water."
This type of hypothesis is called the alternative hypothesis (H1) because it proposes an alternative to the null hypothesis (H0). The null hypothesis always states that there is 'no difference' between conditions (or, in a correlational study, no relationship between the variables).
Worked Example: Forming Hypotheses
Alternative Hypothesis (H1): "Participants drinking 300ml of SpeedUpp will say more words in five minutes than participants drinking 300ml of water."
Null Hypothesis (H0): "There is no difference in the number of words spoken in five minutes between participants who drink SpeedUpp and those who drink water."
Notice how the null hypothesis always predicts no difference or no effect, while the alternative hypothesis predicts a specific outcome.
The purpose of statistical testing is to determine which hypothesis is more likely to be correct, helping researchers decide whether to accept or reject the null hypothesis.
Significance levels and probability
Statistical tests operate on probability rather than absolute certainty. This is why researchers use a significance level - the point where they can claim to have discovered a noteworthy difference or correlation in their data.
The standard significance level in psychology is 0.05 (or 5%). This is properly written as p ≤ 0.05, where p represents probability. This means that when researchers report a significant result, there is a 5% or less probability that a result this extreme would have occurred if no real effect existed in the population (that is, if the null hypothesis were true).
The 5% significance level represents a balance between being too cautious (missing real effects) and being too liberal (finding false effects). This conventional level has been adopted across psychology as it provides reasonable protection against both types of errors while still allowing researchers to detect meaningful effects.
This 5% level exists because psychologists cannot be 100% certain about results since they haven't tested every member of a population under all possible circumstances. Instead, they have agreed on this conventional probability level as an acceptable risk that results might have occurred by chance.
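To see what p ≤ 0.05 means in practice, the following minimal sketch works out an exact probability under the null hypothesis. It assumes a simple design that the notes do not specify: 10 participants each try both drinks, and under H0 each participant is equally likely to do better with either one. The data (9 of 10 doing better with SpeedUpp) and the function name prob_at_least are invented for illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more 'successes' out of n when each has chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical data: 9 of 10 participants say more words after SpeedUpp.
# Under the null hypothesis each participant has a 0.5 chance of improving.
p_value = prob_at_least(9, 10)

# Compare against the conventional 5% significance level.
significant = p_value <= 0.05

print(round(p_value, 4), significant)
```

Here p is about 0.011 (11 chances in 1024), which is below 0.05, so a result this extreme would count as significant at the 5% level. It is still not impossible under the null hypothesis, which is why the conclusion is stated as a probability rather than a certainty.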
Using statistical tables and critical values
After calculating a statistical test, researchers must compare their calculated value with a critical value found in statistical tables. These tables, created by statisticians, help determine whether results are significant enough to reject the null hypothesis.
Different statistical tests have different rules. For some tests, the calculated value must equal or exceed the critical value to be significant. For others, the calculated value must be equal to or less than the critical value.
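The two comparison rules can be sketched as a small helper. The function name is_significant, the rule labels, and the numbers used below are all invented for illustration; real critical values must be taken from the table for the test being used.

```python
def is_significant(calculated, critical, rule):
    """Apply the comparison rule that belongs to the chosen test.

    rule="at_least": calculated value must equal or exceed the critical value
                     (for example, Spearman's rho).
    rule="at_most":  calculated value must be equal to or less than the
                     critical value (for example, the sign test).
    """
    if rule == "at_least":
        return calculated >= critical
    if rule == "at_most":
        return calculated <= critical
    raise ValueError("rule must be 'at_least' or 'at_most'")

# Illustrative numbers only, not taken from a real table.
print(is_significant(0.71, 0.564, "at_least"))  # calculated exceeds critical
print(is_significant(3, 1, "at_most"))          # calculated is above critical
```

The first call returns True and the second returns False. The same pair of numbers can lead to opposite conclusions depending on the rule, so always check which rule applies to the test before comparing values.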
Criteria for using statistical tables
Researchers need three pieces of information to find the correct critical value:
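One-tailed or two-tailed test: Use a one-tailed test when the hypothesis is directional and a two-tailed test for non-directional hypotheses. A two-tailed test is more conservative because the probability is split across both directions, so in the table a given critical value corresponds to double the probability level of the one-tailed test (for example, the column for 0.05 one-tailed is also the column for 0.10 two-tailed).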
Number of participants: This usually appears as the N value in statistical tables. Some tests use degrees of freedom (df) calculations instead of participant numbers.
Significance level: As discussed, 0.05 is the standard level in psychological research, though researchers may sometimes use more stringent levels.
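Statistical tables are built from probability calculations like the one sketched below, which shows how the three pieces of information combine to give a critical value. It assumes the sign test, chosen only because its probabilities are easy to compute; the function name sign_test_critical_value is hypothetical. In an exam you would read the value from the printed table rather than calculate it.

```python
from math import comb

def sign_test_critical_value(n, alpha=0.05, tails=1):
    """Largest count S of the less frequent sign that is still significant.

    n:     number of participants (ignoring any with no change)
    alpha: significance level, e.g. 0.05
    tails: 1 for a directional hypothesis, 2 for a non-directional one
    Returns None when no value of S reaches significance.
    """
    critical = None
    cumulative = 0.0
    for s in range(n + 1):
        # Probability of exactly s signs under the null (chance = 0.5 each).
        cumulative += comb(n, s) * 0.5**n
        # A two-tailed test counts both directions, doubling the probability.
        if tails * cumulative <= alpha:
            critical = s
        else:
            break
    return critical

print(sign_test_critical_value(8, alpha=0.05, tails=1))
print(sign_test_critical_value(8, alpha=0.05, tails=2))
```

With 8 participants at the 0.05 level, the one-tailed critical value is 1 and the two-tailed critical value is 0. This shows why the two-tailed test is the more conservative choice: the result has to be more extreme to reach significance.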
Lower significance levels
Sometimes researchers use more stringent significance levels like 0.01 (1%). This might occur in studies with potential human costs, such as drug trials, or in 'one-off' studies that cannot be practically repeated. When there is a large difference between calculated and critical values in the preferred direction, researchers often check these more stringent levels because lower p values indicate more statistically robust results.
Type I and Type II errors explained
Because researchers can never achieve 100% certainty in their findings, there is always a possibility that they have reached the wrong conclusion. At the 5% significance level, the chance of wrongly rejecting a true null hypothesis is up to 5%.
Type I errors
A Type I error occurs when researchers reject the null hypothesis and accept the alternative hypothesis, but the null hypothesis was actually correct. This creates an optimistic error or false positive - researchers claim they found a significant difference or correlation when none actually exists.
Type I Error (False Positive)
- What happens: Researchers reject a true null hypothesis
- Result: They claim to find an effect that doesn't actually exist
- Risk: Increased with lenient significance levels (like 0.1 or 10%)
- Remember: "Optimistic error" - researchers are overly optimistic about their findings
Type II errors
A Type II error is the opposite situation. Here, researchers accept the null hypothesis when they should have accepted the alternative hypothesis because the alternative was actually correct. This creates a pessimistic error or false negative - researchers miss a real effect.
Type II Error (False Negative)
- What happens: Researchers accept a false null hypothesis
- Result: They miss a real effect that actually exists
- Risk: Increased with stringent significance levels (like 0.01 or 1%)
- Remember: "Pessimistic error" - researchers are overly pessimistic and miss real effects
Balancing error risks
Researchers are more likely to make Type I errors if their significance level is too lenient (such as 0.1 or 10%). Conversely, Type II errors become more likely when significance levels are too stringent (such as 0.01 or 1%) because potentially meaningful values might be missed.
Psychologists favour the 5% significance level because it provides the best balance between risking Type I and Type II errors.
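The trade-off between the two errors can be illustrated with a rough calculation. The sketch below assumes a sign-test style design with 20 participants, a directional hypothesis, and a real effect in which each participant has a 75% chance of improving. All of these figures, and the function names, are invented for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes out of n with chance p each."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def error_rates(n, alpha, true_p):
    """Return (Type I rate, Type II rate) for a one-tailed test at level alpha."""
    # Find the smallest cut-off k where P(k or more improve | null) <= alpha.
    for k in range(n + 2):
        type_1 = sum(binom_pmf(i, n, 0.5) for i in range(k, n + 1))
        if type_1 <= alpha:
            break
    # Type II: the effect is real, but fewer than k participants improve.
    type_2 = sum(binom_pmf(i, n, true_p) for i in range(0, k))
    return type_1, type_2

# Compare a lenient, a conventional, and a stringent significance level.
results = {alpha: error_rates(20, alpha, 0.75) for alpha in (0.10, 0.05, 0.01)}
for alpha, (type_1, type_2) in results.items():
    print(alpha, round(type_1, 3), round(type_2, 3))
```

Under these assumptions, moving from 0.10 to 0.05 to 0.01 lowers the Type I rate but raises the Type II rate each time. The exact figures depend entirely on the invented sample size and effect, so treat the pattern, not the numbers, as the point.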
Practical understanding
It's important to understand that statistical testing deals with probabilities, not certainties. When researchers use the 5% significance level, they accept that even significant results have up to a 5% chance of occurring when the null hypothesis is true (meaning no real effect exists in the population).
However, it would be incorrect to claim "95% certainty that the result did not occur by chance" because this phrase contains a contradiction: certainty means being 100% sure, so it cannot be 95%, and statistical testing deals with probabilities rather than certainties.
The distinction between probability and certainty is crucial in statistical interpretation. Statistical tests provide evidence for or against hypotheses based on probability calculations, but they never provide absolute proof. This is why we talk about "statistical significance" rather than "statistical certainty."
Summary
Key Points to Remember:
- Statistical tests produce calculated values that must be compared with critical values to determine significance
- The standard significance level in psychology is p ≤ 0.05 (5%)
- Type I errors are false positives (rejecting true null hypothesis), while Type II errors are false negatives (accepting false null hypothesis)
- Statistical testing works on probability, not certainty - there is always some possibility that results occurred by chance
- The 5% significance level balances the risk of both types of errors effectively