Choosing a Statistical Test Revision Notes for AQA A-Level Psychology

Choosing a Statistical Test

Statistical testing helps researchers determine whether observed differences or relationships in their data are genuine findings or merely due to chance. While descriptive statistics provide useful summaries of data through measures of central tendency and dispersion, they cannot tell us whether findings are statistically significant. This is where inferential statistical tests become essential.

Purpose of statistical testing

Statistical tests are used to analyse whether differences or correlations found in research are statistically significant, meaning they are unlikely to have occurred by chance alone. The results help researchers decide whether to reject or retain the null hypothesis, which states that no real difference or relationship exists between variables.

Note

Statistical significance doesn't prove that a finding is practically important or meaningful - it simply indicates that the result is unlikely to be due to random chance alone.
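To make "unlikely to be due to random chance alone" concrete, the sketch below works out a chance probability for the sign test, one of the tests covered later in these notes. The function name and the example counts are hypothetical, and comparing the result with 0.05 is a common convention rather than something stated in these notes.

```python
from math import comb

def sign_test_p_value(n_plus, n_minus):
    """Two-tailed probability of a split at least this uneven
    if each participant were equally likely to improve or worsen."""
    n = n_plus + n_minus              # participants with no change are left out
    s = min(n_plus, n_minus)          # S = frequency of the less common sign
    # Probability of S or fewer of the less common sign under a 50/50 chance model
    tail = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    return min(1.0, 2 * tail)         # double for a two-tailed test, cap at 1

# Hypothetical data: 9 participants improved, 1 got worse
p = sign_test_p_value(9, 1)
print(round(p, 4))  # 0.0215
```

A probability this small suggests the 9-to-1 split would rarely arise by chance, although, as noted above, that alone says nothing about how large or important the effect is.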

Three key factors for test selection

When selecting an appropriate statistical test, researchers must consider three essential factors that will guide their decision-making process:

Important

The Three Essential Factors:

Whether they are looking for a difference or correlation
The experimental design being used (if testing for differences)
The level of measurement of the data

All three factors must be carefully considered to select the most appropriate statistical test for your research question.

Factor 1: Difference or correlation?

The first consideration relates to the research aim. Researchers typically investigate either:

Differences between groups or conditions (e.g., comparing memory performance between two age groups)
Correlations or associations between variables (e.g., examining the relationship between stress levels and academic performance)

This distinction should be clear from the research hypothesis. Note that "correlation" in this context includes both correlational analyses and investigations examining associations between categorical variables.

Note

If your research question asks "Is there a difference between..." or "Do groups differ in...", you're testing for differences. If it asks "Is there a relationship between..." or "Are variables associated...", you're testing for correlation.

Factor 2: Experimental design

This factor only applies when testing for differences. Researchers must identify whether their study uses:

Related designs:

Repeated measures: Same participants take part in all conditions
Matched pairs: Different participants in each condition who have been matched on important variables

Unrelated design:

Independent groups: Completely different participants in each condition with no matching

The key distinction is whether participants across conditions are connected in some meaningful way (related) or are entirely separate (unrelated). If investigating correlations rather than differences, this factor becomes irrelevant.

Important

Remember: If you're testing for correlation or association, experimental design doesn't matter - you can skip this factor and move directly to considering your level of measurement.

Factor 3: Levels of measurement

Data can be classified into three distinct levels of measurement, each with different characteristics and statistical requirements:

Nominal data

Nominal data consists of categories or labels where items can only belong to one group. Sometimes called categorical data, this represents the most basic level of measurement.

Key features:

Data appears as frequencies within categories
Items are discrete - they can only appear in one category
Cannot calculate a meaningful mean, median or range (the mode is the only appropriate average)

Example

Examples of Nominal Data:

Favourite colours (red, blue, green, yellow)
Preferred transport methods (car, bus, train, bicycle)
Gender categories (male, female, non-binary)
Political party preference (Conservative, Labour, Liberal Democrat)
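As a small illustration of how nominal data is summarised, the sketch below counts how often each category appears. The responses are made up for the example; the point is that the only sensible summaries are frequencies and the most common category.

```python
from collections import Counter

# Hypothetical survey responses: preferred transport method
responses = ["car", "bus", "car", "train", "bicycle", "car", "bus"]

# Nominal data is summarised as a frequency count per category
frequencies = Counter(responses)
print(frequencies["car"], frequencies["bus"])  # 3 2

# The most frequent category is the mode
mode_category, mode_count = frequencies.most_common(1)[0]
print(mode_category, mode_count)  # car 3
```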

Ordinal data

Ordinal data involves ranking or ordering items along a scale where the position matters, but the intervals between positions are not equal.

Key features:

Data can be arranged in order from lowest to highest
Intervals between units are not equal in size
Often based on subjective judgements rather than objective measurements
Sometimes called "unsafe data" due to lack of precision

For statistical testing, ordinal data is converted to ranks (1st, 2nd, 3rd, etc.) rather than using the original scores, because the gaps between raw scores cannot be assumed to be equal.
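The conversion to ranks can be sketched as follows. This is a minimal illustration with made-up ratings, and it assumes the usual convention that tied scores share the average of the rank positions they occupy; the function name is hypothetical.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) upwards, giving tied scores the mean of their positions."""
    positions = {}
    for position, value in enumerate(sorted(scores), start=1):
        positions.setdefault(value, []).append(position)
    # Each score receives the average of the positions held by that value
    return [sum(positions[v]) / len(positions[v]) for v in scores]

# Hypothetical satisfaction ratings from five participants
ratings = [7, 3, 9, 3, 5]
print(average_ranks(ratings))  # [4.0, 1.5, 5.0, 1.5, 3.0]
```

The two ratings of 3 occupy positions 1 and 2, so each is given the rank 1.5.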

Example

Examples of Ordinal Data:

Satisfaction ratings on a 1-10 scale
Agreement levels (strongly disagree, disagree, neutral, agree, strongly agree)
Competition rankings (1st place, 2nd place, 3rd place)
Education levels (GCSE, A-Level, Undergraduate, Postgraduate)

Interval data

Interval data represents the most sophisticated level of measurement, using numerical scales with equal, precisely defined units.

Key features:

Based on standardised units of measurement (time, weight, temperature)
Equal intervals between all points on the scale
Preserves maximum detail and precision
Required for parametric statistical tests

Think of interval data as measurements you could take with scientific instruments like stopwatches, thermometers, or weighing scales - these produce objective, standardised measurements.

Example

Examples of Interval Data:

Reaction times measured in milliseconds
Test scores based on objective marking criteria
Physical measurements (height, weight, temperature)
Age measured in years or months

Statistical test selection table

The following table provides a systematic approach to selecting the appropriate statistical test based on your three key factors:

Data Level	Test of Difference (Unrelated)	Test of Difference (Related)	Test of Association/Correlation
Nominal	Chi-squared	Sign test	Chi-squared
Ordinal	Mann-Whitney	Wilcoxon	Spearman's rho
Interval	Unrelated t-test	Related t-test	Pearson's r

Note

Key points about the table:

Chi-squared can test both differences and associations, but always requires unrelated/independent data
The three tests in the bottom row (unrelated t-test, related t-test, and Pearson's r) are parametric tests
All other tests are non-parametric tests
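The table can also be read as a simple lookup driven by the three factors. The sketch below is one way to express that; the function name, argument names and string labels are illustrative choices, not part of the specification.

```python
# Lookup table copied from the notes: level of measurement -> column -> test
TEST_TABLE = {
    "nominal": {"unrelated": "Chi-squared", "related": "Sign test", "correlation": "Chi-squared"},
    "ordinal": {"unrelated": "Mann-Whitney", "related": "Wilcoxon", "correlation": "Spearman's rho"},
    "interval": {"unrelated": "Unrelated t-test", "related": "Related t-test", "correlation": "Pearson's r"},
}

# Factor 2: which experimental designs count as related or unrelated
DESIGN_TYPE = {
    "repeated measures": "related",
    "matched pairs": "related",
    "independent groups": "unrelated",
}

def choose_test(aim, level, design=None):
    """Return the test for a given aim, level of measurement and (if needed) design."""
    if level not in TEST_TABLE:
        raise ValueError(f"Unknown level of measurement: {level!r}")
    if aim == "correlation":
        # Experimental design is irrelevant for correlation or association
        return TEST_TABLE[level]["correlation"]
    if aim == "difference":
        if design not in DESIGN_TYPE:
            raise ValueError(f"Unknown experimental design: {design!r}")
        return TEST_TABLE[level][DESIGN_TYPE[design]]
    raise ValueError(f"Aim must be 'difference' or 'correlation', got {aim!r}")

print(choose_test("difference", "ordinal", "independent groups"))  # Mann-Whitney
print(choose_test("difference", "nominal", "repeated measures"))   # Sign test
print(choose_test("correlation", "interval"))                      # Pearson's r
```

Working through the function in the same order as the three factors (aim, then design, then level) mirrors how an exam answer would justify the choice of test.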

Understanding parametric vs non-parametric tests

Parametric tests (unrelated t-test, related t-test, Pearson's r) require interval-level data and make assumptions about the underlying distribution of scores in the population, typically that the scores are normally distributed and that the sets of scores being compared have similar variances. They are considered more powerful and more sensitive in detecting genuine effects.

Non-parametric tests make fewer assumptions about the data and can be used with ordinal or nominal data. They are more robust but generally less sensitive than parametric alternatives.

Important

Choosing Between Test Types:

Use parametric tests when you have interval data and can meet their assumptions - they're more likely to detect real effects
Use non-parametric tests when you have nominal/ordinal data or when parametric assumptions are violated
When in doubt, non-parametric tests are the safer choice
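The rule of thumb above can be written as a short check. This is only a sketch: the two assumption flags (normally distributed scores and similar variances) are the criteria commonly listed for parametric tests and are treated here as assumptions, and the function name is hypothetical.

```python
def recommend_test_family(level, normal_distribution, similar_variances):
    """Suggest 'parametric' only when the data is interval level and both assumptions hold."""
    if level == "interval" and normal_distribution and similar_variances:
        return "parametric"
    # Nominal or ordinal data, or any violated assumption: take the safer option
    return "non-parametric"

print(recommend_test_family("interval", True, True))   # parametric
print(recommend_test_family("interval", False, True))  # non-parametric
print(recommend_test_family("ordinal", True, True))    # non-parametric
```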

Memory aid for test selection

A mnemonic can help you remember the tests in the order they appear in the table, reading across each row from nominal down to interval:

Note

"Carrots Should Come Mashed With Swede Under Roast Potatoes"

This corresponds to:

Chi-squared
Sign test
Chi-squared
Mann-Whitney
Wilcoxon
Spearman's rho
Unrelated t-test
Related t-test
Pearson's r
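As a quick self-check, the first letter of each word in the mnemonic should match the first letter of each test when the table is read row by row. A minimal sketch of that check:

```python
# Tests in table order: nominal row, then ordinal row, then interval row
tests_in_table_order = [
    "Chi-squared", "Sign test", "Chi-squared",
    "Mann-Whitney", "Wilcoxon", "Spearman's rho",
    "Unrelated t-test", "Related t-test", "Pearson's r",
]
mnemonic = "Carrots Should Come Mashed With Swede Under Roast Potatoes"

# Compare the initial letter of each word with the initial letter of each test
word_initials = [word[0] for word in mnemonic.split()]
test_initials = [test[0] for test in tests_in_table_order]
initials_match = word_initials == test_initials
print(initials_match)  # True
```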

Data classification challenges

In psychology, classifying data types can sometimes be ambiguous and requires careful consideration of the underlying measurement properties.

Important

Common Classification Dilemma: "Number of words recalled" in a memory test could theoretically be interval data if all words are equally difficult to remember. However, since some words are naturally more memorable than others, it's often safer to treat such data as ordinal and use appropriate non-parametric tests.

Always provide clear reasoning when determining the level of measurement for your data, as this decision directly impacts which statistical test is appropriate. When classification is unclear, err on the side of caution and choose the more conservative (lower level) classification.

Summary

Key Points to Remember:

Statistical tests indicate whether findings are statistically significant or likely to be due to chance
Three factors guide test selection: difference vs correlation, experimental design (related vs unrelated), and level of measurement
Data levels: Nominal data uses categories, ordinal data involves ranking, interval data has equal units
Parametric tests (t-tests, Pearson's r) require interval data and are more powerful
Chi-squared tests both differences and associations in nominal data and always requires independent data
When in doubt about data level, justify your reasoning clearly and choose the lower level of measurement
