Handling Data Revision Notes for AQA A-Level Biology

Statistical Tests

Statistical tests help biologists determine whether observed differences in data are due to genuine effects or simply random chance. In biology, the conventional threshold is 5%: a probability greater than 5% means the results could plausibly be due to chance alone, while a probability of 5% or below indicates that the data differ significantly, suggesting that a real cause is influencing the outcome.

Understanding statistical significance

Statistical significance occurs when the probability that results are due to chance alone is 5% or less (p ≤ 0.05). This means we can be at least 95% confident that observed differences represent real effects rather than random variation.

The null hypothesis assumes there is no significant difference between observed and expected results. Statistical tests help determine whether to accept or reject this null hypothesis based on calculated probability values.

Important

The 5% Rule: Results with p ≤ 0.05 are considered statistically significant, meaning there is a 5% chance or less that they occurred due to random variation alone. This is the standard threshold used in biological research.

Chi-squared (χ²) test

The chi-squared test compares patterns in collected data with patterns expected by chance. This test determines how much observed frequencies deviate from expected frequencies.

When to use chi-squared tests

Use chi-squared tests when comparing observed frequencies with expected frequencies, particularly for checking genetic cross results. A non-genetic example is testing whether a die is fair by comparing the number of times each face is actually thrown with the equal frequencies expected for each number.

Note

Chi-squared tests are particularly useful in genetics for testing whether the results of experimental crosses match predicted Mendelian ratios, such as 3:1 or 9:3:3:1.

Chi-squared formula and calculation

The formula for the chi-squared test is:

$\chi^2 = \sum \frac{(O-E)^2}{E}$

Where:

O = observed values
E = expected values
Σ = sum of all categories

Example

Worked Example: Genetic Cross Analysis

Consider a cross between two heterozygous tall plants, where the expected ratio of offspring is 3 tall : 1 short. In the actual cross, 69 tall and 28 short plants were observed, giving 97 plants in total.

Expected values: 97 × 3/4 = 72.75 tall plants and 97 × 1/4 = 24.25 short plants

Category	Observed (O)	Expected (E)	(O-E)	(O-E)²	(O-E)²/E
Tall plants	69	72.75	-3.75	14.06	0.19
Short plants	28	24.25	3.75	14.06	0.58
Total					χ² = 0.77

Interpreting chi-squared results

Calculate degrees of freedom as the number of categories minus 1. Here: 2 categories - 1 = 1 degree of freedom.

Compare the calculated χ² value (0.77) with the critical value from statistical tables. For 1 degree of freedom at p = 5%, the critical value is 3.84. Since 0.77 < 3.84, the probability is between 10% and 50%, so we accept the null hypothesis - no significant difference exists between observed and expected results.

Important

Critical Point: If your calculated χ² value is greater than the critical value from the table, the result is statistically significant and you reject the null hypothesis.
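The genetic cross example above can be checked with a few lines of Python. This is a minimal sketch rather than a standard library routine: the function name `chi_squared` and the variable names are illustrative, and the critical value 3.84 is the one quoted in these notes for 1 degree of freedom at p = 0.05.

```python
def chi_squared(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [69, 28]   # tall, short plants counted in the worked example
ratio = [3, 1]        # expected 3:1 ratio

# Share the total out according to the expected ratio.
total = sum(observed)
expected = [total * part / sum(ratio) for part in ratio]

chi2 = chi_squared(observed, expected)
degrees_of_freedom = len(observed) - 1   # number of categories - 1

CRITICAL_VALUE = 3.84   # 1 degree of freedom, p = 0.05 (from the notes)
significant = chi2 > CRITICAL_VALUE

print(f"expected = {expected}")
print(f"chi-squared = {chi2:.2f}, df = {degrees_of_freedom}, significant: {significant}")
# chi-squared = 0.77, df = 1, significant: False
```

Because 0.77 is below 3.84, the script reports no significant difference, in agreement with the table-based conclusion.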

Student t test

The Student t test judges whether differences between means of two data sets are statistically significant. This test requires normally distributed data with sufficient sample sizes (ideally more than 15 in each group).

When to use t tests

Use t tests when comparing means from two independent groups to determine if observed differences are statistically significant. Sample sizes do not need to be equal between groups.

Note

The t test assumes your data follow a normal distribution. Always check this assumption before applying the test, especially with smaller sample sizes.

Student t test formula and calculation

The formula for an unpaired t test is:

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Where:

$\bar{x}_1$ and $\bar{x}_2$ = means of each group
$s_1^2$ and $s_2^2$ = variances of each group
$n_1$ and $n_2$ = sample sizes of each group

Example

Worked Example: Limpet Diameter Comparison

Comparing limpet diameters between east-facing and west-facing sites:

Site	n	Mean diameter (mm)	Variance (s²)
East	28	35.64	77.17
West	28	37.36	74.4

Step 1: Calculate the t value

$t = \frac{35.64 - 37.36}{\sqrt{\frac{77.17}{28} + \frac{74.4}{28}}} = \frac{-1.72}{2.33} = -0.74$

The sign only shows which mean is larger. Ignoring the sign, t = 0.74.

Step 2: Calculate degrees of freedom

Degrees of freedom = (n₁ + n₂) - 2 = 54

Step 3: Interpret results

From statistical tables, a t value of 0.74 at 54 degrees of freedom corresponds to a probability of more than 10% that the difference is due to chance. Therefore, we accept the null hypothesis - no significant difference exists between the sites.
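The limpet calculation can be reproduced from the summary statistics in the table. The sketch below applies the unpaired t formula given above. The function name `unpaired_t` is illustrative. The critical value of about 2.00 for 54 degrees of freedom at p = 0.05 is an assumption taken from standard two-tailed t tables and is not quoted in these notes.

```python
import math


def unpaired_t(mean1, var1, n1, mean2, var2, n2):
    """t = (mean1 - mean2) / sqrt(var1/n1 + var2/n2)."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


# Summary statistics from the worked example (east site first, then west).
t = unpaired_t(35.64, 77.17, 28, 37.36, 74.4, 28)
df = (28 + 28) - 2

# Assumed critical value: two-tailed, p = 0.05, about 54 degrees of freedom.
CRITICAL_T = 2.00
significant = abs(t) > CRITICAL_T   # tables list positive values only

print(f"t = {t:.2f}, |t| = {abs(t):.2f}, df = {df}, significant: {significant}")
# t = -0.74, |t| = 0.74, df = 54, significant: False
```

The negative sign simply reflects that the east-facing mean is the smaller of the two. Only the magnitude is compared with the table.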

Correlation coefficient (Pearson's product moment correlation coefficient)

The correlation coefficient (r) measures the strength and direction of linear relationships between two variables. Values range from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no correlation.

When to use correlation analysis

Use correlation analysis when examining relationships between two continuous variables. Always plot data on scatter graphs first to visualise potential relationships before calculating correlation coefficients.

Note

Remember: Correlation does not imply causation. A strong correlation between two variables doesn't necessarily mean one causes the other.

Correlation coefficient formula and calculation

The formula is:

$r = \frac{\sum (x-\bar{x}) \times (y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \times \sum (y-\bar{y})^2}}$

Where:

x = values of first variable, $\bar{x}$ = mean of first variable
y = values of second variable, $\bar{y}$ = mean of second variable
Σ = sum of all values

Example

Worked Example: Seed Mass and Wrinkles Correlation

Investigating correlation between horse chestnut seed mass and number of wrinkles:

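Data: 6 seeds with masses 12 g, 10 g, 8 g, 6 g, 4 g and 2 g, and 1, 3, 8, 15, 27 and 36 wrinkles respectively.

After tabulated calculations ($\bar{x} = 7$, $\bar{y} = 15$): $\sum (x-\bar{x})(y-\bar{y}) = -254$, $\sum (x-\bar{x})^2 = 70$ and $\sum (y-\bar{y})^2 = 974$, so r = -254/√(70 × 974) = -254/261 = -0.97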

Interpretation: This indicates a strong negative correlation between seed mass and number of wrinkles.

Degrees of freedom = n - 2 = 4

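From correlation tables, the critical value of r at 4 degrees of freedom is 0.917 for p = 0.01. Ignoring the sign, r = 0.97 exceeds this value, so p < 0.01 (it falls just short of the p = 0.001 critical value of 0.974). The probability of obtaining a correlation this strong by chance alone is therefore less than 1%, and the correlation is statistically significant.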
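The tabulated calculation for the seed data can be sketched in Python as follows. The function name `pearson_r` is illustrative. It follows the formula given above term by term instead of calling a statistics library.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient for paired values."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the two means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


masses = [12, 10, 8, 6, 4, 2]       # seed mass in g
wrinkles = [1, 3, 8, 15, 27, 36]    # number of wrinkles per seed

r = pearson_r(masses, wrinkles)
df = len(masses) - 2                # number of pairs - 2

print(f"r = {r:.2f}, df = {df}")
# r = -0.97, df = 4
```

The value of r is negative because wrinkle number falls as seed mass rises. Its magnitude is what gets compared with the correlation table.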

Using statistical tables effectively

Understanding how to calculate and use degrees of freedom is crucial for interpreting statistical results correctly.

Degrees of Freedom Calculations:

Chi-squared: number of categories - 1
t test (unpaired): (n₁ + n₂) - 2
Correlation: n - 2

Compare calculated values with critical values from appropriate statistical tables. If calculated values exceed critical values at p = 0.05, results are statistically significant.

Important

Table Usage Tip: Always convert negative t or r values to positive when using tables, as statistical tables show absolute values only.
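The degrees of freedom rules and the table comparison above can be summarised as small helper functions. This is only an illustrative sketch: all function names are hypothetical, and the critical value of 2.00 used for the t test is an assumed figure from standard two-tailed tables, not one quoted in these notes.

```python
def df_chi_squared(n_categories):
    """Chi-squared: number of categories - 1."""
    return n_categories - 1


def df_unpaired_t(n1, n2):
    """Unpaired t test: (n1 + n2) - 2."""
    return (n1 + n2) - 2


def df_correlation(n_pairs):
    """Correlation coefficient: number of pairs - 2."""
    return n_pairs - 2


def is_significant(calculated, critical):
    """Compare the magnitude of a calculated value with the table's critical value."""
    return abs(calculated) > critical


print(df_chi_squared(2), df_unpaired_t(28, 28), df_correlation(6))
# 1 54 4
print(is_significant(0.77, 3.84))    # chi-squared example from the notes: False
print(is_significant(-0.74, 2.00))   # limpet t value, assumed critical value: False
```

Taking the absolute value inside `is_significant` mirrors the tip above about converting negative t or r values before using a table.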

Summary

Key Points to Remember:

Statistical significance occurs when p ≤ 0.05 (5% probability or less)
Chi-squared tests compare observed vs expected frequencies using $\chi^2 = \sum \frac{(O-E)^2}{E}$
Student t tests compare means between two groups and require normally distributed data
Correlation coefficients measure relationship strength between variables, ranging from -1 to +1
Always calculate correct degrees of freedom and use appropriate statistical tables to interpret results
