Analysis of Data (Edexcel A-Level Psychology): Revision Notes
Chi-squared test
The chi-squared test (χ²) is a statistical test used to analyse nominal data. It examines whether there is an association or difference between two or more independent groups. This test is particularly useful when researchers want to determine if observed patterns differ from what would be expected by chance.
When to use a chi-squared test
A chi-squared test is appropriate when three conditions are met:
- Testing for differences or associations: When you have a hypothesis predicting a difference or association between groups. For example, a researcher might predict that boys and girls show differences in their toy choices at nursery.
- Nominal level of measurement: When your data is categorical rather than numerical. For instance, using two categories such as 'stereotypical' and 'non-stereotypical' toys rather than numerical scores.
- Independent measures design: When you have at least two different groups being studied, such as boys and girls, rather than the same participants tested repeatedly.
The chi-squared test is specifically designed for situations where you have categorical data from independent groups. All three conditions above must be met for the test to be appropriate.
Understanding the chi-squared test
The chi-squared test compares two or more independent sets of data to determine if they differ or are related. It uses nominal data, where results are grouped into categories. The test gathers frequency data (how many occurrences within each group) rather than scores or measurements.
The data is organised into a contingency table, which displays the observed frequency for each category within each group. The test then examines whether the groups differ, or whether group and category are associated, beyond what chance would produce.
Formula for chi-squared
χ² = Σ(O - E)² / E
Where:
- O = observed frequency (the actual data collected)
- E = expected frequency (what would be expected by chance)
- Σ = sum of all the calculations
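As a rough illustration, the statistic can be computed in Python by summing (O - E)² / E over every cell. The function name `chi_squared` and the flat-list inputs are assumptions made for this sketch; the frequencies are those of the toy-choice worked example in these notes.

```python
def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Frequencies from the toy-choice example (cells A, B, C, D).
observed = [8, 12, 17, 3]
expected = [12.5, 7.5, 12.5, 7.5]
print(round(chi_squared(observed, expected), 2))  # 8.64
```

Each cell contributes more to the total the further its observed frequency lies from its expected frequency.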
Calculating chi-squared: step-by-step procedure
Step 1: Create a contingency table
Organise your observed data into a table with rows and columns. A 2 × 2 contingency table has two rows and two columns, though larger tables (such as 3 × 2) are possible depending on your research design.
Step 2: Calculate expected frequency for each cell
Use the formula:
E = (row total × column total) / overall total
Calculate this for every cell in the contingency table.
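A minimal sketch of this step, assuming the contingency table is held as a list of rows (the helper name `expected_frequencies` is hypothetical): each expected frequency is the cell's row total multiplied by its column total, divided by the overall total.

```python
def expected_frequencies(table):
    """Return (row total x column total) / overall total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]


# Rows: girls, boys. Columns: stereotypical, non-stereotypical.
toy_choices = [[8, 12], [17, 3]]
print(expected_frequencies(toy_choices))  # [[12.5, 7.5], [12.5, 7.5]]
```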
Step 3: Subtract expected value from observed value
For each cell, calculate O - E (observed minus expected).
Step 4: Square the result
Calculate (O - E)² for each cell.
Step 5: Divide by expected frequency
For each cell, calculate:
(O - E)² / E
Step 6: Sum all values
Add all the values from Step 5 to obtain the observed chi-squared value (χ²).
Step 7: Calculate degrees of freedom
Use the formula:
df = (rows - 1) × (columns - 1)
For a 2 × 2 table: df = (2-1) × (2-1) = 1
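The same calculation can be sketched in Python, assuming the table is a list of rows that excludes the total row and total column. The function name `degrees_of_freedom` is hypothetical, and the 3 × 2 table holds placeholder zeros because only its shape matters here.

```python
def degrees_of_freedom(table):
    """Return (rows - 1) x (columns - 1) for a contingency table."""
    rows = len(table)
    columns = len(table[0])
    return (rows - 1) * (columns - 1)


two_by_two = [[8, 12], [17, 3]]            # toy-choice example
three_by_two = [[0, 0], [0, 0], [0, 0]]    # placeholder counts, shape only
print(degrees_of_freedom(two_by_two))      # 1
print(degrees_of_freedom(three_by_two))    # 2
```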
Step 8: Find the critical value
Look up the critical value in a chi-squared table using:
- The degrees of freedom
- Your chosen significance level (typically p = 0.05)
- Whether you have a directional (one-tailed) or non-directional (two-tailed) hypothesis
Step 9: Compare values
Compare the observed chi-squared value with the critical value. If the observed value is equal to or greater than the critical value, the result is statistically significant.
Common mistake to avoid: Unlike some other statistical tests, for chi-squared the observed value must be equal to or greater than the critical value for significance. Don't confuse this with tests where the observed value needs to be less than the critical value.
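The decision rule can be written as a single comparison. In this sketch the function name `is_significant` is an assumption, and 2.0 is a made-up observed value used only to show the non-significant case.

```python
def is_significant(observed_value, critical_value):
    """Chi-squared is significant when observed >= critical."""
    return observed_value >= critical_value


print(is_significant(8.64, 2.71))  # True: observed exceeds critical
print(is_significant(2.71, 2.71))  # True: equal also counts as significant
print(is_significant(2.0, 2.71))   # False: hypothetical smaller value
```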
Worked example: children and toy choice
A researcher investigated whether four-year-old children show stereotypical choices in the toys they play with at nursery.
Table: Children and toy choice
| | Stereotypical toy | Non-stereotypical toy | Row total |
|---|---|---|---|
| Girls | 8 (Cell A) | 12 (Cell B) | 20 |
| Boys | 17 (Cell C) | 3 (Cell D) | 20 |
| Column total | 25 | 15 | 40 (Overall total) |
Calculating expected frequencies:
- Cell A = (20 × 25) / 40 = 12.5
- Cell B = (20 × 15) / 40 = 7.5
- Cell C = (20 × 25) / 40 = 12.5
- Cell D = (20 × 15) / 40 = 7.5
Calculating O - E for each cell:
- Cell A = 8 - 12.5 = -4.5
- Cell B = 12 - 7.5 = 4.5
- Cell C = 17 - 12.5 = 4.5
- Cell D = 3 - 7.5 = -4.5
Calculating (O - E)² for each cell:
- All cells = 20.25, because (4.5)² and (-4.5)² both equal 20.25
Calculating (O - E)² / E for each cell:
- Cell A = 20.25 / 12.5 = 1.62
- Cell B = 20.25 / 7.5 = 2.7
- Cell C = 20.25 / 12.5 = 1.62
- Cell D = 20.25 / 7.5 = 2.7
Sum of values:
χ² = 1.62 + 2.7 + 1.62 + 2.7 = 8.64
Degrees of freedom:
df = (2-1) × (2-1) = 1
Critical value (from table):
At p = 0.05 with df = 1 for a one-tailed test, the critical value is 2.71.
Conclusion:
Since the observed value of 8.64 is greater than the critical value of 2.71, we can reject the null hypothesis. This supports the alternative hypothesis that boys and girls differ in their choice of stereotypical and non-stereotypical toys. If the observed value had been less than the critical value, we would have accepted the null hypothesis that any difference in toy choice between boys and girls was due to chance.
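The whole worked example can be checked with a short script. This is a sketch under the same assumptions as the example itself: the critical value of 2.71 is typed in by hand from the notes, and the variable names are illustrative only.

```python
# Observed frequencies from the toy-choice table.
# Rows: girls, boys. Columns: stereotypical, non-stereotypical.
observed = [[8, 12], [17, 3]]
CRITICAL_VALUE = 2.71  # df = 1, p = 0.05, one-tailed (taken from the notes)

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
overall = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / overall  # expected frequency
        contribution = (o - e) ** 2 / e              # (O - E)^2 / E
        chi_sq += contribution
        print(f"O={o}, E={e}, (O-E)^2/E={contribution:.2f}")

df = (len(observed) - 1) * (len(observed[0]) - 1)
significant = chi_sq >= CRITICAL_VALUE
print(f"chi-squared = {chi_sq:.2f}, df = {df}, significant = {significant}")
```

Working through the cells in a loop mirrors the manual procedure, so each printed line can be compared with the hand calculation.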
Critical values table for chi-squared
| Degrees of freedom (df) | 0.05 one-tailed (0.10 two-tailed) | 0.025 one-tailed (0.05 two-tailed) | 0.01 one-tailed (0.02 two-tailed) |
|---|---|---|---|
| 1 | 2.71 | 3.84 | 5.41 |
| 2 | 4.60 | 5.99 | 7.82 |
When using the critical values table, always check:
- Your degrees of freedom (df)
- Whether your hypothesis is one-tailed or two-tailed
- Your chosen significance level (typically p = 0.05)
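The three checks above amount to a table lookup. The sketch below copies only the critical values given in these notes (df of 1 and 2); the names `critical_value`, `ROWS`, `ONE_TAILED` and `TWO_TAILED` are assumptions made for illustration.

```python
# Critical values copied from the table in these notes (df = 1 and 2 only).
ROWS = {
    1: [2.71, 3.84, 5.41],
    2: [4.60, 5.99, 7.82],
}
# Column position for each significance level, by type of hypothesis.
ONE_TAILED = {0.05: 0, 0.025: 1, 0.01: 2}
TWO_TAILED = {0.10: 0, 0.05: 1, 0.02: 2}


def critical_value(df, p, tails):
    """Look up a critical value; raises KeyError if it is not in the table."""
    columns = ONE_TAILED if tails == 1 else TWO_TAILED
    return ROWS[df][columns[p]]


print(critical_value(1, 0.05, tails=1))  # 2.71
print(critical_value(1, 0.05, tails=2))  # 3.84
```

The same significance level gives a larger critical value for a two-tailed test than for a one-tailed test, which is why the type of hypothesis has to be checked before reading the table.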
Important considerations
Essential points for chi-squared tests:
- Do not convert the data into averages or percentages when using chi-squared tests; use the raw frequencies (actual counts) in each cell to obtain reliable results.
- The data must be organised into a contingency table showing observed frequencies.
- Remember that for significance, the observed value must be equal to or greater than the critical value (unlike some other statistical tests).
Key Points to Remember:
- The chi-squared test is used for nominal data with independent measures designs to test for differences or associations.
- The formula χ² = Σ(O - E)² / E compares observed frequencies with expected frequencies.
- Calculate expected frequency using: (row total × column total) / overall total.
- Degrees of freedom = (rows - 1) × (columns - 1).
- The result is significant if the observed value is equal to or greater than the critical value.