Practical Investigation (Edexcel A-Level Psychology): Revision Notes
Guide to the Practical Investigation
Learning outcomes
By the end of this practical investigation, you should be able to:
- Collect observational data effectively
- Analyse both quantitative and qualitative data appropriately
- Apply the chi-squared test to research findings
- Present findings in suitable formats
- Link practical observations to learning theories
- Critically evaluate observational research
Overview: Observing gender differences in behaviour on public transport
Observational research enables psychologists to gather both quantitative and qualitative data in real-world settings. This practical investigation focuses on prosocial and polite behaviours displayed by men and women on public transport (such as buses or trains).
Prosocial behaviour refers to any action intended to benefit another person. Examples on public transport include thanking the driver, offering a seat to someone in need, helping passengers with pushchairs, engaging politely with other passengers, or creating space for others on crowded vehicles.
Bandura found that children who observe aggressive role models may later imitate that aggression. By conducting this observational study, you can explore whether social learning theory also explains the modelling of prosocial behaviours in society.
Students typically conduct two observations: one quantitative and one qualitative, or a combined observation gathering both data types simultaneously. Observations in psychology can serve as a technique within an experiment or as a standalone research method. In this instance, observation functions as the primary research method.
Formulating hypotheses
The aim is to investigate gender differences in behaviours on public transport, specifically prosocial behaviours and politeness. When conducting quantitative observational research, prosocial behaviours must be clearly measurable. Your hypothesis needs to specify whether it is directional (one-tailed) or non-directional (two-tailed).
Researchers typically use a non-directional hypothesis when uncertain about the effect's direction. Conversely, if previous studies provide insight into the likely outcome, a directional hypothesis may be appropriate. Regardless of which type you choose, the hypothesis must be clearly testable.
For this investigation, based on mixed background literature and research on gender and prosocial behaviour, a non-directional hypothesis is proposed: there will be a difference in the number of males and females who thank the bus driver before alighting.
Background and links to learning theories
Social learning theory offers an explanation for gender differences in society. It proposes that we acquire gender-appropriate behaviours from people around us, including parents, teachers, siblings, peers and the media. These role models demonstrate how to behave and also reinforce gender-appropriate behaviours through rewards and punishments. Consequently, the theory can explain stereotypes in society, such as women being kind and helpful, and perhaps more polite and friendly towards people generally. These behaviours may be modelled and rewarded in females more than males.
Research Finding: Eagly (2009) found that women and men both engage in prosocial behaviours equally overall. However, women tend to engage in more communal and relational prosocial behaviours, whereas men tend to display more agentic prosocial behaviours. This means men engage in prosocial acts to gain status or demonstrate strength to others.
Similarly, research by Leslie, Snyder and Glomb (2013) examining workplace donations to charity indicated that women donated more than men. These findings suggest that women may demonstrate more helpful and polite behaviours on public transport than men. Examples include offering a seat, assisting with a pram, polite interactions with drivers and fellow passengers, or making space for others to pass through on crowded buses.
Given that background literature and research on gender and prosocial behaviour presents mixed findings, a non-directional hypothesis is appropriate: there will be a difference in the number of males and females who thank the bus driver before alighting.
Planning an observation
When planning your quantitative observation, consider the statistical test you will use to analyse your data. In this case, participants are distinguished by category – male and female. They can only belong to one of these categories, making the data unrelated.
Before conducting your observation, you will have several design decisions to make. You will likely undertake a naturalistic observation, observing participants in a natural setting. Decide when and where you will conduct your observations, including factors such as:
- Which bus or train route
- The day of the week
- The time of day
- The duration of your journey
All these factors influence how many passengers will be present and the type of passenger travelling.
This example practical observation will be a naturalistic observation of people on a bus, using opportunity sampling as the sampling technique to recruit participants.
Consider carefully the type of behaviour you will observe. Avoid overcomplicating matters: choose one observable behaviour that you can easily spot and count. For instance, you may decide to measure polite behaviour by recording whether or not bus passengers thank the driver before alighting. Other behaviours could include assisting other passengers, offering a seat, or making space for others on a crowded bus. Once you have decided which prosocial behaviours are appropriate to investigate on a bus journey, draw a chart to tally each time a passenger displays such behaviour.
Recording observations
Create a simple recording sheet to document your observations. An example might include:
| Behaviour | Male | Female |
|---|---|---|
| Thanking the bus driver | | |
| Giving up a seat on the bus | | |
| Helping a passenger with a pushchair | | |
As you count instances of polite or prosocial behaviour occurring, you can tally the observations. This quantitative data can be tested for significance using an inferential statistical test. Your observation can also gather qualitative data by recording what you observed in greater detail. For instance, note what passengers actually said, or annotate your observations in your notepad about whether the bus driver was male or female, the age of participants, or whether there was eye contact.
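The tallying step can also be sketched in code. This is only an illustration: the list of observations is made up, and the behaviour labels are taken from the example recording sheet.

```python
from collections import Counter

# Hypothetical field notes (invented for illustration): one
# (gender, behaviour) pair for each prosocial act that was observed.
observations = [
    ("female", "Thanking the bus driver"),
    ("male", "Thanking the bus driver"),
    ("female", "Giving up a seat on the bus"),
    ("female", "Thanking the bus driver"),
    ("male", "Helping a passenger with a pushchair"),
]

# Counter builds the tally: how often each pair occurred.
tally = Counter(observations)

behaviours = [
    "Thanking the bus driver",
    "Giving up a seat on the bus",
    "Helping a passenger with a pushchair",
]
for behaviour in behaviours:
    males = tally[("male", behaviour)]
    females = tally[("female", behaviour)]
    print(f"{behaviour}: male = {males}, female = {females}")
```

A pair that was never observed simply counts as zero, which mirrors an empty cell on the paper sheet.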
Pilot study
Before conducting your observation, consider giving it a 'dry run' through a pilot study to address any potential problems and test your observational criteria. For instance, is the best position to observe behaviours at the front of the bus? If you are observing interactions with the driver, sitting closer to the front would be advantageous. Determine how long the journey needs to be to provide sufficient data. At busy times such as rush hour, can you observe everyone?
A pilot study is not a test of your hypothesis. It checks that practical arrangements such as your position, timings and recording sheet work correctly, and that your behavioural categories can be applied consistently. Where it is possible to speak to participants afterwards, a pilot study also provides a useful opportunity to ask whether they felt uncomfortable, suspected anything unusual, or felt their behaviour was influenced in any way.
Controls
You might consider recruiting someone to assist you, for example with one person observing male passengers and another observing female passengers. However, you would need to consider factors such as inter-rater reliability (the degree of agreement between two or more observers) to ensure consistency in your observations.
You are both likely to be covert (participants are unaware they are being observed), but it would be worth checking if passengers pay attention to your note-taking and become suspicious, which could pose a confounding variable for your observations.
Ethical considerations
Observations in public places are considered ethical because people can reasonably expect to be observed by others in their day-to-day interactions. This includes public transport such as buses and trains. As you are not collecting names or any other personal data, only the gender and the observed behaviour, you will not be contravening ethical guidelines.
Analysing quantitative data
This example practical investigation collected data on an hour's bus journey between 3.00 and 4.00 pm on a Thursday afternoon. In total, 51 passengers were observed: 24 females and 27 males. The observer tallied the number of males and females according to whether or not they thanked the bus driver.
You could present your data in preparation for a chi-squared test as follows:
Table: Presentation of data
| | Male | Female | Total |
|---|---|---|---|
| Thanked the bus driver | 11 (cell A) | 18 (cell B) | 29 |
| Did not thank the bus driver | 16 (cell C) | 6 (cell D) | 22 |
| Total | 27 | 24 | 51 |
The table shows that 11 males and 18 females thanked the bus driver when alighting, whereas 16 males and 6 females did not thank the driver on alighting. This is known as a two-by-two contingency table, which is suitable for nominal data.
From the table, you will see that similar numbers of male and female participants were observed on the bus journey (27 and 24). There is also a difference in how many thanked the driver: 11 of the 27 males compared with 18 of the 24 females.
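Because the two groups are not exactly the same size, it can help to convert the counts to percentages before comparing them. The sketch below uses the counts from the contingency table; the dictionary layout and function name are just one convenient choice.

```python
# Observed counts from the contingency table in the worked example.
observed = {
    "male": {"thanked": 11, "did_not_thank": 16},
    "female": {"thanked": 18, "did_not_thank": 6},
}


def percent_thanked(counts):
    """Percentage of one gender group who thanked the driver."""
    total = counts["thanked"] + counts["did_not_thank"]
    return 100 * counts["thanked"] / total


percentages = {
    gender: round(percent_thanked(counts), 1)
    for gender, counts in observed.items()
}
print(percentages)  # {'male': 40.7, 'female': 75.0}
```

Percentages describe the sample only. Whether the gap is larger than chance would produce is a question for the inferential test.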
What do the results suggest in relation to the initial hypothesis? Is the difference large enough to be significant? It is difficult to tell by inspection whether the results are significant or due to chance, so an inferential test can help determine whether we can reject our null hypothesis and accept our alternative hypothesis.
Carrying out a chi-squared test
The chi-squared test is the most appropriate statistical test in this circumstance as the study is predicting a difference, it is using unrelated data, and we have nominal level data.
Worked Example: Conducting a Chi-Squared Test
The following procedures apply to conducting a chi-squared test:
Step 1: Calculate expected values
You first need to calculate an expected value to set against each observed value. The observed values are the ones in our contingency table: 11, 18, 16 and 6. An expected value is calculated for each cell by working out how the data would be distributed if there were no difference in the pattern, that is, no difference between males and females in thanking the bus driver. This is done using the row total and column total for that cell.
Expected value = row total × column total / overall total
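| Cell | E = row total × column total / overall total |
|---|---|
| A | 29 × 27 / 51 = 15.35 |
| B | 29 × 24 / 51 = 13.65 |
| C | 22 × 27 / 51 = 11.65 |
| D | 22 × 24 / 51 = 10.35 |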
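The expected-value formula can be checked with a short script. This is a minimal sketch using the observed counts from the contingency table; the variable names are illustrative.

```python
# Observed counts: rows are thanked / did not thank,
# columns are male / female (cells A, B, C, D in reading order).
observed = [[11, 18], [16, 6]]

row_totals = [sum(row) for row in observed]        # [29, 22]
col_totals = [sum(col) for col in zip(*observed)]  # [27, 24]
grand_total = sum(row_totals)                      # 51

# Expected value = row total x column total / overall total
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

cells = [e for row in expected for e in row]
for label, value in zip("ABCD", cells):
    print(f"Cell {label}: E = {value:.2f}")
```

Run as written, this prints 15.35, 13.65, 11.65 and 10.35 for cells A to D.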
Step 2: Calculate chi-squared value
You will now need to take the expected value (E) from the observed value (O) for each of the cells and square the result. Then you divide that result by the expected value (E). Finally, add the four results to find the overall chi-squared result.
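| Cell | Calculation: (O − E)² / E |
|---|---|
| A | (11 − 15.35)² / 15.35 = 1.23 |
| B | (18 − 13.65)² / 13.65 = 1.39 |
| C | (16 − 11.65)² / 11.65 = 1.62 |
| D | (6 − 10.35)² / 10.35 = 1.83 |
| Total | 6.07 |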
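The whole calculation can be sketched as one function. The optional `decimals` argument is an addition for illustration: it rounds each expected value first, as you would in a hand calculation, which reproduces the 6.07 quoted in the worked example. Without rounding the statistic comes out at about 6.08. No continuity correction is applied here.

```python
def chi_squared(observed, decimals=None):
    """Chi-squared statistic for a contingency table given as a list of rows.

    If decimals is set, each expected value is rounded to that many
    decimal places first, mimicking a calculation done by hand.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            if decimals is not None:
                e = round(e, decimals)
            statistic += (o - e) ** 2 / e  # (O - E) squared, divided by E
    return statistic


table = [[11, 18], [16, 6]]
print(f"Hand-style (E rounded to 2 d.p.): {chi_squared(table, decimals=2):.2f}")
print(f"Unrounded expected values:        {chi_squared(table):.2f}")
```

The small gap between the two figures is a rounding effect and does not change the decision about significance in this example.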
Step 3: Find the critical value
Find the critical value for chi-squared by first calculating the degrees of freedom (df). This is done by multiplying (rows-1) × (columns-1) of your table. In our two-by-two contingency table, this means our df = 1.
Compare the calculated chi-squared value against the critical values table. The hypothesis here is non-directional, so use the two-tailed significance levels: with df = 1, the critical value at a significance level of p ≤ 0.05 is 3.84. As the calculated value of 6.07 is greater than the critical value in the table (3.84), the result is significant and the null hypothesis can be rejected.
This means that there is a 5 per cent or lower probability that the difference in prosocial behaviour displayed by males and females is due to chance. The direction of this difference can be established by examining the cells/totals. In this case, females travelling on the bus were more likely than males to thank the bus driver.
Critical values table for chi-squared test:
| Level of significance for a one-tailed test | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|
| Level of significance for a two-tailed test | 0.20 | 0.10 | 0.05 | 0.025 | 0.01 |
| df = 1 | 1.64 | 2.71 | 3.84 | 5.02 | 6.64 |
| df = 2 | 3.22 | 4.61 | 5.99 | 7.38 | 9.21 |
| df = 3 | 4.64 | 6.25 | 7.82 | 9.35 | 11.35 |
| df = 4 | 5.99 | 7.78 | 9.49 | 11.14 | 13.28 |
| df = 5 | 7.29 | 9.24 | 11.07 | 12.83 | 15.09 |
The observed/calculated value must equal or exceed the critical value to be significant at the level shown.
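The lookup and comparison can be sketched as follows. The critical values are copied from the table above; the function names are illustrative, and the default of a two-tailed test is an assumption based on the non-directional hypothesis stated earlier.

```python
# Column headings and critical values copied from the table above.
ONE_TAILED = [0.10, 0.05, 0.025, 0.01, 0.005]
TWO_TAILED = [0.20, 0.10, 0.05, 0.025, 0.01]
CRITICAL_VALUES = {
    1: [1.64, 2.71, 3.84, 5.02, 6.64],
    2: [3.22, 4.61, 5.99, 7.38, 9.21],
    3: [4.64, 6.25, 7.82, 9.35, 11.35],
    4: [5.99, 7.78, 9.49, 11.14, 13.28],
    5: [7.29, 9.24, 11.07, 12.83, 15.09],
}


def degrees_of_freedom(rows, columns):
    """df = (rows - 1) x (columns - 1)."""
    return (rows - 1) * (columns - 1)


def critical_value(df, level, tails=2):
    """Look up the critical value for a given df, level and number of tails."""
    levels = TWO_TAILED if tails == 2 else ONE_TAILED
    return CRITICAL_VALUES[df][levels.index(level)]


def is_significant(calculated, df, level, tails=2):
    """The calculated value must equal or exceed the critical value."""
    return calculated >= critical_value(df, level, tails)


df = degrees_of_freedom(2, 2)
print(df, critical_value(df, 0.05), is_significant(6.07, df, 0.05))
# 1 3.84 True
```

Passing `tails=1` reads the one-tailed headings instead, which gives 2.71 at the 0.05 level for df = 1.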
Type 1 and Type 2 errors
Inferential statistics carry a risk of Type 1 and Type 2 errors. In psychology, it is common practice to use a significance level of 5% (p ≤ 0.05). However, sometimes we may accept or reject the null hypothesis when we should not have.
Understanding Statistical Errors:
A Type 1 error involves rejecting a null hypothesis that is in fact true. Typically, this error is made when the level of significance is set too leniently, at a higher level such as 10% (p ≤ 0.10). This runs the risk of accepting our results as significant when they are not.
A Type 2 error, on the other hand, involves accepting a null hypothesis that is not true. This is more likely when we set our significance level too stringently, at a lower level such as 1% (p ≤ 0.01).
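The effect of the chosen significance level can be illustrated by re-testing the worked example's calculated value of 6.07 at three levels. The critical values are the two-tailed figures for df = 1 from the table; the sketch shows only how the decision changes, not which decision is actually correct.

```python
# Two-tailed critical values for df = 1, taken from the critical values table.
CRITICAL_DF1 = {0.10: 2.71, 0.05: 3.84, 0.01: 6.64}

calculated = 6.07  # chi-squared value from the worked example

decisions = {}
for level, critical in CRITICAL_DF1.items():
    if calculated >= critical:
        decisions[level] = "reject null"
    else:
        decisions[level] = "retain null"
    print(f"p <= {level}: critical value {critical}, {decisions[level]}")
```

The same data lead to rejecting the null hypothesis at the 10% and 5% levels but retaining it at the 1% level. A lenient level makes a Type 1 error more likely, while a stringent level makes a Type 2 error more likely.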
Analysing qualitative data
Unlike quantitative data, qualitative data can be hard to summarise in a chart or graph. Typically, the analysis of qualitative data involves recognising repeated themes. The technical term for this type of approach is thematic analysis. Braun and Clarke (2006) outline a six-phase approach to thematic analysis; the first five phases are worked through below and the sixth is producing the report.
Worked Example: Thematic Analysis in Five Phases
Phase one – Familiarisation with the data: While on the bus journey, you may have noticed in your observations that when males and females thank the bus driver they do it in a different manner. Male passengers were observed to say 'Thanks mate' or 'Thank you driver'. They were also more likely to engage in 'small talk' with the driver before alighting, for example 'What time does your shift finish?' Females, on the other hand, were more likely to say thank you followed by a simple departure greeting such as 'goodbye' and were more likely to make eye contact with the driver when alighting. Therefore, you have begun to notice things or themes that might be relevant to the research question.
Phase two – Generating initial codes: A label or code is given to any specific categories identified such as 'Thank you with familiarity', 'Thank you with no familiarity', 'departure greeting', 'Smalltalk', etc.
Phase three – Searching for themes: In this phase a researcher looks for broader themes that give meaning to the initial labels/codes. In our example, 'implied familiarity' could be one such theme, covering passengers who make 'small talk' with the driver or call them 'mate'.
Phase four – Reviewing themes: Here, the researcher tries out these themes. This could mean collecting another set of data to see if future observations fit within them. If they do, this could be a topic area for the researcher to investigate further.
Phase five – Defining and naming themes: The researcher has clearly defined their themes, which allows them to select information and analyse it against the themes.
Finally, a report is produced in relation to the categories identified, which tells a story about the emerging themes identified.
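Phases two and three can be supported by a simple count of how often each code appears. The sketch below is an aid to interpretation rather than a substitute for it. The coded notes are invented for illustration, while the code labels and the 'implied familiarity' theme come from the example above.

```python
from collections import Counter

# Hypothetical coded field notes (invented for illustration):
# each entry is (gender of passenger, code given in phase two).
coded_notes = [
    ("male", "Thank you with familiarity"),       # e.g. 'Thanks mate'
    ("male", "Smalltalk"),
    ("male", "Thank you with no familiarity"),    # e.g. 'Thank you driver'
    ("female", "Thank you with no familiarity"),
    ("female", "departure greeting"),
    ("female", "Thank you with no familiarity"),
]

# Phase three: group related codes under a candidate theme.
CODE_TO_THEME = {
    "Thank you with familiarity": "implied familiarity",
    "Smalltalk": "implied familiarity",
}

theme_counts = Counter(
    (gender, CODE_TO_THEME.get(code, "not yet themed"))
    for gender, code in coded_notes
)

for (gender, theme), count in sorted(theme_counts.items()):
    print(f"{gender}: {theme} = {count}")
```

Codes that have not been assigned to a theme are grouped as 'not yet themed', which flags material to revisit when reviewing themes in phase four.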
Unlike quantitative data, qualitative data presents observations, thoughts, etc. that are not always easily counted. It does, however, provide much richer detail on the complexities of human interaction and behaviour. Nevertheless, it is sometimes difficult to select patterns and draw firm conclusions. Qualitative analysis is also more likely to be subjective, perhaps reflecting the personal viewpoints and background of the researcher.
Conclusions
The outcome of the inferential test informs the overall conclusion of the observation. In this instance, the chi-squared test suggests the results of the observation are significant, so the null hypothesis is rejected and the alternative hypothesis accepted: in this sample, females were more likely than males to thank the bus driver.
In drawing conclusions in relation to the qualitative data, you will need to look at the themes in the observation. You could also illustrate how themes are demonstrated by providing specific examples to support your interpretation. For example, if you observe more 'small talk' between the driver and male passengers, you could provide specific quotes or observations to back this up.
Evaluating the practical investigation
In evaluating your overall observational study you will need to consider its validity, reliability, generalisability and credibility.
Validity
In considering the validity of your observations, you need to consider the setting in which they took place. A naturalistic observation records real-life behaviour in its usual setting, which suggests that a certain degree of validity exists. You will also need to consider how objective or subjective your observational categories are. Observing whether a passenger says 'thank you' is a relatively objective measure as it involves little judgement. Qualitative data, on the other hand, can reflect the personal view or interpretation of the observer. What is seen as 'small talk', for instance, involves more interpretation on the part of the observer.
Respondent Validation:
One way of improving the validity of qualitative data is respondent validation. There are many different forms of respondent validation but one way of checking your interpretation of your results is to gain feedback from participants in your sample. This is difficult in naturalistic observation but a researcher could interview regular users of public transport to ascertain their views on the interpretation of results. This allows interviewees to cast a critical eye over the findings and comment on them in relation to their own opinions, feelings, and experiences. If participants are generally in agreement then this affirms the validity of your interpretation.
Triangulation is another method for ascertaining reliability and validity of both qualitative and quantitative investigations. It refers to the use of more than one approach in the investigation of a research question. The use of a single research method may suffer from the limitations associated with that particular method. Triangulation therefore offers the opportunity to check results and in doing so provides increased confidence in the findings if similar results are gained via other methods.
Types of Triangulation:
There are many ways of triangulating research methods and data:
- Investigator triangulation uses more than one researcher to gather and interpret data. Therefore, in checking the validity and reliability of this observational study, a researcher could compare findings with those of a second researcher, or with observations made on another bus, to see if they achieve similar results.
- Methodological triangulation uses more than one method for gathering data. A researcher may use a questionnaire with bus passengers to ascertain their views on politeness on public transport to see if they correspond with the observational findings.
Reliability
How easy is it for you or another observer to repeat the study and would it lead to similar findings if the study was conducted on another bus? Or in another town? Or at another time of day? This can only be determined by repeating the observation numerous times and comparing the data.
For qualitative data, reliability is harder to assess due to its subjective nature. However, triangulation (mentioned in the paragraph above) could be one possible method for assessing reliability. Other methods available to researchers are inter-rater reliability and the test-retest method.
If there are two or more observers, inter-rater reliability gives a measure as to how much agreement there is between researchers when conducting an observation. If researchers are in agreement about the behavioural categories, and note similar observations in similar circumstances, this would suggest there is reliability. However, a lack of consensus may mean that the behavioural categories used may need to be revised to ensure reliability.
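One simple way to put a number on inter-rater agreement is the percentage of passengers that both observers recorded in the same way. This measure is an assumption chosen for illustration, not one prescribed by these notes, and the two sets of recordings below are invented.

```python
def percentage_agreement(observer_a, observer_b):
    """Percentage of cases on which two observers recorded the same category."""
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must record the same passengers")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100 * matches / len(observer_a)


# Hypothetical recordings of the same ten passengers by two observers.
observer_a = [
    "thanked", "thanked", "did not thank", "thanked", "did not thank",
    "thanked", "thanked", "did not thank", "thanked", "thanked",
]
observer_b = [
    "thanked", "thanked", "did not thank", "thanked", "thanked",
    "thanked", "thanked", "did not thank", "thanked", "thanked",
]

agreement = percentage_agreement(observer_a, observer_b)
print(f"Agreement: {agreement:.0f}%")  # Agreement: 90%
```

Low agreement would suggest that the behavioural categories need to be defined more tightly before the main observation.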
Alternatively, in the test-retest method, a researcher could conduct the observation and then conduct it again a day, week or month later on the same bus route to see if similar results are achieved.
Generalisability
In assessing if you can generalise from your observations you will need to consider your sample. The time of day at which you conducted your observations may affect the sample achieved. The ages of the participants also need to be considered. An equal distribution of ages within the sample will minimise bias.
For instance, a larger proportion of retired people on the bus may result in more polite interactions being observed as such participants may be less time pressured and hold more traditional values of respect and politeness.
Credibility
In summary, how credible is your study? The credibility of research depends on how well it meets the scientific principles mentioned earlier in the topic, such as replicability and measurable phenomena. At the very least, you should weigh up the overall strengths and weaknesses of your study.
On one hand, a naturalistic observation is a valid form of measurement. However, your observation is likely to be carried out at a certain time of day and therefore it will influence the sample achieved, which can pose potential problems when generalised to other populations and situations. A number of repeated observations may be required to assess its overall effectiveness.
Key Points to Remember:
- Observational research can collect both quantitative and qualitative data to investigate gender differences in prosocial behaviour.
- Quantitative data is analysed using the chi-squared test when data is nominal and participants fall into distinct categories.
- The chi-squared test involves calculating expected values, computing the test statistic, and comparing it to critical values based on degrees of freedom.
- Qualitative data is analysed using thematic analysis, which involves familiarisation, generating codes, searching for themes, reviewing themes, and defining themes.
- Evaluation of observational studies must address validity, reliability, generalisability and credibility, considering factors such as the naturalistic setting, inter-rater reliability, triangulation, and sample characteristics.