Scientific Processes (OCR A-Level Psychology): Revision Notes
7.2.14 Reliability & Validity
Reliability is a measure of consistency. A study is said to be reliable if it can be repeated and produces the same findings each time.
Ways of assessing reliability
| Test-retest | Inter-observer reliability |
|---|---|
| A method of assessing the reliability of a questionnaire or psychological test by testing the same person twice, on separate occasions. If the test is reliable, it should produce identical or similar results each time. The two sets of results are then correlated – if the correlation coefficient is +0.8 or above, the test is said to be reliable. There must be enough time between the original test and the retest that the participant does not recall their first answers, but not so much that their opinions would have changed. | The extent to which two or more observers agree on a particular set of results. The researchers may run a pilot study to check that observers are applying the behavioural categories in the same way. This involves two or more observers/researchers analysing or rating the same behaviour independently. They then correlate their ratings to see whether they agree. If there is a correlation of +0.8 or more, the results are seen as reliable. |
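As a rough illustration of the correlation check described in the table, the sketch below computes a correlation coefficient for two sets of scores and compares it with the +0.8 benchmark. It assumes Pearson's r is the coefficient used (the notes do not name one), and the function names and scores are hypothetical, made up for the example. The same calculation would apply to two observers' ratings.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists.

    Assumption: Pearson's r is used; the notes only say 'correlation'.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


def is_reliable(first, second, threshold=0.8):
    """True if the two sets of scores correlate at or above the benchmark."""
    return pearson_r(first, second) >= threshold


# Hypothetical questionnaire scores for six participants, tested twice
test = [12, 18, 25, 9, 30, 21]
retest = [13, 17, 27, 10, 28, 22]

r = pearson_r(test, retest)
print(f"r = {r:+.2f}, reliable: {is_reliable(test, retest)}")
```

Here the retest scores closely track the original scores, so the coefficient is well above +0.8 and the hypothetical test would count as reliable.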
Improving reliability
| Questionnaires – replace some open questions, which leave more room for ambiguity, with closed fixed-choice questions so that different researchers interpret responses consistently. | Interviews – use the same interviewer each time with a similar structure. If this is not possible, train all interviewers to avoid ambiguous or leading questions. Carrying out structured interviews also reduces this risk. |
|---|---|
| Experiments – keep conditions the same for every participant. Lab experiments are said to be the most reliable because the researcher can strictly control all aspects of the study. | Observations – ensure behavioural categories have been properly operationalised and do not overlap. |
Validity
Validity is the extent to which an observed effect is genuine. It covers whether the researcher has measured what they intended to measure (internal validity) and the extent to which findings can be generalised beyond the research setting in which they were found (external validity).
Internal Validity refers to whether the effects observed are due to the IV and not to other factors. One major threat to internal validity is participants responding to demand characteristics.
External Validity relates to generalising to other settings, cultures, and periods.
- Ecological validity concerns generalising findings of the study to 'everyday life'.
- Temporal validity concerns the extent to which findings can be generalised to other historical periods.
Assessment of Validity
| Face Validity | Concurrent Validity |
|---|---|
| Whether a scale or measure appears to measure what it is supposed to measure. This can be determined by giving the test to an expert in that field to check it. | Demonstrated when the results obtained on a test or scale are very close to, or match, those obtained on a recognised and well-established test. Close agreement between the two sets of data would indicate that the new test has high concurrent validity. |
Improving Validity
| Experimental research – use standardised procedures (to minimise the effect of participant reactivity and investigator effects). Use a control group (to assess whether changes in the DV were caused by the IV). Use single-blind and double-blind procedures (to reduce the effect of demand characteristics). | Questionnaires – incorporate a lie scale within the questions (to assess consistency and to control for the effects of social desirability bias), and assure participants that their data will remain anonymous. |
|---|---|
| Observations – use specific, operationalised behavioural categories with no overlap, and covert observations (to allow natural behaviour from participants). | Qualitative methods – draw on depth and detail from other forms of data to support the findings (triangulation). |