Primary and secondary data (AQA GCSE Statistics): Revision Notes
Primary and secondary data
When conducting research or statistical investigations, it's essential to understand how data can be categorised. All data falls into two main categories: primary data and secondary data. Understanding the difference between these types will help you choose the most appropriate data collection method for your investigation.
The distinction between primary and secondary data is fundamental to all statistical investigations. Your choice of data type will significantly impact the reliability, cost, and time requirements of your research project.
What is primary data?
Primary data refers to information that you gather yourself through direct collection methods. This means you are the original source of the data, and you have complete control over how it is collected.
Methods of collecting primary data
There are several ways you can collect primary data:
- Experiments: Setting up controlled conditions to test a hypothesis
- Surveys: Asking people questions about their opinions or experiences
- Questionnaires: Using structured forms to gather specific information
- Observations: Recording what you see or measure directly
- Interviews: Speaking directly with people to gather information
The key characteristic of primary data is that you design the collection method specifically for your research question, which means you can tailor it exactly to what you need to find out.
What is secondary data?
Secondary data comes from sources that have already been published or compiled by other people or organisations. This information was usually collected for a different purpose, but it can still be useful for your investigation.
Sources of secondary data
Common sources of secondary data include:
- Newspapers and magazines: Articles containing statistics or research findings
- Books and textbooks: Published research and data compilations
- Internet websites: Online databases and research publications
- Government publications: Official statistics from organisations like the UK Office for National Statistics
- Academic journals: Research papers containing data from studies
When using secondary data, you're essentially using information that someone else has already gathered and made available to the public.
Comparing primary and secondary data

Understanding the advantages and disadvantages of each data type helps you make informed decisions about which to use in different situations.
Advantages of primary data
Primary data collection offers several benefits. You know exactly how the information was gathered because you designed and carried out the collection process yourself, so you can judge how accurate the data is likely to be. You can also design your questionnaires or surveys specifically to answer your research questions, so the data is closely matched to your needs.
Control and Accuracy
With primary data, you have complete control over the collection process, which means you can ensure the data meets your exact requirements and maintains high quality standards.
Disadvantages of primary data
However, collecting primary data can be quite challenging. It often requires a significant amount of time to plan, carry out, and analyse your data collection. It can also be expensive, especially if you need to reach many people or use specialised equipment. You might need to print questionnaires, travel to collect data, or pay participants.
Time and Cost Considerations
Primary data collection is typically more time-consuming and expensive than using secondary data. Always consider whether your research timeline and budget can accommodate these requirements before choosing this approach.
Advantages of secondary data
Secondary data is generally much easier and cheaper to obtain than primary data. You can often find reliable sources quickly, especially from established organisations like government departments. The UK Office for National Statistics, for example, provides trustworthy data that has been professionally collected and verified.
Disadvantages of secondary data
The main challenge with secondary data is that it might not perfectly match your research needs. The information might not be reliable if the source is questionable, or it might contain errors that you cannot verify. Since you didn't collect it yourself, you don't know exactly how it was gathered, which could affect its suitability. The data might also be outdated or not specific enough to answer your particular research question.
Reliability and Suitability Concerns
Always critically evaluate secondary data sources. Consider the reputation of the organisation, the age of the data, and whether the collection methods align with your research needs.
Worked examples: identifying data types
Let's look at some real-world scenarios to help you identify whether data is primary or secondary:
Worked Example 1: Public opinion survey
Karen wants to investigate public opinion about a new leisure centre. She decides to gather information by asking people living in the local area what they think about the proposed development.
Answer: This is primary data because Karen is collecting the information herself directly from the source.
Worked Example 2: Exam results analysis
A teacher wants to know how well her students performed in their GCSE Statistics exam. She obtains the data by looking at the official exam results published on the exam board's website.
Answer: This is secondary data because the information has been collected and published by an organisation (the exam board), not by the teacher herself.
Worked Example 3: Furniture design research
Larry is designing furniture and needs data about the heights of people when they are sitting at a desk. He finds published research on the internet that provides the measurements he needs to choose desk dimensions.
Answer: This is secondary data because Larry is using information that has already been researched and published by someone else on the internet.
Worked Example 4: Plant growth experiment
Gav wants to decide which type of tomato plant to grow in his garden. He conducts an experiment where he grows different varieties and measures the weight of tomatoes produced by each type.
Answer: This is primary data because Gav is conducting his own experiment and collecting the measurements himself.
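The reasoning used in all four answers comes down to one question: did the investigator gather the data themselves? The short Python sketch below applies that test to the four worked examples. It is only an illustration. The Scenario class, the data_type function and the collector labels ("Exam board", "Published researchers") are names made up for this sketch, not part of the AQA specification.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    investigator: str   # the person who wants to use the data
    collected_by: str   # who originally gathered the data (illustrative label)
    description: str


def data_type(scenario: Scenario) -> str:
    """Return 'primary' if the investigator gathered the data, else 'secondary'."""
    if scenario.collected_by == scenario.investigator:
        return "primary"
    return "secondary"


# The four worked examples, written as scenarios (labels are assumptions).
scenarios = [
    Scenario("Karen", "Karen", "asks local residents about the leisure centre"),
    Scenario("Teacher", "Exam board", "looks up published exam results"),
    Scenario("Larry", "Published researchers", "uses measurements found online"),
    Scenario("Gav", "Gav", "weighs tomatoes from his own experiment"),
]

for s in scenarios:
    print(f"{s.investigator}: {data_type(s)} data ({s.description})")
```

In an exam you would give the same reasoning in words: name who collected the data, then state whether that makes it primary or secondary.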
Key Points to Remember:
- Primary data is information you collect yourself through experiments, surveys, questionnaires, observations, or interviews
- Secondary data comes from published sources like books, newspapers, websites, or government statistics
- Primary data takes more time and money but is tailored specifically to your research question
- Secondary data is quicker and cheaper to obtain but might not perfectly match your needs
- Always consider the reliability and suitability of your data source, regardless of whether it's primary or secondary