Collecting Data (Edexcel GCSE Maths): Revision Notes
Collecting data
Data collection is a fundamental skill in statistics that involves gathering information systematically and accurately. Whether you're conducting a survey or recording observations, the way you collect and organise your data will significantly impact the quality of your results.
Understanding types of data
Before you start collecting data, it's important to understand what type of data you're working with, as this affects how you organise and present it.
Data falls into two main categories:
- Qualitative data consists of words, descriptions, or categories (like favourite colours or types of music)
- Quantitative data consists of numbers and measurements (like heights, ages, or test scores)
Understanding your data type is crucial because it determines how you'll organise your classes, what statistical methods you can use, and how you'll present your results.
Quantitative data can be further classified as:
- Discrete data can only take specific, exact values (like the number of pets someone owns - you can't have 2.5 pets)
- Continuous data can take any value within a range (like height or weight, which can in principle be measured to any degree of precision, even though recorded values are rounded)
Organising data into classes
When you collect quantitative data, you'll often need to group it into classes or categories to make it easier to analyse and understand. This is particularly important when dealing with large datasets.
The way you create these classes depends on your data type:
For discrete data: Leave gaps between the limits of neighbouring classes, because the data can only take certain values. For example, if you're collecting ages in whole years, you might use classes like 0-19, 20-39, 40-59. No age falls between 19 and 20, so every value belongs to exactly one class and nothing overlaps.
For continuous data: Your classes should have no gaps because the data can take any value within a range. You might use inequalities to show this, such as 0 ≤ x < 20, 20 ≤ x < 40.
When creating classes, make sure they:
- Don't overlap with each other
- Cover all possible values that could occur in your data
- Include open-ended classes like "... or over" or "... or less" at the ends to capture extreme values
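The first two checks can be done by hand, but a minimal Python sketch may make the idea concrete. It assumes continuous classes written as low ≤ x < high, and the function name check_classes and the sample classes are illustrative only, not part of the specification. It would not suit discrete classes such as 0-19, 20-39, where gaps between the limits are intentional.

```python
def check_classes(classes):
    """Check classes of the form low <= x < high for overlaps and gaps.

    `classes` is a list of (low, high) pairs.
    Returns a list of problem descriptions (empty if none were found).
    """
    problems = []
    ordered = sorted(classes)
    # Compare each class with the one that follows it.
    for (low1, high1), (low2, high2) in zip(ordered, ordered[1:]):
        if low2 < high1:
            problems.append(f"overlap between {low1}-{high1} and {low2}-{high2}")
        elif low2 > high1:
            problems.append(f"gap between {high1} and {low2}")
    return problems


good = [(0, 20), (20, 40), (40, 60)]   # illustrative classes
bad = [(0, 20), (15, 40), (45, 60)]    # deliberately faulty classes

print(check_classes(good))
print(check_classes(bad))
```

With the faulty classes, a value of 17 would fit two classes and a value of 42 would fit none, which is exactly what the two rules are meant to prevent.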
Worked Example: Creating Class Boundaries
For discrete data (number of siblings):
- Class 1: 0 siblings
- Class 2: 1 sibling
- Class 3: 2 siblings
- Class 4: 3 or more siblings
For continuous data (height in cm):
- Class 1: 140 ≤ h < 150
- Class 2: 150 ≤ h < 160
- Class 3: 160 ≤ h < 170
- Class 4: 170 ≤ h < 180
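As a rough illustration of how values are sorted into the classes above, here is a short Python sketch. The function names and the sample heights are made up for this example. Note how a height of exactly 150 cm goes into the second class, not the first, because of the way the inequalities are written.

```python
def sibling_class(n):
    """Return the class label for a number of siblings (discrete data)."""
    return str(n) if n < 3 else "3 or more"


def height_class(h):
    """Return the class label for a height h in cm (continuous data).

    Returns None if h falls outside the four classes in the worked example.
    """
    for low in (140, 150, 160, 170):
        if low <= h < low + 10:
            return f"{low} <= h < {low + 10}"
    return None


# Made-up sample data, for illustration only.
heights = [149.9, 150.0, 163.2, 171.5]
for h in heights:
    print(h, "->", height_class(h))

print(5, "siblings ->", sibling_class(5))
```

A height such as 185 cm returns None here, which shows why an open-ended class like "180 or over" can be worth adding when extreme values are possible.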
Creating effective data collection sheets
A well-designed data collection sheet makes the process of gathering information much smoother and reduces the chance of errors. Your sheet should include:
- Clear column headings that specify exactly what data you're collecting
- A tally column for recording responses as you collect them
- A frequency column to show the total count for each category
- Appropriate class boundaries that match your data type
The tally system is particularly useful because it allows you to quickly record responses and easily count them later. Remember to group your tally marks in fives for easier counting.
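The tally and frequency columns can be mimicked in a few lines of Python. This is only a sketch: the responses are invented, and the "||||/" symbol is an assumed way of typing a group of five tally marks (four strokes and a crossing stroke).

```python
from collections import Counter


def tally_marks(count):
    """Return tally marks for `count`, grouped in fives."""
    fives, ones = divmod(count, 5)
    groups = ["||||/"] * fives      # each complete group of five
    if ones:
        groups.append("|" * ones)   # leftover single strokes
    return " ".join(groups)


# Invented survey responses, for illustration only.
responses = ["red", "blue", "red", "green", "red",
             "blue", "red", "red", "red", "red"]
frequencies = Counter(responses)

print(f"{'Colour':<8}{'Tally':<12}{'Frequency'}")
for colour, freq in frequencies.most_common():
    print(f"{colour:<8}{tally_marks(freq):<12}{freq}")
```

Grouping in fives means the frequency can be read off by counting in fives and adding the leftover strokes, which is the same reason it helps on paper.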
Designing questionnaires carefully
Creating a good questionnaire requires careful thought about how you word your questions. Poor questions can lead to biased results, confusing responses, or data that's difficult to analyse.
There are four key principles for writing effective survey questions:
Make questions clear and easy to understand
Your questions should be straightforward and unambiguous. Avoid using vague terms or confusing language that might be interpreted differently by different people. Instead of asking "How much do you spend on food?" (which could mean weekly, monthly, or yearly), be specific about the time period you're asking about.
Ensure questions are easy to answer
Design your response options carefully to avoid confusion. Make sure your categories don't overlap - if someone could legitimately tick multiple boxes, your data will be difficult to analyse. Also, ensure you provide appropriate ranges that cover all possible answers, including options for people who might not fit into your main categories.
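To see why overlapping boxes cause trouble, the following Python sketch tests which boxes a given answer could tick. The question (a whole number of hours, say) and both sets of response options are hypothetical, and matching_boxes is a name invented for this example.

```python
def matching_boxes(answer, options):
    """Return the labels of every option whose range (inclusive) contains `answer`."""
    return [label for label, (low, high) in options.items()
            if low <= answer <= high]


# Hypothetical response options for a whole-number answer.
overlapping = {"0-5": (0, 5), "5-10": (5, 10), "10-15": (10, 15)}
fixed = {"0-5": (0, 5), "6-10": (6, 10), "11-15": (11, 15),
         "16 or more": (16, float("inf"))}

print(matching_boxes(5, overlapping))   # fits two boxes
print(matching_boxes(5, fixed))         # fits exactly one box
print(matching_boxes(20, overlapping))  # fits no box at all
print(matching_boxes(20, fixed))        # caught by the open-ended option
```

A well-designed set of options should give exactly one matching box for every answer a respondent could reasonably give.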
Keep questions fair and unbiased
Avoid leading questions that push respondents towards a particular answer. Questions like "Do you agree that potatoes taste better than cabbage?" are problematic because they assume potatoes do taste better and make it more likely that people will agree. Instead, ask neutral questions that don't suggest a preferred response, such as "Which do you prefer: potatoes or cabbage?"
Design questions that are easy to analyse
While open-ended questions can provide rich information, they're much harder to analyse statistically. Where possible, provide specific response options that people can choose from. This makes it much easier to count responses and identify patterns in your data.
Key Points to Remember:
- Data types matter: Discrete data needs gaps between classes, while continuous data should have no gaps
- Plan your classes carefully: Make sure they don't overlap and cover all possible values
- Good questions are clear: Avoid vague language and specify exact time periods or criteria
- Avoid bias: Don't lead respondents towards particular answers with the way you phrase questions
- Think about analysis: Design questions that will give you data you can actually work with and interpret meaningfully