Describing data (Edexcel GCSE Statistics): Revision Notes
Describing data
When working with statistics, it's crucial to understand the different types of data you might encounter. This knowledge forms the foundation for choosing appropriate methods to analyse and present your findings.
Understanding raw data
Raw data refers to information that has been collected directly during a statistical investigation, before any sorting or processing has taken place. Think of it as the "unpolished" version of your data - exactly as it was first recorded. This could include measurements like the height of students in your class, the colours of cars in a car park, or the results from a survey about favourite foods.
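The difference between raw and processed data can be sketched in a few lines of Python. This is only an illustration: the colours and heights below are invented values, not real survey results.

```python
from collections import Counter

# Hypothetical raw data: car colours in the order they were noted down.
raw_colours = ["red", "blue", "red", "silver", "blue", "red"]

# Hypothetical raw data: student heights in cm, exactly as first recorded.
raw_heights_cm = [162.5, 158.0, 171.2, 165.4]

# Processing step 1: tally how often each colour appears.
colour_tally = Counter(raw_colours)

# Processing step 2: sort the heights into ascending order.
sorted_heights_cm = sorted(raw_heights_cm)

print(colour_tally)
print(sorted_heights_cm)
```

The two `raw_` lists are left unchanged; the tally and the sorted list are new, processed versions built from them.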
Data is always described in terms of the variables being collected. A variable is simply a characteristic that can change or vary between different observations. For example, if you're studying people at a football match, your variables might include age, gender, length of time they've been supporting the team, or their favourite player.
Understanding your variables is essential for choosing the right statistical methods to analyse your data.
Types of data
All data can be classified into two main categories, each with its own characteristics and uses.
Quantitative data
Quantitative data consists of numerical values that represent measurements or counts. This type of data deals with "how much" or "how many" of something. Examples include height, weight, temperature, number of goals scored, or exam marks.
Quantitative data can be further divided into three subcategories:
Continuous data can take any value within a given range, including decimal places. Temperature is a perfect example - it could be 20.5°C, 20.51°C, or any value in between. The precision you record depends only on your measuring instrument.
Discrete data can only take specific, separate values, usually whole numbers. Shoe sizes are discrete because they come only in fixed standard sizes (such as 7, 7.5 and 8), with nothing in between.
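Ordinal data represents positions or rankings in order. This includes things like position in a race (1st, 2nd, 3rd) or grades (A, B, C, D). The order matters, but the gaps between values aren't necessarily equal, so calculations such as adding or averaging ranks are not meaningful. For this reason, many textbooks treat ordinal data as ordered categorical data rather than as a numerical type.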
Quick Memory Aid:
- Continuous: Can be measured to any precision (height, weight, time)
- Discrete: Separate, countable values (number of siblings, goals scored)
- Ordinal: Has a natural order or ranking (grades, satisfaction levels)
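As a rough illustration of the three labels above, the sketch below stores one example of each in Python. All values are made up, and the grade scale `GRADE_ORDER` is an assumed ordering chosen for this example only.

```python
# Discrete: counts take separate values, so whole numbers are a natural fit.
goals_scored = [0, 2, 1, 3, 1]

# Continuous: the true value can be anything in a range; what gets written
# down depends on how precise the measuring instrument is.
true_temperature_c = 20.5137  # hypothetical "true" temperature
reading_1dp = round(true_temperature_c, 1)  # a thermometer reading to 1 d.p.
reading_2dp = round(true_temperature_c, 2)  # a better one reading to 2 d.p.

# Ordinal: values have a natural order, but the gaps are not necessarily equal.
GRADE_ORDER = ["D", "C", "B", "A"]  # lowest to highest (assumed scale)
grades = ["B", "A", "D", "C", "A"]
grades_low_to_high = sorted(grades, key=GRADE_ORDER.index)

print(reading_1dp, reading_2dp)
print(grades_low_to_high)
```

Sorting the grades is meaningful because they have an order. Averaging them would not be, since nothing says the gap from D to C is the same size as the gap from B to A.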
Qualitative data
Qualitative data describes qualities or characteristics rather than numerical amounts. This type of data answers questions like "what type" or "which category." Examples include hair colour, type of car, favourite subject, or nationality.
Qualitative data is also known as categorical data because it sorts observations into different categories or groups. For instance, when collecting data about holiday destinations, you might categorise responses by continent: Europe, Asia, North America, South America, Africa, Australia, and Antarctica.
Even if numbers are used in categorical data (like postcodes or team numbers), they're just labels rather than quantities you can calculate with. The key question is: "Does the number have mathematical meaning?"
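The "mathematical meaning" test can be sketched in Python. The shirt numbers and sibling counts below are invented for illustration. Storing the labels as text is a design choice for this sketch, not a rule: it stops them being averaged by accident.

```python
from collections import Counter
from statistics import mean

# Shirt numbers are labels, so they are stored as text, not as numbers.
shirt_numbers = ["7", "10", "7", "9", "10", "7"]

# A sensible question for labels: which one appears most often?
most_common_shirt, times_seen = Counter(shirt_numbers).most_common(1)[0]

# Number of siblings is a genuine quantity, so an average makes sense.
siblings = [0, 2, 1, 3, 1]
mean_siblings = mean(siblings)

print(most_common_shirt, times_seen)
print(mean_siblings)
```

Counting how often each label occurs is a valid summary for categorical data, whereas a "mean shirt number" would tell you nothing about the players.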
Working with related data
Often in statistical investigations, you'll collect information about multiple variables for each observation. Understanding how many variables you're working with helps determine which analytical techniques to use.
Bivariate data involves collecting pairs of related measurements for each observation. A common example would be recording both exam results and time spent studying for each student. This allows you to investigate whether there's a relationship between the two variables.
Multivariate data involves collecting three or more related measurements for each observation. For example, you might record age, height, and weight for each person in your sample. This gives you a more complete picture but requires more sophisticated analysis techniques.
The number of variables you collect determines your analysis options:
- Univariate: One variable (simple descriptions, averages)
- Bivariate: Two variables (relationships, correlations)
- Multivariate: Three or more variables (complex relationships, patterns)
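The naming rule in the list above is just a count of variables, which can be written as a small helper. The function name `describe_by_variable_count` and the variable names passed to it are hypothetical, chosen for this sketch.

```python
def describe_by_variable_count(variable_names):
    """Return the term for a data set with the given variables."""
    n = len(variable_names)
    if n == 0:
        raise ValueError("at least one variable is needed")
    if n == 1:
        return "univariate"
    if n == 2:
        return "bivariate"
    return "multivariate"  # three or more variables


print(describe_by_variable_count(["exam result"]))
print(describe_by_variable_count(["exam result", "hours studied"]))
print(describe_by_variable_count(["age", "height", "weight"]))
```

Only the number of variables recorded per observation matters here, not the number of people or items in the sample.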
Worked example: Identifying data types
Question: Classify each of the following data sets:
- Number of cars in a car park
- Shoe size
- Position in a class test
- Style of a painting
- Height and weight
Step-by-step solution:
Step 1: Number of cars in a car park. This is numerical data that can only take whole number values (you can't have 2.5 cars). Answer: Quantitative discrete data.
Step 2: Shoe size. Shoe sizes are numbers, but they only come in specific standard sizes (6, 7, 8, etc.), so they cannot take every value in a range. Answer: Quantitative discrete data.
Step 3: Position in a class test. This involves rankings (1st place, 2nd place, etc.) where the order is meaningful. Answer: Ordinal data.
Step 4: Style of a painting. This describes categories or types (abstract, realistic, impressionist, etc.) rather than numerical values. Answer: Categorical data.
Step 5: Height and weight. This involves two numerical measurements taken from the same observations. Answer: Bivariate data (each variable on its own is quantitative continuous).
Common Exam Pitfalls to Avoid:
- Numbers aren't always quantitative! House numbers, phone numbers, and player numbers are actually categorical because they don't represent quantities
- Don't confuse discrete with ordinal: Discrete data involves counts or other separate numerical values, while ordinal data involves rankings
- Remember the precision test: If you can measure something more precisely with better equipment, it's continuous data
- Count your variables carefully: Make sure you identify whether you're dealing with one variable (univariate), two variables (bivariate), or more (multivariate)
Key Points to Remember:
- Raw data is information collected directly during an investigation before any processing
- Quantitative data uses numbers with mathematical meaning, while qualitative data describes categories
- Quantitative data can be continuous (any value), discrete (specific values), or ordinal (ranked positions)
- Bivariate data involves pairs of measurements, while multivariate data involves three or more measurements
- Always consider whether the numbers in your data represent actual quantities or just labels when classifying data types