Categorical Data (Leaving Cert Mathematics): Revision Notes
What is categorical data?
Categorical data is information that fits into groups or categories rather than having numerical values. When you ask someone "What colour is your car?", their answer will be a category like blue, red, black, or white, not a number.
This type of data allows us to sort and group information based on shared characteristics or qualities. It's one of the fundamental ways we can classify the data we collect in statistics.
Think of categorical data as placing items into labelled boxes - each piece of data goes into a specific category or group, rather than being measured on a numerical scale.
Examples of categorical data
Common real-world examples of categorical data include:
- Gender: male or female
- Country of birth: Ireland, France, Spain, Nigeria
- Favourite sport: soccer, hurling, tennis, basketball
Grouping data this way lets us count how many items fall into each category and compare the groups. Notice that none of these examples involves a number: each answer is a label that places the item into a distinct group.
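To see what "counting items in each category" looks like in practice, here is a minimal Python sketch. The survey responses are made up for illustration and are not from any real data set. It tallies the answers and picks out the most common category.

```python
from collections import Counter

# Hypothetical answers to "What is your favourite sport?" (illustrative only)
responses = ["soccer", "hurling", "soccer", "tennis",
             "basketball", "soccer", "hurling"]

# Count how many responses fall into each category
tally = Counter(responses)

# The most common category; an average would be meaningless for labels
most_common_sport, most_common_count = tally.most_common(1)[0]

for sport, count in tally.most_common():
    print(f"{sport}: {count}")
print("Most common category:", most_common_sport)
```

The result is a frequency table: each category paired with a count. The labels themselves are never added or averaged, only counted.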
Ordinal data
Ordinal data is a special type of categorical data where the categories have a natural order or ranking. The word "ordinal" comes from "order", which helps us remember this concept.
Examples of ordinal data include:
- Type of house: 1-bedroom, 2-bedroom, 3-bedroom (ordered by size)
- Attendance at football matches: never, sometimes, very often (ordered by frequency)
- Opinion scales: strongly disagree, disagree, neutral, agree, strongly agree (ordered by level of agreement)
Key Distinction: Ordinal data has a meaningful sequence, so its categories can be ranked from lowest to highest. Other categorical data (often called nominal data) has no natural order, so its categories cannot be ranked.
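The idea of a "natural order" can be sketched in Python. The responses below are hypothetical, and the names scale and rank are chosen only for this illustration. The point is that the order has to be stated by us; sorting the labels alphabetically would give a different, meaningless order.

```python
# Categories listed from lowest to highest level of agreement.
# Stating this order explicitly is what treats the data as ordinal.
scale = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
rank = {category: position for position, category in enumerate(scale)}

# Hypothetical survey responses (illustrative only)
responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]

# Sort by position on the scale, not alphabetically
ordered = sorted(responses, key=lambda category: rank[category])
lowest = ordered[0]
highest = ordered[-1]

print("In order:", ordered)
print("Lowest:", lowest, "| Highest:", highest)
print("Alphabetical order (not meaningful):", sorted(responses))
```

Hair colour or country of birth could not be handled this way, because there is no agreed scale to put the categories on.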
Univariate data
Univariate data refers to information where only one piece of data is collected from each person or item in your study. The prefix "uni" means "one", making this easy to remember.
Examples of univariate data include:
- Colour of eyes: collecting just eye colour from each person
- Distance from school: measuring only how far each student lives from school
- Height in centimetres: recording just the height of each individual
When you're collecting univariate data, you focus on one specific characteristic at a time.
Remember: Even though you might collect data from many people, if you're only measuring one thing about each person (like just their height, or just their favourite colour), it's still univariate data.
Bivariate data
Bivariate data contains two pieces of information collected from each person or item in your study. The prefix "bi" means "two". This is also called paired data because the two measurements are linked to the same person or item.
Examples of bivariate data include:
- Hours of study per week and exam marks: both pieces of information from the same student
- Age of a car and its price: two measurements about the same vehicle
- Engine size and fuel efficiency: both characteristics of the same car
Special types of bivariate data
Bivariate data can be further classified into specific types:
- Bivariate categorical data: when both pieces of information are categorical (e.g., hair colour and gender)
- Bivariate discrete data: when both pieces of information involve counting (e.g., number of rooms in a house and number of children in the house)
The key to identifying bivariate data is that you're collecting exactly two related measurements from the same source - whether that's a person, object, or event.
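As a rough illustration of bivariate categorical data, the Python sketch below stores each person as a pair (hair colour, gender). The five people are invented for this example. Counting the pairs gives the entries of a two-way table, while counting just one item from each pair gives back a univariate view.

```python
from collections import Counter

# Each tuple is one person: (hair colour, gender). Hypothetical data.
people = [
    ("brown", "female"),
    ("black", "male"),
    ("brown", "male"),
    ("blonde", "female"),
    ("brown", "female"),
]

# Bivariate view: count each (hair colour, gender) combination
pair_counts = Counter(people)

# Univariate view: ignore gender and count hair colour alone
hair_counts = Counter(hair for hair, _gender in people)

for (hair, gender), count in pair_counts.items():
    print(f"{hair}, {gender}: {count}")
print("Hair colour only:", dict(hair_counts))
```

Keeping the two values together in one pair is what makes the data paired: splitting them into two separate lists in different orders would lose the link to the same person.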
Worked examples
Worked Example 1: Identifying data types
Question: Classify each type of data as numerical or categorical:
- The sizes of shoes sold in a shop
- The colours of shoes sold in a shop
- The subjects offered to Leaving Certificate students
Solution:
- Shoe sizes: Numerical (sizes like 6, 7, 8, 9 are numbers)
- Shoe colours: Categorical (colours like black, brown, white are categories)
- LC subjects: Categorical (subjects like Maths, English, History are categories)
Worked Example 2: Identifying ordinal data
Question: Which of these is ordinal data?
- Hair colour (blonde, brown, black)
- Class division (first division, second division, third division)
Solution: Class division is ordinal data because the divisions have a clear order from best (first) to worst (third). Hair colour has no natural ordering.
Worked Example 3: Distinguishing univariate and bivariate data
Question: Identify whether collecting "student's favourite subject" is univariate or bivariate data.
Solution: This is univariate data because you're only collecting one piece of information (favourite subject) from each student.
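The counting rule used in this solution can be written as a short Python sketch. The function name classify_by_variable_count and the sample records are hypothetical, made up only to illustrate the rule: count how many pieces of information are recorded for each individual.

```python
def classify_by_variable_count(record):
    """Label one individual's record by how many pieces of information it holds."""
    if len(record) == 1:
        return "univariate"
    if len(record) == 2:
        return "bivariate"
    return "neither univariate nor bivariate"


# One piece of information per student: favourite subject
favourite_subject_only = ("Maths",)

# Two pieces of information per student: hours of study per week, exam mark
study_hours_and_mark = (6, 72)

print(classify_by_variable_count(favourite_subject_only))
print(classify_by_variable_count(study_hours_and_mark))
```

Note that the rule looks at one individual's record, not at how many individuals were surveyed: a thousand students each giving only a favourite subject is still univariate data.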
Exam tips
Essential Exam Strategies:
- Remember the definitions: Categorical data is sorted into categories; ordinal data is categorical data whose categories have an order
- Look for keywords: "favourite", "colour", "type" often indicate categorical data
- Count the variables: One measurement = univariate, two measurements = bivariate
- Check for natural order: If categories can be ranked (like never, sometimes, always), it's ordinal data
Key Points to Remember:
- Categorical data fits into groups or categories rather than being numerical
- Ordinal data is categorical data with a natural order or ranking
- Univariate data involves collecting one piece of information per individual
- Bivariate data involves collecting two related pieces of information together
- Always consider whether categories have a meaningful order when deciding if data is ordinal