Relational Databases (OCR A-Level Computer Science): Revision Notes
Relational Databases
Overview
A database is an organised collection of structured data, designed to store, manage, and retrieve information efficiently. Databases are widely used in applications to hold large amounts of data in a structured way, making it easy to search, update, and maintain. A relational database specifically organises data into tables, which can be linked or "related" to one another.
This note will explain database terminology, the differences between flat file and relational databases, and the benefits and limitations of each approach.
Basic Database Terminology
- Table: A structure within a database that organises data in rows and columns. Each table represents an entity (e.g., "Students" or "Courses").
- Field: A single attribute or piece of information within a table. For example, in a "Student" table, fields could be "Student_ID," "Name," and "Date_of_Birth."
- Record: A row in a table that represents a single item or entity. For example, a record in the "Student" table would contain all the information about a single student (such as their ID, name, and date of birth).
- Primary Key: A unique identifier for each record in a table, ensuring that each record is unique (e.g., "Student_ID" could be the primary key in a "Student" table).
- Foreign Key: A field in one table that links to the primary key of another table, allowing relationships to form between tables. For instance, "Student_ID" in a "Grades" table could link to "Student_ID" in a "Students" table.
Flat File vs. Relational Databases
Flat File Database:
-
A flat file database stores all data in a single table or file, often in a format like a spreadsheet or text file (e.g., CSV).
-
All data is contained in one place, with fields organised in columns and records in rows. Benefits:
-
Simple structure, suitable for small amounts of data.
-
Easy to implement and requires minimal resources. Limitations:
-
Redundant data: Duplicate data is common because relationships between records are not represented.
-
Difficult to scale: As data grows, flat files become harder to manage and maintain.
-
Poor data integrity: Data can be inconsistent, with no standard way to ensure data accuracy.
Relational Database:
-
A relational database organises data into multiple tables with defined relationships between them.
-
Uses a structure based on primary and foreign keys to link tables, reducing data redundancy and ensuring data integrity. Benefits:
-
Reduces redundancy by storing related data in separate tables, preventing duplicate data.
-
Enhances data integrity and consistency through relationships and constraints.
-
Scalable and efficient, allowing complex queries and easy maintenance of large datasets. Limitations:
-
More complex to set up, requiring careful planning and understanding of data relationships.
-
Can be resource-intensive, as relational databases require more processing power and memory than flat files.
Key Characteristics of Relational Databases
- Normalisation: Process of structuring a relational database to minimise redundancy and dependency by organising fields and records into separate tables.
- Relationships: Data is connected across tables using primary and foreign keys, representing one-to-one, one-to-many, or many-to-many relationships.
Examples
Flat File Database Example
Consider a single "Students" file with all student information, including course enrolment:
| Student_ID | Name | Course | Enrollment_Date |
|------------|------------|--------------|------------------|
| 101 | Alice Smith| Mathematics | 2023-09-01 |
| 102 | Bob Jones | Biology | 2023-09-01 |
| 103 | Alice Smith| Biology | 2023-09-02 |
In this example, "Alice Smith" appears twice, leading to redundancy.
Relational Database Example
In a relational database, you could have two tables: "Students" and "Courses." Students Table
| Student_ID | Name |
|------------|------------|
| 101 | Alice Smith|
| 102 | Bob Jones |
Courses Table
| Course_ID | Course | Student_ID | Enrollment_Date |
|------------|--------------|------------|------------------|
| 201 | Mathematics | 101 | 2023-09-01 |
| 202 | Biology | 101 | 2023-09-02 |
| 203 | Biology | 102 | 2023-09-01 |
By separating "Students" and "Courses" into two tables, we reduce redundancy and improve data integrity.
Note Summary
Common Mistakes
- Not normalising the data in a relational database: Without normalisation, data can become redundant and inefficient to manage.
- Confusing fields and records: Fields (columns) define the type of data, while records (rows) store individual instances of data.
- Using flat files for large datasets: Flat files do not scale well for large, complex data sets and can lead to data inconsistencies.
- Forgetting primary and foreign keys: Not using keys correctly in relational databases leads to disorganised data and prevents efficient querying.
Key Takeaways
- A database is an organised collection of data.
- Fields, records, and tables are the basic units in a database.
- Flat file databases are simple and good for small datasets but have limitations with data redundancy and integrity.
- Relational databases are organised into tables with relationships, reducing redundancy and improving data consistency.
- Use primary and foreign keys to relate data across tables in relational databases, ensuring data integrity and ease of maintenance.