Referential Integrity (OCR A-Level Computer Science): Revision Notes
Referential Integrity
Overview
Referential integrity is a key concept in relational databases that ensures relationships between tables remain consistent. It means that every foreign key value in one table that is not null must match an existing primary key value in the related table. Enforcing referential integrity prevents "orphaned records" (records with invalid references) and maintains the reliability and consistency of data across related tables.
What is Referential Integrity?
Referential integrity ensures that any foreign key value in a table must either be null or match a primary key value in the related table. This constraint enforces meaningful connections between tables, keeping data linked and accurate.
Example: If you have a Students table and an Enrolments table, each enrolment record must reference a valid student ID from the Students table. If a Student_ID in the Enrolments table doesn't match any existing Student_ID in the Students table, referential integrity is violated.
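The "null or match" rule can be written as a small check. The sketch below is only an illustration in plain Python, not how a database engine implements it; the function name `find_violations`, the `Enrolment_ID` field and the sample values (including student 999) are assumptions made for the example.

```python
def find_violations(student_ids, enrolments):
    """Return the enrolments that break referential integrity.

    A Student_ID is acceptable if it is None (null) or appears in student_ids.
    """
    valid_ids = set(student_ids)
    return [
        enrolment
        for enrolment in enrolments
        if enrolment["Student_ID"] is not None
        and enrolment["Student_ID"] not in valid_ids
    ]


# Hypothetical sample data: primary keys from Students, rows from Enrolments.
students = [101, 102]
enrolments = [
    {"Enrolment_ID": 1, "Student_ID": 101},   # matches an existing student
    {"Enrolment_ID": 2, "Student_ID": 999},   # no such student: a violation
    {"Enrolment_ID": 3, "Student_ID": None},  # null is permitted
]

bad = find_violations(students, enrolments)
print(bad)
```

Only the enrolment that points at a missing student is reported; the null reference passes the check.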
How Referential Integrity Works
Referential integrity is typically enforced through foreign key constraints. A foreign key constraint links a column (or combination of columns) in one table to the primary key in another table, ensuring that each reference is valid. The constraint is usually declared when the tables and their relationships are defined.
When a foreign key constraint is in place, the database system will:
- Prevent Insertion: Block any attempt to add a foreign key value that does not match an existing primary key.
- Prevent Deletion: Block the deletion of a referenced primary key record, unless specified otherwise (e.g., with cascading delete).
- Prevent Updates: Block changes to a primary key value if they would break the link to related foreign key values, unless specified otherwise (e.g., with cascading updates).
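The three behaviours above can be observed with a small experiment. This is a minimal sketch that assumes SQLite through Python's built-in `sqlite3` module; the cut-down schema, the helper `is_blocked` and the values 999 and 555 are illustrative assumptions. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been run on the connection, and other database systems may behave differently in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys without this
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Enrolments (
        Enrolment_ID INTEGER PRIMARY KEY,
        Student_ID   INTEGER REFERENCES Students(Student_ID)
    )
""")
conn.execute("INSERT INTO Students VALUES (101)")
conn.execute("INSERT INTO Enrolments VALUES (1, 101)")
conn.commit()


def is_blocked(sql):
    """Run one statement and report whether the database rejected it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
    conn.rollback()  # undo the change so every check starts from the same data
    return False


results = {
    # Insert: 999 is not an existing Student_ID.
    "insert": is_blocked("INSERT INTO Enrolments VALUES (2, 999)"),
    # Delete: student 101 is still referenced by an enrolment.
    "delete": is_blocked("DELETE FROM Students WHERE Student_ID = 101"),
    # Update: changing the key would strand the enrolment that points at 101.
    "update": is_blocked("UPDATE Students SET Student_ID = 555 WHERE Student_ID = 101"),
}
print(results)
```

No cascade rule is declared here, so all three statements should be rejected.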
Enforcing Referential Integrity: Example
Consider a database with two related tables: Students and Enrolments.
Students Table: Contains information about students, where Student_ID is the primary key.
| Student_ID | Name        | Age |
|------------|-------------|-----|
| 101        | Alice Smith | 20  |
| 102        | Bob Jones   | 19  |
Enrolments Table: Stores course enrolments, where Student_ID is a foreign key that references Student_ID in the Students table.
| Enrolment_ID | Student_ID | Course  |
|--------------|------------|---------|
| 1            | 101        | Maths   |
| 2            | 102        | Biology |
With referential integrity:
- You cannot add an enrolment with a Student_ID of 103 because 103 does not exist in the Students table.
- You cannot delete the student with Student_ID 101 from the Students table unless you first delete their enrolments or apply a cascading delete rule.
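The two rules above can be tried out on the example tables. The sketch below again assumes SQLite via Python's `sqlite3` module; the column types and the 'Physics' course used for the rejected enrolment are assumptions, not part of the example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the constraint
conn.execute(
    "CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)"
)
conn.execute("""
    CREATE TABLE Enrolments (
        Enrolment_ID INTEGER PRIMARY KEY,
        Student_ID   INTEGER,
        Course       TEXT,
        FOREIGN KEY (Student_ID) REFERENCES Students(Student_ID)
    )
""")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(101, "Alice Smith", 20), (102, "Bob Jones", 19)],
)
conn.executemany(
    "INSERT INTO Enrolments VALUES (?, ?, ?)",
    [(1, 101, "Maths"), (2, 102, "Biology")],
)
conn.commit()

outcomes = []

# Student 103 does not exist, so this enrolment should be refused.
try:
    conn.execute("INSERT INTO Enrolments VALUES (3, 103, 'Physics')")
    outcomes.append("insert 103 accepted")
except sqlite3.IntegrityError:
    outcomes.append("insert 103 rejected")

# Student 101 still has an enrolment, so the delete should be refused.
try:
    conn.execute("DELETE FROM Students WHERE Student_ID = 101")
    outcomes.append("delete 101 accepted")
except sqlite3.IntegrityError:
    outcomes.append("delete 101 rejected")

# Deleting the dependent enrolments first makes the student delete legal.
conn.execute("DELETE FROM Enrolments WHERE Student_ID = 101")
conn.execute("DELETE FROM Students WHERE Student_ID = 101")
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT Student_ID FROM Students")]
print(outcomes, remaining)
```

The order of the last two deletes matters: the child rows go first, then the parent row.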
Why Referential Integrity is Desirable
Prevents Orphaned Records:
- Without referential integrity, you could have records in the Enrolments table that reference nonexistent Student_IDs, leading to incomplete or misleading data.
- Referential integrity ensures that every foreign key value in a child table (unless it is null) matches an existing record in the parent table.
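To see how an orphaned record arises and how it might be found afterwards, the sketch below builds the two tables without any foreign key constraint and then searches for references that point nowhere. It assumes SQLite via Python's `sqlite3` module; the reduced columns and the enrolment for student 103 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT)")
# No foreign key is declared here, so nothing checks Student_ID.
conn.execute(
    "CREATE TABLE Enrolments (Enrolment_ID INTEGER PRIMARY KEY, Student_ID INTEGER, Course TEXT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [(101, "Alice Smith"), (102, "Bob Jones")],
)
conn.executemany(
    "INSERT INTO Enrolments VALUES (?, ?, ?)",
    [(1, 101, "Maths"), (2, 102, "Biology"), (3, 103, "Physics")],  # 103 is accepted
)
conn.commit()

# A LEFT JOIN keeps every enrolment; orphans are the rows with no matching student.
orphans = conn.execute("""
    SELECT e.Enrolment_ID, e.Student_ID
    FROM Enrolments AS e
    LEFT JOIN Students AS s ON s.Student_ID = e.Student_ID
    WHERE e.Student_ID IS NOT NULL
      AND s.Student_ID IS NULL
    ORDER BY e.Enrolment_ID
""").fetchall()
print(orphans)
```

A query like this only finds the problem after the fact; a foreign key constraint stops the orphan being stored in the first place.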
Maintains Data Consistency:
- Referential integrity enforces consistent relationships, making sure data remains accurate and related across tables.
- It keeps data logically organised, allowing for meaningful relationships that make sense within the context of the database.
Improves Data Reliability:
- Enforcing referential integrity prevents invalid references, so queries that join related tables do not silently drop records whose reference points nowhere.
- This improves the reliability of reports and analytics, as data connections are consistent and trustworthy.
Supports Cascading Operations:
- Referential integrity can include rules for cascading updates or deletions, which automatically update or delete related records in child tables, streamlining database management.
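Cascading rules can be sketched as follows, assuming SQLite via Python's `sqlite3` module. The `ON UPDATE CASCADE` and `ON DELETE CASCADE` clauses are standard SQL, but the reduced schema and the replacement ID 201 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades only run when enforcement is on
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Enrolments (
        Enrolment_ID INTEGER PRIMARY KEY,
        Student_ID   INTEGER REFERENCES Students(Student_ID)
                     ON UPDATE CASCADE
                     ON DELETE CASCADE,
        Course       TEXT
    )
""")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [(101, "Alice Smith"), (102, "Bob Jones")],
)
conn.executemany(
    "INSERT INTO Enrolments VALUES (?, ?, ?)",
    [(1, 101, "Maths"), (2, 102, "Biology")],
)

# Cascading update: changing the primary key also changes the matching foreign keys.
conn.execute("UPDATE Students SET Student_ID = 201 WHERE Student_ID = 101")
after_update = conn.execute(
    "SELECT Enrolment_ID, Student_ID FROM Enrolments ORDER BY Enrolment_ID"
).fetchall()

# Cascading delete: removing a student also removes that student's enrolments.
conn.execute("DELETE FROM Students WHERE Student_ID = 102")
after_delete = conn.execute(
    "SELECT Enrolment_ID, Student_ID FROM Enrolments ORDER BY Enrolment_ID"
).fetchall()
conn.commit()

print(after_update)
print(after_delete)
```

The Biology enrolment disappears without ever being named in a DELETE statement, which is why cascading deletes need to be chosen deliberately.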
Note Summary
Common Mistakes
- Ignoring Foreign Key Constraints: Failing to define foreign keys can lead to orphaned records, where related tables do not have valid references.
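- Not Setting Cascade Rules Appropriately: A cascading delete that is too broad can remove child records that were still needed, causing unintended data loss. Without cascade rules, parent records cannot be deleted or re-keyed until their child records are dealt with manually, and if the constraint is not enforced at all, deleting a parent record leaves broken references.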
- Assuming Referential Integrity Enforces Data Accuracy: Referential integrity only ensures valid links between tables. Data entry errors, like typos, are still possible and should be handled separately.
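The last point can be illustrated with data that is wrong but still passes every referential integrity check. This is a sketch assuming SQLite via Python's `sqlite3` module; the misspelt name and the mistyped enrolment are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # the constraint is fully enforced
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Enrolments (
        Enrolment_ID INTEGER PRIMARY KEY,
        Student_ID   INTEGER REFERENCES Students(Student_ID),
        Course       TEXT
    )
""")

# A misspelt name is not a referential integrity problem, so it is accepted.
conn.execute("INSERT INTO Students VALUES (101, 'Alcie Smith')")
conn.execute("INSERT INTO Students VALUES (102, 'Bob Jones')")

# Suppose this enrolment was meant for Bob (102) but was typed as 101.
# 101 is a valid Student_ID, so the foreign key check passes.
conn.execute("INSERT INTO Enrolments VALUES (1, 101, 'Biology')")
conn.commit()

rows = conn.execute("""
    SELECT s.Name, e.Course
    FROM Enrolments AS e
    JOIN Students AS s ON s.Student_ID = e.Student_ID
""").fetchall()
print(rows)
```

The link is valid, yet the stored facts are wrong, so other validation and verification checks are still needed.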
Key Takeaways
- Referential Integrity ensures that foreign keys reference valid primary keys, maintaining consistent links between tables.
- Enforcing referential integrity prevents orphaned records and maintains data reliability and consistency.
- Referential integrity is achieved through foreign key constraints, and cascading rules can further enhance data integrity by automating related changes.
- Applying referential integrity keeps the links between tables valid, making it essential for high-quality data management in relational databases, although it does not check that the data itself is correct.