Data Structures (OCR A-Level Computer Science): Revision Notes
📚 Revision Notes
Hash Tables
Overview
A hash table is a data structure that stores key-value pairs. It provides fast access to data using a process called hashing, which maps keys to indices in an underlying array. Hash tables are widely used for implementing associative arrays, databases, and caches due to their average time complexity of O(1) for insertion, deletion, and search operations.
Structure of a Hash Table
- Keys: Unique identifiers for data entries.
- Values: Data associated with each key.
- Hash Function: Converts a key into an array index.
- Buckets: Array slots where key-value pairs are stored.
- Collisions: Occur when two keys map to the same index.
Collision Resolution Techniques
- Separate Chaining: Store multiple key-value pairs in the same bucket using a linked list.
- Open Addressing: Find another bucket within the array for the colliding key. Common methods include:
- Linear Probing: Check the next available bucket.
- Quadratic Probing: Use a quadratic function to find the next bucket.
- Double Hashing: Use a second hash function to calculate the next bucket.
Creating a Hash Table
Using Arrays (Procedural Approach)
lightbulbExample
Example: Hash table with separate chaining.
# Simple hash table using arrays and linked lists for separate chaining
SIZE = 10
hash_table = [[] for _ in range(SIZE)]
def hash_function(key):
return key % SIZE
def insert(hash_table, key, value):
index = hash_function(key)
for pair in hash_table[index]:
if pair[0] == key:
pair[1] = value # Update value if key exists
return
hash_table[index].append([key, value]) # Insert new key-value pair
def search(hash_table, key):
index = hash_function(key)
for pair in hash_table[index]:
if pair[0] == key:
return pair[1] # Return value if key is found
return None # Key not found
def delete(hash_table, key):
index = hash_function(key)
for i, pair in enumerate(hash_table[index]):
if pair[0] == key:
del hash_table[index][i] # Remove key-value pair
return
Using Object-Oriented Programming (OOP)
lightbulbExample
Example: A hash table class using separate chaining.
class HashTable:
def __init__(self, size=10):
self.size = size
self.table = [[] for _ in range(size)]
def hash_function(self, key):
return hash(key) % self.size
def insert(self, key, value):
index = self.hash_function(key)
for pair in self.table[index]:
if pair[0] == key:
pair[1] = value # Update value if key exists
return
self.table[index].append([key, value]) # Insert new key-value pair
def search(self, key):
index = self.hash_function(key)
for pair in self.table[index]:
if pair[0] == key:
return pair[1]
return None
def delete(self, key):
index = self.hash_function(key)
for i, pair in enumerate(self.table[index]):
if pair[0] == key:
del self.table[index][i]
return
Traversing a Hash Table
While hash tables are not typically used for sequential access, traversal can be done by iterating through all buckets.
lightbulbExample
Example:
def traverse(hash_table):
for i, bucket in enumerate(hash_table):
if bucket:
print(f"Bucket {i}: {bucket}")
Examples
infoNote
Inserting and Searching in a Hash Table:
hash_table = HashTable()
hash_table.insert("name", "Alice")
hash_table.insert("age", 25)
print(hash_table.search("name")) # Output: Alice
print(hash_table.search("age")) # Output: 25
infoNote
Deleting an Entry:
hash_table.delete("name")
print(hash_table.search("name")) # Output: None
infoNote
Traversing the Hash Table:
traverse(hash_table.table)
Note Summary
infoNote
Common Mistakes
- Ignoring Collisions:
- Without a collision resolution strategy, data can be lost or overwritten.
- Poor Hash Functions:
- A poorly designed hash function can lead to many collisions, degrading performance to O(n).
- Not Handling Dynamic Resizing:
- Hash tables can become inefficient if not resized when the load factor (entries/size) increases.
- Key-Value Confusion:
- Ensure that keys are unique and values are correctly associated with them.
infoNote
Key Takeaways
- Hash Tables store key-value pairs for efficient data retrieval.
- They rely on hash functions to map keys to indices.
- Collision resolution techniques include separate chaining and open addressing.
- Practice reading, tracing, and writing hash table implementations to build a strong understanding of their operation.