Search Algorithms – Binary, Binary Tree, and Linear Search (AQA A-Level Computer Science): Revision Notes
Search Algorithms – Binary, Binary Tree, and Linear Search
Introduction to searching algorithms
One of the most powerful capabilities of computers is their ability to search through data quickly and efficiently. Searching is a fundamental operation that we encounter in countless everyday situations. For instance, when you use a cash machine to check your bank balance, the system searches through account records to find your specific account. Similarly, when you shop at a supermarket, the computerised till searches for product prices and details. Search engines on the internet help you find the cheapest holiday to the Algarve by searching through thousands of options.
Most searches are performed on data storage systems, though they're also used in other applications such as word processors that implement find and replace functions. A simple search might look for just one keyword, but more sophisticated search routines allow you to construct complex queries using logic statements such as OR, AND and NOT.
Real-World Applications of Search Algorithms
Search algorithms power many everyday computing tasks:
- Banking systems searching through millions of account records
- E-commerce websites finding products from vast inventories
- Search engines indexing billions of web pages
- Database systems retrieving specific records
- Word processors finding text in documents
There are several different searching algorithms available, each with its own strengths and weaknesses. The choice of which algorithm to use depends largely on the nature of the data being searched - particularly its size and how it's organised. The efficiency of algorithms is typically measured using Big O notation, which provides a mathematical way to compare how well different algorithms perform as dataset sizes grow.
Understanding these search methods will help you write more efficient programs and make informed decisions about which approach to use in different situations.
Linear search
What is a linear search?
A linear search is the simplest and most straightforward searching technique. It works by examining each item in a dataset one at a time, starting from the beginning and continuing sequentially until either the search term is found or the end of the data is reached.
Think of it like looking for a specific book on an unsorted bookshelf. If the books aren't organised in any particular order (not by title, not by author, not by any system), your only option is to look at each book one by one until you find the one you want. You might get lucky and find it immediately, or you might have to check every single book before finding it at the very end.
How linear search works
The process is beautifully simple:
- Start at the first item in the dataset
- Check if it matches what you're looking for
- If it matches, you've found it - stop searching
- If it doesn't match, move to the next item
- Repeat until you find the item or run out of items to check
Here's how this might look in pseudocode:
Repeat
Look at the Title
Until Title is the one you want OR there are no more books
This simplicity is both linear search's greatest strength and its greatest weakness. It's easy to implement and works on any dataset, regardless of how the data is organised. However, it's also the least efficient method, particularly when dealing with large amounts of data.
Efficiency considerations
The efficiency of a linear search is strongly influenced by where the item you're searching for is located within the dataset:
-
Best-case scenario: The item you want is at or near the beginning of the dataset. In this case, you'll find it very quickly with just one or a few comparisons.
-
Worst-case scenario: The item is at the end of the dataset, or doesn't exist in the dataset at all. In these cases, you'll need to check every single item, making it time-consuming for large datasets.
-
Average case: On average, you'll need to check approximately half of the items in the dataset before finding what you're looking for.
Performance Scales with Dataset Size
The speed of a linear search is directly proportionate to the size of the dataset. If you double the size of your dataset, you roughly double the time it takes to search through it. For computer scientists, we say that linear search has O(n) time complexity, where n represents the number of items in the dataset.
Implementation example
Here's how a linear search might be implemented in Visual Basic. This example searches through a table of data looking for a specific text string:
Private Sub btnSearch_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnSearch.Click
Dim CountLinear As Integer = 0
txtTraceBinary.Text = ""
txtTraceLinear.text = ""
' linear search
Do
CountLinear = CountLinear + 1
txtTraceLinear.Text = txtTraceLinear.Text & CountLinear & "-" & grdTable.Rows(CountLinear).Cells(0).Value & vbCrLf
Loop Until CountLinear = RowCount Or grdTable.Rows(CountLinear).Cells(0).Value = txtFind.Text
txtLinearLooks.Text = CountLinear
' match found?
If grdTable.Rows(CountLinear).Cells(0).Value = txtFind.Text Then
lblResult.Text = "Match Found"
Else
lblResult.Text = "No Match Found"
End If
End Sub
This code keeps a count of how many comparisons it makes (stored in CountLinear), demonstrating how the algorithm examines each row of data systematically until it either finds a match or reaches the end of the dataset.
When to use linear search
Linear search is most appropriate when:
- Your dataset is small
- The data is unsorted or randomly organised
- You need a simple solution that's quick to implement
- You'll only be searching occasionally
Despite its limitations, linear search remains useful because of its simplicity and the fact that it works on any type of data organisation.
Binary search
What is a binary search?
A binary search is a much more efficient searching technique, but it comes with an important requirement: the data must be sorted in order (either ascending or descending). It works by repeatedly dividing the search space in half, eliminating large portions of the dataset with each comparison.
Think of binary search like the classic number guessing game. If someone thinks of a number between 1 and 100, a logical strategy is to start by guessing 50. If they say "higher", you've immediately eliminated all numbers from 1 to 50 - that's half the possibilities gone with just one guess! You'd then guess 75 (halfway between 51 and 100), and continue this halving process until you find the number.
Critical Requirement: Sorted Data
Binary search only works when your data is arranged in a logical order. If your data is unsorted, you must either sort it first or use a different search method like linear search. This preprocessing requirement is the trade-off for the dramatically improved search speed.
This divide-and-conquer approach makes binary search dramatically faster than linear search for large datasets. However, remember that it only works when your data is arranged in a logical order.
How binary search works
The binary search algorithm uses three key positions or "pointers" to keep track of where it's searching:
- LowestPointer: Points to the first position in the current search range
- HighestPointer: Points to the last position in the current search range
- MiddlePointer: Points to the middle position, calculated as
Here's the step-by-step process:
- Set LowestPointer to the first item and HighestPointer to the last item
- Calculate MiddlePointer as the halfway point between them
- Compare the value at MiddlePointer with your search term
- If it matches, you've found it - stop searching
- If the search term is less than the middle value, set HighestPointer to MiddlePointer - 1 (search the lower half)
- If the search term is greater than the middle value, set LowestPointer to MiddlePointer + 1 (search the upper half)
- Repeat steps 2-6 until you find the item or LowestPointer equals HighestPointer
Worked example
Worked Example: Binary Search for Number 51
Let's search for the number 51 in a sorted dataset of 15 numbers arranged in ascending order. Here's our starting position:

Step 1: We start by looking at the middle position (cell 8). Halfway between position 1 and position 15 is position 8, which contains the number 37.

Since 51 is greater than 37, we know the number we're looking for must be in the upper half of the dataset. We can eliminate positions 1 through 8 (shown marked with 'x'). Our new search range is positions 9 to 15.
Step 2: Now we calculate the middle of our remaining range. Halfway between position 9 and 15 is position 12, which contains the number 57.

Since 51 is less than 57, we know to search in the lower portion of this range. We can now eliminate positions 12 through 15. Our search range is now positions 9 to 11.
Step 3: The middle position between 9 and 11 is position 10.

Position 10 contains the number 51 - we've found our target! This search took just 3 comparisons to find the number in a dataset of 15 items.
Pseudocode representation
Here's how binary search can be represented in pseudocode:
FindMe stores the record title that we are searching for
LowestPointer ← 1
HighestPointer ← NumberofRecords
Do
MiddlePointer ← (LowestPointer + HighestPointer) / 2
If Record at MiddlePointer < FindMe Then
LowestPointer ← MiddlePointer + 1
End If
If Record at MiddlePointer > FindMe Then
HighestPointer ← MiddlePointer - 1
End If
Until Record at MiddlePointer = FindMe OR LowestPointer = HighestPointer
The pointers track the boundaries of our search space, which shrinks by roughly half with each iteration. This is what makes binary search so efficient.
Efficiency and advantages
Binary search is remarkably efficient, especially as datasets grow larger. Here's why it's so powerful:
If you need to search through just three records, binary search will take a maximum of two comparisons. But the real advantage becomes apparent with larger datasets:
- For 1 million records: Maximum of just 20 comparisons needed
- For 6 billion records: Maximum of just 33 comparisons needed
Dramatic Efficiency Gains
Compare this to linear search, which would potentially need to check all 1 million or all 6 billion items in the worst case! This efficiency comes from the fact that binary search has O(log n) time complexity. Each comparison eliminates half of the remaining possibilities, leading to logarithmic growth in the number of comparisons needed as the dataset size increases.
At first glance, it might seem slow to divide and eliminate repeatedly, but in practice it's incredibly fast. The key insight is that you're discarding huge amounts of data with each comparison, unlike linear search which only eliminates one item at a time.
Implementation considerations
When implementing binary search, you need to:
- Ensure your data is sorted before searching (or maintain it in sorted order)
- Handle the case where the item isn't found (when LowestPointer equals or exceeds HighestPointer)
- Use integer division when calculating the midpoint to avoid fractional indices
- Be careful with index boundaries to avoid errors
Binary search is the algorithm of choice whenever you have sorted data and need to perform frequent searches, making it invaluable for database systems, search features in applications, and many other computing scenarios.
Binary tree search
What is a binary tree search?
A binary tree search is a specialised searching technique designed for data stored in binary tree structures. Binary trees are particularly useful in programs where data is dynamic - meaning information is constantly being added to and removed from the dataset.
Unlike searching through a simple list or array, a binary tree search involves traversing through a tree structure, making decisions at each node about whether to move left or right based on the value you're searching for.
How binary tree search works
When searching a binary tree, you start at the root node (the top of the tree) and work your way down by following branches. At each node, you compare the value you're looking for with the value stored at that node:
- If the search value is less than the current node's value, move to the left child node
- If the search value is greater than the current node's value, move to the right child node
- If the search value equals the current node's value, you've found it!
- If you reach a node with no children and haven't found the value, it doesn't exist in the tree
This process continues until you either find the value or reach a dead end (a node with no children in the direction you need to go).
Pseudocode representation
Here's the basic algorithm in pseudocode:
CurrentNode ← RootNode
Repeat
If Current_Node > Find_Me then
Move left to child node
Else
Move right to child node
End If
Until CurrentNode equals FindMe Or CurrentNode has no children
The variable FindMe contains the value being searched for, and CurrentNode tracks your current position as you traverse the tree.
Relationship to tree traversal
Connection to Tree Traversal
Binary tree search is very similar to in-order tree traversal. Both techniques involve systematically visiting nodes in a tree structure by making left/right decisions based on values. The key difference is that tree traversal visits every node, whereas tree search stops as soon as it finds the target value or determines it doesn't exist.
If you're familiar with tree traversal algorithms, you already understand the fundamental concept of navigating a binary tree - searching just adds the additional logic of comparing values and deciding which path to follow.
Implementation example
Here's how binary tree search might be implemented in Visual Basic:
Private Sub txtSearchFor_KeyDown(ByVal sender As Object, ByVal e As System.Windows.Forms.KeyEventArgs) Handles txtSearchFor.KeyDown
' check for enter key press
If e.KeyCode = Keys.Enter Then
Dim ThisNode As Integer = 1
txtOutput.Text = "Root - " & NodeValue(1) & vbCrLf
Do Until NodeValue(ThisNode) = txtSearchFor.Text Or ThisNode = 0
' move to node at left pointer
If txtSearchFor.Text < NodeValue(ThisNode) Then
ThisNode = PointerLeft(ThisNode)
txtOutput.Text = txtOutput.Text & "L - " & ThisNode & " - " & NodeValue(ThisNode) & vbCrLf
End If
' move to node at right pointer
If txtSearchFor.Text > NodeValue(ThisNode) Then
ThisNode = PointerRight(ThisNode)
txtOutput.Text = txtOutput.Text & "R - " & ThisNode & " - " & NodeValue(ThisNode) & vbCrLf
End If
Loop
If NodeValue(ThisNode) = txtSearchFor.Text Then
txtOutput.Text = txtOutput.Text & "FOUND"
Else
txtOutput.Text = txtOutput.Text & "NOT FOUND"
End If
End If
End Sub
This code demonstrates the key features of binary tree search:
- Starting at the root node
- Using pointer functions (PointerLeft and PointerRight) to navigate the tree
- Comparing values to decide which direction to move
- Continuing until the value is found or there are no more nodes to check
- Tracking the path taken through the tree (shown in the output)
When to use binary tree search
Binary tree search is most appropriate when:
- Your data is stored in a binary tree structure
- Data is frequently being added or removed (dynamic datasets)
- You need to maintain sorted order while allowing insertions and deletions
- You want search efficiency similar to binary search but need more flexibility
Binary trees combine the efficiency of binary search with the flexibility to easily add and remove data, making them excellent for applications like databases, file systems, and many other dynamic data scenarios.
Comparing search algorithms
Efficiency comparison
Different search algorithms have different time complexities, which affects how their performance scales with dataset size:
-
Linear search: O(n) complexity - time increases proportionally with dataset size. Simple but slow for large datasets.
-
Binary search: O(log n) complexity - extremely efficient for large sorted datasets. Requires data to be in order.
-
Binary tree search: O(log n) average case complexity - efficient and flexible, but performance can degrade if the tree becomes unbalanced.
Choosing the right algorithm
The selection of an appropriate search method depends primarily on two factors:
- How much data is being searched: Larger datasets benefit more from efficient algorithms like binary search
- How the data is organised: Sorted data allows binary search; dynamic data suits binary tree structures; unsorted or rarely-searched data might use linear search
Small vs Large Datasets
For small datasets (perhaps a few dozen items), the difference in performance is negligible, and linear search's simplicity might be preferable. However, as datasets grow into thousands or millions of items, choosing an efficient algorithm becomes crucial.
Practical considerations
When deciding which search algorithm to implement, also consider:
- Implementation complexity: Linear search is trivially simple; binary search requires sorted data; binary tree search needs a tree structure
- Data maintenance: Binary search needs data kept in order; binary trees need balancing for optimal performance
- Search frequency: If you search rarely, a simple approach might suffice; frequent searches justify more complex efficient methods
- Data volatility: Frequently changing data might benefit from tree structures over maintaining sorted arrays
Remember!
Key Points to Remember:
- Linear search checks each item sequentially from start to finish - simple but slow, works on any data organisation
- Binary search repeatedly halves the search space - very efficient (maximum 20 comparisons for 1 million items) but requires sorted data
- Binary tree search navigates through a tree structure by comparing values and moving left or right - efficient and flexible for dynamic data
- The choice of algorithm depends on dataset size, data organisation, and how frequently data changes
- Efficiency matters more as datasets grow larger - the difference between checking 1 million items versus 20 items is enormous!