Encryption (AQA A-Level Computer Science): Revision Notes
Encryption
Introduction to encryption
When data is transmitted across networks or stored on computer systems, there is always a risk that unauthorised individuals could intercept or access it. This is particularly concerning when the data contains sensitive information such as banking details, medical records, or government security information.

Encryption is a method of transforming readable information into a scrambled format that cannot be understood without knowing how to decrypt it. The reverse process, decryption, converts the scrambled data back into its original readable form. Together, these processes ensure that data remains secure during transmission and storage.
Encryption refers to the process of converting plaintext (data in a format that people can understand directly) into ciphertext (scrambled data that appears meaningless). This transformation uses an algorithm and typically requires a key.
Decryption is the reverse operation - it converts ciphertext back into plaintext so the authorised recipient can read the original message.
Here are some real-world scenarios where encryption is essential:
- Online banking transactions must be encrypted to prevent theft and fraud when account details and payment information travel across the internet
- Health information systems need encryption because they contain highly sensitive personal and medical data
- Government security systems rely on encryption as breaches could have serious implications for national security
The use of encryption has deep roots in military and government security. Many encryption techniques were originally developed during wartime conflicts. With the widespread adoption of the internet in recent years, encryption has become a vital mechanism for protecting data transmitted across both local and wide area networks.
Encryption basics
All encryption systems work by converting plaintext into ciphertext. Let's examine how this process works in practice.
Plaintext is the original data in a format that can be understood immediately. For instance, the message "BROADSWORD CALLING DANNY BOY" is plaintext because anyone reading it can comprehend its meaning straight away.
Ciphertext is data that has undergone encryption. After applying an encryption algorithm to our example, it becomes "DTQCFUYQTF ECNNKPI FCPPA DQA" - a meaningless string of characters to anyone who doesn't know how to decrypt it.
The encryption process follows these steps:
- The message is written in plaintext format
- An encryption algorithm transforms it into ciphertext using a specific method and key
- The encrypted message is transmitted or stored
- The recipient receives the ciphertext
- The recipient decrypts the message back to plaintext using their knowledge of the algorithm and key
A crucial point to understand is that encryption doesn't prevent data from being intercepted - it simply makes intercepted data meaningless to anyone who doesn't possess the key and knowledge of the encryption method used.
Keys play a vital role in encryption. A key is a piece of data used during the encryption and decryption process that determines exactly how the plaintext is transformed. Keys can range from simple (like a number indicating how many positions to shift letters) to complex (like a long random string of characters).
Throughout history, people have attempted to crack encrypted codes - to work out what algorithm has been used and discover the key. Famous examples include the code-breakers who worked on the German Enigma machine during World War II. Polish and British mathematicians successfully cracked this code, and many historians believe this achievement significantly contributed to the Allied victory by allowing them to intercept and read German war communications.
The Caesar cipher
Named after Julius Caesar, the Roman general and statesman who is reported to have used it for his private correspondence, the Caesar cipher represents one of the earliest and simplest forms of encryption. It is an example of a substitution cipher - a method where each character in the plaintext is replaced with a different character to create the ciphertext.
Simple shift cipher
The most basic form of Caesar cipher works by shifting each letter of the alphabet by a fixed number of positions. Here's an example with a two-letter shift:
Worked Example: Two-Letter Shift Cipher
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B |
In this example, the letter A in plaintext becomes C in ciphertext, B becomes D, and so on. Using this two-letter shift, our message "BROADSWORD CALLING DANNY BOY" becomes "DTQCFUYQTF ECNNKPI FCPPA DQA".
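The shift can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `caesar_shift` is an assumption made for this example, and it only handles the letters A to Z, leaving spaces untouched.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter A-Z by `shift` places, wrapping round from Z to A."""
    result = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            position = (ord(ch) - ord("A") + shift) % 26
            result.append(chr(ord("A") + position))
        else:
            result.append(ch)  # leave spaces and punctuation unchanged
    return "".join(result)

ciphertext = caesar_shift("BROADSWORD CALLING DANNY BOY", 2)
print(ciphertext)
print(caesar_shift(ciphertext, -2))  # shifting back by two recovers the plaintext
```

The modulo operation (`% 26`) is what makes Y wrap round to A and Z wrap round to B.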
Random substitution
A variation on this method uses random substitution rather than a simple shift. Each letter could map to any other letter without following a pattern:
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| O | Z | P | C | Y | D | B | Q | X | K | L | A | V | W | R | I | S | M | J | E | G | N | F | T | U | H |
With this random substitution, "BROADSWORD CALLING DANNY BOY" becomes "ZMROCJFRMC POAAXWB COWWU ZRU".
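A look-up table like this maps naturally onto Python's built-in `str.maketrans` and `str.translate`. The sketch below copies the bottom row of the table into a string; the variable names are assumptions chosen for this example.

```python
import string

PLAIN = string.ascii_uppercase
CIPHER = "OZPCYDBQXKLAVWRISMJEGNFTUH"  # bottom row of the substitution table

encrypt_table = str.maketrans(PLAIN, CIPHER)
decrypt_table = str.maketrans(CIPHER, PLAIN)  # the same table read in reverse

message = "BROADSWORD CALLING DANNY BOY"
ciphertext = message.translate(encrypt_table)
print(ciphertext)
print(ciphertext.translate(decrypt_table))
```

Because there is no pattern to the mapping, the receiver needs the whole 26-letter table, not just a single number.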
Keyword-based substitution
Another method adds complexity by incorporating a keyword. Consider using the keyword "BEESWAX". First, remove any repeated letters from the keyword (leaving "BESWAX"), then place these letters at the start of the alphabet substitution, followed by the remaining letters in alphabetical order:
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B | E | S | W | A | X | C | D | F | G | H | I | J | K | L | M | N | O | P | Q | R | T | U | V | Y | Z |
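Building the keyword alphabet is mechanical, so it can be sketched as a short function. The name `keyword_alphabet` is an assumption for this example, and the sketch assumes the keyword contains only letters.

```python
import string

def keyword_alphabet(keyword: str) -> str:
    """Keyword letters first (repeats dropped), then the unused letters in A-Z order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

cipher_alphabet = keyword_alphabet("BEESWAX")
print(cipher_alphabet)
```

Passing "BEESWAX" or "BESWAX" gives the same 26-letter alphabet, because repeated letters are skipped either way.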
Why Caesar ciphers are vulnerable
Caesar ciphers, regardless of which variation is used, are relatively easy to crack. This is true even without knowing the key. Simple shift ciphers can be worked out by trying all possible shifts (there are only 25 possibilities in the English alphabet). Random substitution and keyword-based methods can be defeated using frequency analysis, which we'll explore next.
To decrypt a message, the receiver needs to know which key is being used. For a two-letter shift, they simply shift back by two positions. For a keyword-based cipher, they need to know the keyword. For random substitution, they would need the entire look-up table showing which letter maps to which.
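To show how small the key space of a simple shift cipher is, the sketch below tries all 25 shifts on a fragment of ciphertext. The helper name `shift_back` is an assumption for this example; a person (or a program) then only has to spot which of the 25 lines is readable.

```python
def shift_back(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` places on the letters A-Z."""
    out = []
    for ch in ciphertext:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
        else:
            out.append(ch)  # keep spaces as they are
    return "".join(out)

ciphertext = "ECNNKPI FCPPA DQA"
candidates = {shift: shift_back(ciphertext, shift) for shift in range(1, 26)}
for shift, text in candidates.items():
    print(shift, text)
```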
Polyalphabetic ciphers
A more sophisticated approach involves using multiple substitution alphabets - this is called a polyalphabetic cipher. Rather than using just one set of substitutions, the encryption uses several different alphabets in sequence.
For instance, you might use four different alphabets in rotation:
- The first letter of plaintext is encrypted using alphabet 1, producing 'W'
- The second letter is encrypted using alphabet 2, producing 'S'
- The third letter is encrypted using alphabet 3, producing 'P'
- The fourth letter is encrypted using alphabet 4, producing 'D'
All four alphabets would be needed to decrypt the message. This concept dates back to the 15th century and is known as the Alberti cipher. The same principle formed the basis of the Enigma machine, which used several randomised alphabets in rotation.
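The idea of rotating through several alphabets can be sketched as follows. As a simplifying assumption, each "alphabet" here is just a different Caesar shift rather than a randomised alphabet, and the shift values and function names are hypothetical.

```python
def poly_encrypt(text: str, shifts: list[int]) -> str:
    """Encrypt each letter with the next shift in the list, cycling round."""
    out = []
    letter_index = 0
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = shifts[letter_index % len(shifts)]
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            letter_index += 1  # only letters advance the rotation
        else:
            out.append(ch)
    return "".join(out)

def poly_decrypt(text: str, shifts: list[int]) -> str:
    """Decrypt by applying the same shifts in the opposite direction."""
    return poly_encrypt(text, [-s for s in shifts])

print(poly_encrypt("AAAA", [1, 2, 3, 4]))
```

The same plaintext letter now produces a different ciphertext letter depending on its position, which is what weakens simple letter counting.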
Frequency analysis
The Caesar cipher is particularly vulnerable to an attack method called frequency analysis. This technique exploits the fact that in any language, certain letters appear more frequently than others.
In English writing, some letters occur much more often than others. Certain combinations of letters also appear regularly. By examining the frequency of letters in a piece of ciphertext and comparing them to known patterns in English, it becomes possible to work out which plaintext letter has been substituted for which ciphertext letter.

In the typical frequency distribution of letters in English text, 'e' is by far the most common letter, appearing approximately 12.7% of the time. The letter 't' is the second most frequent, whilst letters like 'j', 'q', 'x', and 'z' are rarely used.
Using frequency analysis to crack a cipher
Worked Example: Cracking a Caesar Cipher with Frequency Analysis
To crack a Caesar cipher using frequency analysis, follow this process:
- Take a large sample of the ciphertext
- Count how many times each letter appears
- Calculate the relative frequency of each letter
- Compare the frequencies to the known pattern for English
For example, if you find that the letter 'p' appears most frequently in the ciphertext, you can reasonably assume that p = e (since 'e' is the most common letter in English plaintext). By matching other high-frequency letters, you can gradually work out the entire substitution pattern.
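The counting steps above can be sketched with `collections.Counter`. The sample ciphertext is a hypothetical one made for this example (the message "MEET ME HERE AT SEVEN" shifted by 11 places), and the function names are assumptions. On a sample this short the guess could easily be wrong; it works here only because the message was chosen to contain many Es.

```python
from collections import Counter

def letter_frequencies(ciphertext: str) -> dict[str, float]:
    """Relative frequency of each letter, most common first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    return {letter: count / len(letters) for letter, count in counts.most_common()}

def guess_shift(ciphertext: str) -> int:
    """Assume the most common ciphertext letter stands for plaintext E."""
    most_common = next(iter(letter_frequencies(ciphertext)))
    return (ord(most_common) - ord("E")) % 26

sample = "XPPE XP SPCP LE DPGPY"
print(letter_frequencies(sample))
print(guess_shift(sample))
```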
With modern computing power, frequency analysis can be performed almost instantaneously. This is why more sophisticated encryption methods have been developed to protect sensitive data.
Transposition ciphers
Unlike substitution ciphers which replace letters with different letters, transposition ciphers rearrange the letters of the message to form an anagram. The letters themselves don't change - only their positions. To decrypt the message, you must rearrange the letters according to a specific pattern defined by the key.
Railfence cipher
The railfence cipher is a type of transposition cipher where the message is split across several lines and then read off in a different order.
Worked Example: Two-Line Railfence Cipher
Take the message "BROADSWORD CALLING" and write it in a zigzag pattern across two rows:
| B |   | O |   | D |   | W |   | R |   | C |   | L |   | I |   | G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|   | R |   | A |   | S |   | O |   | D |   | A |   | L |   | N |   |
Now read the message line by line: "BODWRCLIG" followed by "RASODALN", giving the ciphertext "BODWRCLIGRASODALN".
To decrypt this message, you would need to know that the key indicates a two-line split. You would then reverse the process by writing the letters back into two rows and reading them in zigzag order.
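For two rows, the zigzag simply alternates between the top and bottom row, so string slicing is enough. This is a sketch under the assumption that spaces are removed before encryption; the function names are hypothetical.

```python
def rail2_encrypt(message: str) -> str:
    """Top row takes letters 1, 3, 5, ...; bottom row takes letters 2, 4, 6, ..."""
    letters = message.replace(" ", "")
    return letters[0::2] + letters[1::2]

def rail2_decrypt(ciphertext: str) -> str:
    """Split the ciphertext back into two rows and interleave them."""
    top_length = (len(ciphertext) + 1) // 2  # top row is one longer for odd lengths
    top, bottom = ciphertext[:top_length], ciphertext[top_length:]
    out = []
    for i in range(len(ciphertext)):
        out.append(top[i // 2] if i % 2 == 0 else bottom[i // 2])
    return "".join(out)

print(rail2_encrypt("BROADSWORD CALLING"))
print(rail2_decrypt("BODWRCLIGRASODALN"))
```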
You can increase complexity by using more lines. Here's the same message split across three lines:
| B |   |   |   | D |   |   |   | R |   |   |   | L |   |   |   | G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|   | R |   | A |   | S |   | O |   | D |   | A |   | L |   | N |   |
|   |   | O |   |   |   | W |   |   |   | C |   |   |   | I |   |   |
Reading line by line produces "BDRLG" + "RASODALN" + "OWCI", giving the ciphertext "BDRLGRASODALNOWCI".
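The same zigzag can be generalised to any number of lines. The sketch below walks down and up the rows one letter at a time; the function name `rail_fence_encrypt` is an assumption for this example, and spaces are dropped before encryption.

```python
def rail_fence_encrypt(message: str, rails: int) -> str:
    """Write the letters in a zigzag across `rails` rows, then read row by row."""
    letters = message.replace(" ", "")
    rows = [[] for _ in range(rails)]
    row, step = 0, 1
    for ch in letters:
        rows[row].append(ch)
        if rails > 1:
            if row == 0:
                step = 1            # at the top row: start moving down
            elif row == rails - 1:
                step = -1           # at the bottom row: start moving up
            row += step
    return "".join("".join(r) for r in rows)

print(rail_fence_encrypt("BROADSWORD CALLING", 3))
```

Here the number of rails is the key: the receiver must know it to rebuild the zigzag.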
Route cipher
A route cipher is another type of transposition cipher that arranges the message in a grid and then reads it off using a specific route pattern.
Worked Example: Route Cipher
Write the message "BROADSWORD CALLING" into a grid of 6 rows and 3 columns, filling each column from top to bottom:
| B | W | L |
|---|---|---|
| R | O | L |
| O | R | I |
| A | D | N |
| D | C | G |
| S | A | A |
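The message was written down the columns, so the ciphertext is produced by reading across the rows from top to bottom: "BWL ROL ORI ADN DCG SAA" (with spacing added for clarity), or "BWLROLORIADNDCGSAA" without spaces. To decrypt, the recipient writes the ciphertext back into the grid row by row and then reads down the columns from left to right.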
You can add null characters (meaningless padding) to fill empty cells in your grid. In this example, the letter 'A' has been added to the bottom right cell to complete the grid.
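A grid-based route can be sketched as below, under the assumption that the message is written down the columns of the grid (as in the table) and the ciphertext is read across the rows. The function names and the choice of 'A' as the padding character follow this example only.

```python
def route_encrypt(message: str, rows: int, cols: int, pad: str = "A") -> str:
    """Fill the grid column by column, pad the empty cells, read row by row."""
    letters = message.replace(" ", "").ljust(rows * cols, pad)
    # The letter at index i sits in row i % rows, column i // rows.
    grid = [[letters[c * rows + r] for c in range(cols)] for r in range(rows)]
    return "".join("".join(row) for row in grid)

def route_decrypt(ciphertext: str, rows: int, cols: int) -> str:
    """Write the ciphertext back row by row, then read down the columns."""
    grid = [ciphertext[r * cols:(r + 1) * cols] for r in range(rows)]
    return "".join(grid[r][c] for c in range(cols) for r in range(rows))

ciphertext = route_encrypt("BROADSWORD CALLING", 6, 3)
print(ciphertext)
print(route_decrypt(ciphertext, 6, 3))  # the padding 'A' remains at the end
```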
Combined encryption methods
Transposition ciphers become significantly more secure when combined with substitution ciphers. This creates a two-stage encryption process that requires two different keys to crack.

For example, the message "BROADSWORD CALLING" might be encrypted in two stages:
- First, a 6 × 3 transposition cipher produces "BWLROLORIADNDCGSAA"
- Then, a substitution cipher using the keyword "HEISENBERG" produces the final ciphertext "EWFPLFLPAHSKSIRQHH"
Although more complex than a simple substitution cipher, this combined approach can still be cracked without knowing the key by using a combination of frequency analysis and anagram-solving techniques.
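The two stages can be chained in code. This sketch assumes the grid is filled down its columns and read across its rows, and that the keyword alphabet is built the same way as in the keyword-based substitution section; all function names are hypothetical.

```python
import string

def route_encrypt(message: str, rows: int, cols: int, pad: str = "A") -> str:
    """Stage 1: write down the columns of the grid, read across the rows."""
    letters = message.replace(" ", "").ljust(rows * cols, pad)
    return "".join(letters[c * rows + r] for r in range(rows) for c in range(cols))

def keyword_alphabet(keyword: str) -> str:
    """Keyword letters (repeats dropped) followed by the unused letters."""
    ordered = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in ordered:
            ordered.append(ch)
    return "".join(ordered)

def combined_encrypt(message: str, rows: int, cols: int, keyword: str) -> str:
    """Stage 2: substitute every letter of the transposed text."""
    transposed = route_encrypt(message, rows, cols)
    table = str.maketrans(string.ascii_uppercase, keyword_alphabet(keyword))
    return transposed.translate(table)

print(combined_encrypt("BROADSWORD CALLING", 6, 3, "HEISENBERG"))
```

The receiver needs both keys (the grid size and the keyword) and must undo the stages in reverse order.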
Vernam cipher
Gilbert Vernam invented this cipher in 1917 as a method of securing data whilst it was being transmitted between teleprinter (telex) machines. The Vernam cipher represents a class of encryption techniques known as one-time pad methods.
A one-time pad is a key that should be as long as the plaintext message being encoded. This key is used once to encrypt a message and then destroyed. The critical security feature is that the key must never be used again for encryption, although it will be used once more to decrypt the received ciphertext.
How the Vernam cipher works
The encryption process combines each character in the plaintext with the corresponding character in the key using a binary operation:
Worked Example: Vernam Cipher Encryption
Step 1: Generate the key
Create a completely random sequence of characters that is as long as the plaintext message. For maximum security, each key should only be used once.
To encrypt "BROADSWORD", you might generate the random key "HELKKJVTUI". Here's how the plaintext aligns with the key:
| Plaintext message | B | R | O | A | D | S | W | O | R | D |
|---|---|---|---|---|---|---|---|---|---|---|
| Key | H | E | L | K | K | J | V | T | U | I |
Step 2: Convert to binary
Each character in both the plaintext and the key is converted into a binary code. Originally, the Vernam cipher used Baudot code, a five-bit binary character code that predates ASCII and Unicode. Modern implementations typically use ASCII or Unicode instead.
For instance, using Baudot code:
- B = 11001
- H = 10100
Step 3: Apply XOR operation
A logical XOR (exclusive OR) operation is performed on the binary representations. XOR is a bitwise operation that produces a 1 only if the two bits being compared are different:
| Baudot code for plaintext B | 1 | 1 | 0 | 0 | 1 |
|---|---|---|---|---|---|
| Baudot code for one-time pad H | 1 | 0 | 1 | 0 | 0 |
| XOR to produce ciphertext | 0 | 1 | 1 | 0 | 1 |
The Baudot table then converts the result (01101) back to a character, which is F. Therefore, the first character of the ciphertext is F. This process repeats for each letter in the plaintext.
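The XOR for this single character can be checked in Python, where `^` is the bitwise XOR operator. The two bit patterns are the Baudot codes quoted above; the variable names are assumptions for this example.

```python
plaintext_b = 0b11001  # Baudot code for B
key_h = 0b10100        # Baudot code for H

cipher_bits = plaintext_b ^ key_h       # bitwise XOR of the two 5-bit codes
print(format(cipher_bits, "05b"))       # shown as a 5-digit binary string
```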
Step 4: Decrypt the message
When the receiver obtains the ciphertext, they can decrypt it by performing an XOR operation between the ciphertext and the key. Because XOR is reversible, this recovers the original plaintext:

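| Baudot code for ciphertext F | 0 | 1 | 1 | 0 | 1 |
|---|---|---|---|---|---|
| Baudot code for one-time pad H | 1 | 0 | 1 | 0 | 0 |
| XOR to produce plaintext | 1 | 1 | 0 | 0 | 1 |
The table above shows how ciphertext F (01101) XORed with key H (10100) produces the original plaintext B (11001).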
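Because XOR undoes itself, one function can serve for both encryption and decryption. The sketch below makes two assumptions that differ from the worked example: it uses ASCII bytes rather than Baudot code, and it draws the key from Python's `secrets` module, which relies on the operating system's random source and is only a stand-in for a truly random one-time pad.

```python
import secrets

def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each byte of the data with the matching byte of the key."""
    if len(key) != len(data):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = "BROADSWORD".encode("ascii")
key = secrets.token_bytes(len(message))  # use once, then discard
ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)      # the same operation decrypts
print(recovered.decode("ascii"))
```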
Why the Vernam cipher is mathematically secure
Once the entire message has been encrypted and decrypted, the key is destroyed and a new random key is generated for the next message. As long as the key is completely random, kept secret, and only used once, it becomes mathematically impossible to crack the code.
This is because the key creates entirely random outcomes. If you don't know the key, ciphertext letter 'A' could equally represent plaintext 'H', 'L', or any other letter. Even if the letter 'A' appears multiple times in the ciphertext, each occurrence will likely correspond to a different plaintext letter. Trying every possible key does not help an attacker: it simply produces every possible plaintext of the same length, with no way to identify which one is correct.
This property is known as perfect security - no amount of time or ciphertext can enable a cryptographer to crack the code.
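The reason no amount of effort helps can be demonstrated directly: for any message of the same length, there is a key that turns the ciphertext into it. The sketch below uses ASCII bytes, and the decoy message "SURRENDERS" is a hypothetical choice for this example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

ciphertext = xor_bytes(b"BROADSWORD", b"HELKKJVTUI")

# Pick any other 10-letter message and work out the key that would "explain" it.
decoy = b"SURRENDERS"
decoy_key = xor_bytes(ciphertext, decoy)

print(xor_bytes(ciphertext, b"HELKKJVTUI"))  # decrypts with the real key
print(xor_bytes(ciphertext, decoy_key))      # decrypts to the decoy with the other key
```

An attacker who tries every key sees both results, along with every other 10-character message, and has nothing to distinguish the real one.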
Practical challenges
Whilst theoretically unbreakable, achieving complete security with a Vernam cipher is difficult in practice:
- Generating truly random numbers is complex because any algorithm used will likely contain some element of predictability
- Key exchange presents a security challenge - letting the receiver know the key is difficult because this information itself could be intercepted
- Authentication is impossible - there is no way to verify the sender's identity, so if the key was intercepted, messages could be sent from unauthorised sources
Modern systems typically use ASCII or Unicode to convert characters to binary values rather than the original Baudot code.
Computational security
The Vernam cipher with a true one-time pad is the only cipher considered to be 100% mathematically secure. All other ciphers can theoretically be cracked given enough time and computing resources. This brings us to the concept of computational security or computational hardness.
A cipher that is computationally secure is one that is theoretically breakable but not with current technology in a timeframe that would be useful. This recognises the reality that whilst most encryption can theoretically be cracked, in practice it will be secure enough to withstand most threats.
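A rough calculation shows what "not in a useful timeframe" can mean. The figure of one trillion keys tested per second is a hypothetical assumption for this example, not a measurement of any real system, and the function name is also made up.

```python
def years_to_try_all_keys(key_bits: int, keys_per_second: float) -> float:
    """Time to test every key of the given length, in years."""
    seconds = 2 ** key_bits / keys_per_second
    return seconds / (60 * 60 * 24 * 365)

# Assumed attacker speed: 1e12 keys per second (hypothetical).
for bits in (40, 56, 128):
    print(bits, "bit key:", years_to_try_all_keys(bits, 1e12), "years")
```

Each extra bit doubles the number of keys, so a modest increase in key length moves an attack from seconds to far longer than any data needs protecting.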
Balancing security with practicality
When designing encryption algorithms, programmers need to understand that some levels of encryption are harder to crack than others. The level of security used should be commensurate with the level of risk associated with the data being protected.
For instance, military data concerning troop movements would warrant a much more sophisticated level of encryption than a file used to store a school project. The more sensitive the data, the stronger the encryption required.
Methods for cracking encryption
Computational security means that cryptographers must be aware of the various ways their encryption could be attacked. Beyond frequency analysis, several other cracking methods exist:
- Identifying commonly used techniques: Many ciphers are based on well-known methods like substitution or transposition. Experienced cryptographers can recognise patterns in data that has been encrypted using these standard methods, giving them a starting point for cracking the code
- Reverse engineering: This involves working backwards step by step until you understand how something has been constructed. By systematically analysing the ciphertext and trying different approaches, attackers can sometimes deduce the encryption method
- Dictionary attacks: This method uses a dictionary containing common words and phrases. After each attempt to decrypt the text, the result is compared to dictionary entries to see if it matches. If a match is found, the decryption key has been discovered
- Brute force: This approach is similar to a dictionary attack but more comprehensive and time-consuming. Rather than just checking common words and phrases, a brute force attack works through every possible key or permutation of characters, testing each one until the ciphertext decrypts to something readable
Modern computing power means that simple encryption methods can often be cracked very quickly, which is why contemporary systems use complex algorithms that would take an impractical amount of time to break through brute force methods.
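Brute force and a dictionary check can be combined to crack a shift cipher with no human involved. In this sketch the word list is a tiny hypothetical one chosen for the example; a real attack would use a much larger dictionary. The function names are assumptions.

```python
COMMON_WORDS = {"THE", "AND", "CALLING", "BOY", "MEET", "AT"}  # illustrative word list

def shift_back(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` places on the letters A-Z."""
    return "".join(
        chr((ord(ch) - ord("A") - shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )

def crack_caesar(ciphertext: str) -> tuple[int, str]:
    """Try every shift and keep the one whose output contains most known words."""
    best_shift, best_text, best_score = 0, ciphertext, -1
    for shift in range(26):
        candidate = shift_back(ciphertext, shift)
        score = sum(word in COMMON_WORDS for word in candidate.split())
        if score > best_score:
            best_shift, best_text, best_score = shift, candidate, score
    return best_shift, best_text

print(crack_caesar("ECNNKPI FCPPA DQA"))
```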
Key Points to Remember:
- Encryption transforms plaintext into ciphertext to protect data during transmission and storage. Only those with the correct key and knowledge of the algorithm can decrypt the message back to plaintext.
- Caesar ciphers use substitution but are vulnerable to frequency analysis because English has predictable letter patterns. The letter 'e' appears most frequently, allowing attackers to match ciphertext patterns to plaintext patterns.
- Transposition ciphers rearrange letters rather than substituting them. Methods like railfence and route ciphers create anagrams that can be more secure, especially when combined with substitution techniques.
- The Vernam cipher is mathematically unbreakable when used correctly with a truly random one-time pad. However, practical challenges with key generation, distribution, and authentication make it difficult to achieve perfect security in real-world applications.
- Computational security recognises practical limits - most encryption methods are theoretically breakable but secure enough in practice because cracking them would require more time and computing resources than attackers have available. Security levels should match the sensitivity of the data being protected.