Run-Length Encoding (RLE) (AQA GCSE Computer Science): Revision Notes
Run-length encoding (RLE)
What is run-length encoding?
Run-Length Encoding (RLE) is a straightforward compression technique that works by identifying sequences of repeated data and storing them more efficiently. Instead of storing the same character or value multiple times in a row, RLE stores the number of repetitions followed by the actual value.
This method is particularly effective when dealing with data that contains lots of consecutive repeated elements, such as simple images with large areas of the same colour or text files with repeated characters.
RLE is one of the oldest and simplest data compression methods, making it an excellent starting point for understanding how compression algorithms work. While modern compression techniques are more sophisticated, RLE's simplicity makes it valuable for specific applications where speed and ease of implementation are priorities.
How RLE works with text
Let's look at a practical example to understand how RLE compresses text data. The key principle is that RLE replaces repetitive sequences with a more compact representation.
Text compression example
Consider the text sequence: AAAABBBBBCCCCCCCC
Instead of storing each letter individually, RLE represents this as:
- 4 A's, 5 B's, 7 C's
- Or in a more compact format:
(4,A) (5,B) (7,C)
Worked Example: Basic RLE Compression
Original text: AAAABBBBBCCCCCCCC
- Count consecutive A's: 4
- Count consecutive B's: 5
- Count consecutive C's: 7
RLE representation: (4,A)(5,B)(7,C)
This transforms 17 individual characters into just 6 pieces of information (3 counts + 3 characters).
Converting to ASCII representation
When computers store this RLE data, they use ASCII codes. The example above becomes:
04 65 06 66 07 67
Here's what each number means:
04= 4 occurrences65= ASCII code for 'A'06= 6 occurrences66= ASCII code for 'B'07= 7 occurrences67= ASCII code for 'C'
This RLE sequence uses just 42 bits compared to the original which would need 119 bits if stored directly in ASCII.
Notice the significant space savings: RLE compressed this example to approximately 35% of the original size. However, this efficiency depends entirely on having long runs of repeated data.
RLE for image compression
RLE is particularly useful for compressing bitmap images, especially those with large areas of the same colour. This makes it ideal for simple graphics, logos, and images with solid colour regions.
Black and white bitmap example
Consider a simple black and white image represented as a row of pixels. If we have a pattern like:
- 4 black pixels, followed by 16 white pixels, followed by 12 black pixels
Using RLE, we can represent this efficiently by:
- Using 1 bit to identify the colour (1 for black, 0 for white)
- Using 7 bits to specify how many pixels of that colour
Worked Example: Image RLE Encoding
Pattern: 4 black, 16 white, 12 black pixels
RLE encoding:
- First group:
10000100(1 for black, 0000100 for 4 pixels) - Second group:
00010000(0 for white, 0010000 for 16 pixels) - Third group:
10001100(1 for black, 0001100 for 12 pixels)
Result: Three 8-bit numbers: 10000100 00010000 10001100
Comparing compression efficiency
For a 32-pixel image:
- Normal storage: 32 bits (1 bit per pixel)
- RLE storage: 24 bits (3 × 8-bit codes)
This represents a compression saving of 25%, but RLE only works well when there are long runs of repeated data.
When does RLE work well?
RLE achieves optimal compression when data exhibits specific characteristics:
Ideal Conditions for RLE:
- Data contains long sequences of repeated values
- Images have large areas of solid colour
- Text contains many repeated characters
- Simple graphics like logos, line drawings, or cartoon-style images
- Data with natural repetitive patterns
When RLE doesn't work well?
Understanding when RLE is ineffective is crucial for choosing the right compression method. RLE can actually make files larger when applied inappropriately.
When RLE Fails: RLE can actually increase file size when:
- Data is mostly random with few repetitions
- Images have lots of detail and colour variation
- The original data doesn't have long runs of the same value
- Photographic images with gradients and complex textures
In these cases, the overhead of storing both the count and the value for each short sequence can make the compressed file larger than the original.
Key advantages of RLE
RLE offers several benefits that make it valuable in specific applications:
- Simple to implement: The algorithm is straightforward to understand and code
- Fast compression and decompression: Minimal processing power required
- Lossless: No data is lost during compression
- Effective for suitable data: Can achieve significant compression ratios with repetitive data
- Real-time capability: Fast enough for real-time applications
- Low memory requirements: Uses minimal system resources
Key Points to Remember:
- RLE works by counting consecutive repeated data - it stores how many times something appears in a row, followed by what that something is
- It's most effective with repetitive data - images with solid colours or text with repeated characters compress well
- It can make files larger if used inappropriately - random data with no repetition will actually increase in size when RLE is applied
- The format is always count-then-value - remember that RLE stores the number of repetitions first, then the actual data value
- RLE is lossless compression - you can always get back exactly the same data you started with