Representing Characters (OCR GCSE Computer Science): Revision Notes
Representing Characters
Every time you press a key on a keyboard, a binary code representing that character is sent to the computer. Characters are stored in binary, and each character has a unique code defined by a system called a character set.
Character Sets
A character set is a collection of characters that can be represented by a computer system using binary codes. The two most common character sets are ASCII and Unicode.
| Character Set | Description |
|---|---|
| ASCII | ASCII uses 7 bits to represent characters, which allows for 128 unique characters (codes 0 to 127). This includes standard English letters, digits, punctuation, and some symbols. |
| Extended ASCII | Extended ASCII uses 8 bits, allowing for 256 characters. This includes additional characters for symbols and other languages. |
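| Unicode | Unicode uses more bits per character than ASCII. A 16-bit code provides 65,536 possible characters, and modern Unicode encodings can use up to 32 bits per character to cover even more. It is used to represent a much wider range of characters, including those from different languages and alphabets. |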
How Characters are Represented in Binary
- Characters are stored as binary codes, which are sequences of 1s and 0s.
- For example, in ASCII, the character 'A' is represented by the binary code 01000001.
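The example above can be checked with a short Python sketch. The function name `char_to_binary` is a hypothetical helper chosen for illustration; it relies on Python's built-in `ord()`, which returns the numeric code of a character.

```python
def char_to_binary(ch: str, bits: int = 8) -> str:
    """Return the character's code as a zero-padded binary string."""
    code = ord(ch)  # numeric code of the character, e.g. 65 for 'A'
    return format(code, f"0{bits}b")  # pad with leading zeros to `bits` digits


print(char_to_binary("A"))  # 01000001
```

Passing `bits=7` would show the same code in 7-bit ASCII form, without the leading zero.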
The Relationship Between Bits and Character Sets
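The number of bits used to represent each character determines how many different characters can be represented. Each extra bit doubles the number of available codes, so n bits give 2^n unique codes.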
- ASCII uses 7 bits, allowing for 128 characters.
- Extended ASCII uses 8 bits, allowing for 256 characters.
- Unicode is often described as using 16 bits, allowing for 65,536 characters (modern Unicode can use more bits to represent even more).
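The figures in this list all come from the same calculation: the number of unique codes is 2 raised to the power of the number of bits. A minimal sketch, where `characters_available` is a hypothetical name used only for this example:

```python
def characters_available(bits: int) -> int:
    """Number of unique codes that fit in the given number of bits."""
    return 2 ** bits


for name, bits in [("ASCII", 7), ("Extended ASCII", 8), ("16-bit Unicode", 16)]:
    print(f"{name}: {bits} bits -> {characters_available(bits):,} characters")
```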
Problem with ASCII and the Use of Unicode
- ASCII can only represent 128 characters (or 256 with extended ASCII), which is not enough to represent all the different languages and symbols used worldwide.
- Unicode solves this problem by using more bits per character (16 or more), allowing for a much larger range of characters and making it suitable for representing characters from the world's languages.
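To see why 7 or 8 bits is not enough, the sketch below checks whether a character's code fits in ASCII and how many bits its code needs. The helper names `fits_in_ascii` and `bits_needed` are assumptions made for this illustration; Python strings are Unicode, so `ord()` returns the Unicode code of each character.

```python
def fits_in_ascii(ch: str) -> bool:
    """True if the character's code fits in 7-bit ASCII (0 to 127)."""
    return ord(ch) < 128


def bits_needed(ch: str) -> int:
    """Smallest number of bits that can hold the character's code."""
    return max(1, ord(ch).bit_length())


# "\u00e9" is é and "\u20ac" is the euro sign
for ch in ["A", "\u00e9", "\u20ac"]:
    print(ch, ord(ch), fits_in_ascii(ch), bits_needed(ch))
```

The euro sign has the code 8364, which is too large for 8 bits, so it cannot be stored using ASCII or extended ASCII.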
Binary Representation in Exams
- In GCSE exams, you will typically use 8-bit ASCII to represent characters in binary.
Logical Ordering of Characters
Character sets like ASCII and Unicode are logically ordered. This means that the binary code for 'B' will be one more than the binary code for 'A'.
- Example: 'A' in ASCII is 01000001 (65 in denary), and 'B' is 01000010 (66 in denary).
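Because the codes are in order, the code for a letter can be worked out from a letter that is already known. A small sketch of this idea, where `next_character` is a hypothetical helper name:

```python
def next_character(ch: str) -> str:
    """Return the character whose code is one more than the given character's."""
    return chr(ord(ch) + 1)  # chr() turns a numeric code back into a character


print(ord("A"), ord("B"))   # 65 66
print(next_character("A"))  # B
```

The same reasoning works in an exam: if 'A' is 65, then 'D' is three places later, so its code is 68.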
ASCII and Pure Binary
- Numbers as characters have different binary codes from their pure binary representation. For example, the character '1' in ASCII is represented by the binary code 00110001, which is 49 in denary. In contrast, the number 1 in pure binary is represented as 00000001.
- Arithmetic cannot be performed directly on ASCII characters that represent numbers. To perform calculations, these characters must first be converted into their pure binary form.
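The difference between a digit character and its numeric value can be sketched as follows. The function `digit_char_to_int` is a hypothetical helper for illustration; it uses the fact that the digit characters '0' to '9' have consecutive codes, so subtracting the code of '0' gives the numeric value.

```python
def digit_char_to_int(ch: str) -> int:
    """Convert a single digit character ('0' to '9') to its numeric value."""
    if not ("0" <= ch <= "9"):
        raise ValueError(f"not a digit character: {ch!r}")
    return ord(ch) - ord("0")


print(format(ord("1"), "08b"))                # 00110001 (the character '1')
print(format(digit_char_to_int("1"), "08b"))  # 00000001 (the number 1)

print("1" + "2")                                      # 12 (characters are joined, not added)
print(digit_char_to_int("1") + digit_char_to_int("2"))  # 3
```

In everyday Python the built-in `int("1")` does the same conversion; the subtraction is shown here only to make the link to the ASCII codes visible.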
Key Points to Remember
- Characters are represented in binary using character sets like ASCII or Unicode.
- The number of bits determines how many characters can be represented.
- ASCII is limited to 128 or 256 characters, while Unicode can represent over 65,000 characters.