How is text stored in binary, and why was Unicode needed when ASCII already existed?
How text is represented using character sets, the ASCII character set, the need for and nature of Unicode, and how a character maps to a binary code.
An Eduqas GCSE Computer Science answer on how text is stored in binary using character sets, the 7-bit ASCII set, why Unicode was needed to support many languages, and how a character maps to a binary code.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Eduqas wants you to explain how text is stored in binary using a character set, describe ASCII, and explain why Unicode was needed when ASCII already existed. The marks come from the idea that each character has a unique binary code and from the contrast between ASCII's small range and Unicode's huge one.
Character sets
ASCII
Storing the alphabet in order has a useful side effect: you can find a letter's code from a known one. Since A = 65, the letter E (four letters after A) is 65 + 4 = 69. The gap between a capital and its lowercase version is always 32 (for example A = 65 and a = 97).
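The sequential-code idea can be checked with a minimal Python sketch. It assumes Python's built-in `ord` (character to code) and `chr` (code to character); the helper name `letter_after` is made up for this illustration and is not part of the specification.

```python
def letter_after(known: str, steps: int) -> str:
    """Return the character whose code is `steps` higher than the code of `known`."""
    return chr(ord(known) + steps)


# Gap between a capital letter and its lowercase version.
CASE_GAP = ord("a") - ord("A")

print(ord("A"))              # 65
print(letter_after("A", 4))  # E
print(ord("E"))              # 69
print(CASE_GAP)              # 32
```

The same gap applies to every letter, so adding 32 to the code of any capital gives the code of its lowercase form.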
Unicode
Try this
Q1. State how many bits standard ASCII uses per character and how many characters that allows. [2 marks]
- Cue. 7 bits, allowing 128 characters.
Q2. The ASCII code for is . State the ASCII code for . [1 mark]
- Cue. (two after ).
Q3. Give one reason Unicode was created. [1 mark]
- Cue. ASCII could not represent the characters of other languages, so Unicode provides far more codes.
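As a rough check on the bit-count idea, the number of different codes available with n bits is 2 to the power n. The sketch below assumes Python's built-in `format` and `ord`; the function names `codes_available` and `to_binary` are illustrative only.

```python
def codes_available(bits: int) -> int:
    """Number of different binary codes that can be made with `bits` bits."""
    return 2 ** bits


def to_binary(char: str, bits: int = 7) -> str:
    """Return the code of `char` as a binary string padded to `bits` digits."""
    return format(ord(char), f"0{bits}b")


print(codes_available(7))  # 128
print(to_binary("A"))      # 1000001
print(to_binary("a"))      # 1100001
```

The two binary strings differ in only one position (the bit worth 32), which is the capital-to-lowercase gap written in binary.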
Exam-style practice questions
Practice questions written in the style of WJEC Eduqas exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Eduqas Component 1, 2022 [4 marks]
Explain what is meant by a character set, and explain why Unicode is used in addition to ASCII.
Worked answer:
Character set (up to 2 marks): a character set is an agreed mapping that gives each character (letter, digit, symbol or control code) a unique binary code, so the same text is stored and read the same way on different systems.
Why Unicode (up to 2 marks): ASCII uses only 7 bits, giving 128 characters, which is enough for English letters, digits and basic symbols but not for the many thousands of characters in other languages and scripts. Unicode uses more bits per character, so it can represent characters from almost every writing system, plus emoji, while keeping the first 128 codes the same as ASCII.
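The contrast between the two ranges can be seen in a short Python sketch. It assumes that Python's `ord` returns a character's Unicode code, which is how Python 3 strings behave; the helper name `fits_in_ascii` is hypothetical.

```python
def fits_in_ascii(text: str) -> bool:
    """True if every character in `text` has a code in the 7-bit range 0 to 127."""
    return all(ord(ch) < 128 for ch in text)


# The first 128 Unicode codes match ASCII, so A is still 65.
print(ord("A"))       # 65

# Characters outside English need codes beyond the ASCII range.
# Escapes are used so the example does not depend on file encoding.
print(ord("\u00e9"))  # 233  (e with an acute accent)
print(ord("\u20ac"))  # 8364 (euro sign)

print(fits_in_ascii("Hello"))      # True
print(fits_in_ascii("caf\u00e9"))  # False
```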
Markers reward the "unique binary code per character" idea and the "ASCII has too few codes for other languages" point.
Eduqas Component 1, 2023 [3 marks]
The ASCII code for the capital letter A is 65. State the ASCII code for the capital letter D, the code for a lowercase a, and explain how you worked out the code for D.
Worked answer:
Capital D: the capital letters are stored in order, so D is three after A: 65 + 3 = 68 (1 mark plus 1 for the reasoning that letters are sequential).
Lowercase a: lowercase letters start at 97, so a is 97 (1 mark).
Explanation: because ASCII stores the alphabet in consecutive codes, you can add the letter's position difference to a known code, which is also why sorting by ASCII code puts letters in alphabetical order.
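The sorting point in the explanation can be illustrated with a small Python sketch, assuming the built-in `sorted` with `ord` as the sort key. The list contents are made up for the example.

```python
letters = ["D", "a", "B", "A", "C"]

# Sort by character code rather than by any dictionary rule.
by_code = sorted(letters, key=ord)

print(by_code)                   # ['A', 'B', 'C', 'D', 'a']
print([ord(ch) for ch in by_code])  # [65, 66, 67, 68, 97]
```

Capital letters come out in alphabetical order because their codes are consecutive. The lowercase a lands last because lowercase codes start at 97, after all the capitals.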
Sources & how we know this
- WJEC Eduqas GCSE Computer Science specification (from 2016) — Eduqas (2020)