How are text characters and program instructions stored as binary?
Representing characters using ASCII, extended ASCII and Unicode, and the principle that program instructions and all real-world data are ultimately stored as binary.
An SQA Higher Computing Science answer on representing text and instructions in binary, covering ASCII, extended ASCII and Unicode character sets and the principle that all data and instructions are stored as binary.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this key area is asking
The SQA wants you to know how text characters are represented in binary using ASCII, extended ASCII and Unicode, and to understand the principle that all data and all program instructions are ultimately stored as binary. You should be able to compare the character sets and explain the trade-offs.
Everything is binary
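Everything a computer stores, whether data or program instructions, is held as patterns of binary digits. This is the central idea of the key area. The letter 'A', the number 65 and a particular machine instruction can all be the same bit pattern; the context in which the pattern is used decides what it means.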
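A minimal Python sketch of this idea, reading one bit pattern as both a number and a character. The variable names are illustrative only, and what the same pattern would mean as an instruction depends on the processor, so that reading is not shown here.

```python
# One 8-bit pattern, read in two different ways.
pattern = 0b01000001

as_bits = format(pattern, "08b")   # the raw bits, shown as text
as_number = pattern                # read as an unsigned integer
as_character = chr(pattern)        # read as a character code

print(as_bits, as_number, as_character)  # 01000001 65 A
```

Nothing in the stored bits says which reading is intended; the program chooses by how it uses them.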
ASCII
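Standard ASCII uses 7 bits per character, giving 128 codes: enough for the English alphabet, digits and basic punctuation. For example, 'A' is code 65 (1000001 in binary) and 'a' is code 97. Because the codes for letters run in order, you can convert between upper and lower case by arithmetic on the codes - a handy property that follows from characters being stored as numbers.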
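The case-conversion arithmetic can be sketched in Python as follows. The function name `to_upper` is hypothetical; it relies only on the fact that each lowercase ASCII letter sits 32 codes above its uppercase partner (97 - 65 = 32).

```python
def to_upper(letter):
    """Convert one lowercase ASCII letter to upper case using its code."""
    if "a" <= letter <= "z":
        return chr(ord(letter) - 32)  # e.g. 'a' (97) - 32 = 'A' (65)
    return letter  # anything else is returned unchanged

print(ord("A"), ord("a"))        # 65 97
print(format(ord("A"), "07b"))   # 1000001
print(to_upper("q"))             # Q
```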
Extended ASCII
Extended ASCII uses 8 bits (one byte) per character, giving 256 codes. The extra 128 codes add characters such as some accented letters, currency symbols and simple graphics characters. It is still far too small for the world's languages, but it fits common Western European text in a single byte each.
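As a rough illustration, the sketch below uses Latin-1 (ISO 8859-1) as one example of an 8-bit extended character set. That choice is an assumption: "extended ASCII" covers several different 8-bit sets, and the helper name `fits_latin1` is hypothetical.

```python
def fits_latin1(text):
    """Return True if every character has a code in the 8-bit Latin-1 set."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

encoded = "café".encode("latin-1")
print(len(encoded))          # 4 (one byte per character)
print(encoded[3])            # 233 (the code for é, above the 7-bit limit of 127)
print(fits_latin1("日本"))    # False (no 8-bit code exists for these characters)
```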
Unicode
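Unicode uses more bits per character (commonly 16 or 32 bits, or a variable-length encoding), giving over a million possible codes, so it can represent the characters of virtually every writing system, plus symbols and emoji, in one consistent scheme. The trade-off is storage: a Unicode character can take more bytes than an ASCII character, so a Unicode file can use more memory and bandwidth than the same text in ASCII. The first 128 Unicode codes match ASCII, so plain English text is compatible.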
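The storage trade-off can be seen by counting bytes. This sketch compares two Unicode encodings, UTF-8 (variable length) and UTF-32 (fixed 32 bits); the helper name `byte_sizes` is hypothetical.

```python
def byte_sizes(ch):
    """Return (code, bytes in UTF-8, bytes in UTF-32) for one character."""
    # "utf-32-le" is used so no byte-order mark is added to the count.
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-32-le"))

for ch in ["A", "é", "€", "😀"]:
    print(ch, byte_sizes(ch))
# A (65, 1, 4)
# é (233, 2, 4)
# € (8364, 3, 4)
# 😀 (128512, 4, 4)

# Plain English text is byte-for-byte identical in ASCII and UTF-8.
print("hello".encode("ascii") == "hello".encode("utf-8"))  # True
```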
Instructions are binary too
Program instructions are stored as binary just like data. Each machine instruction is a binary pattern that the processor's control unit decodes and executes (for example "add the contents of two registers"). High-level program code is translated into these binary instructions before it can run, reinforcing the principle that the machine ultimately handles only binary.
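To make the decoding step concrete, here is a toy sketch in Python. The 8-bit instruction format is invented for illustration (2 bits of opcode followed by two 3-bit register numbers) and does not describe any real processor; the names `OPCODES` and `decode` are hypothetical.

```python
# Hypothetical instruction format: [2-bit opcode][3-bit register][3-bit register]
OPCODES = {0b00: "LOAD", 0b01: "ADD", 0b10: "SUB", 0b11: "STORE"}

def decode(instruction):
    """Split an 8-bit pattern into an operation name and two register numbers."""
    opcode = (instruction >> 6) & 0b11     # top two bits
    reg_a = (instruction >> 3) & 0b111     # middle three bits
    reg_b = instruction & 0b111            # bottom three bits
    return OPCODES[opcode], reg_a, reg_b

# The pattern 01000001 is also the number 65 and the ASCII code for 'A'.
print(decode(0b01000001))  # ('ADD', 0, 1)
```

In this made-up format the pattern means "add registers 0 and 1", even though the same bits elsewhere would be read as a character or a number.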
Examples in context
Because bytes carry no built-in meaning, opening a file with the wrong encoding shows "mojibake" (garbled characters): the bytes are being interpreted with the wrong character set. UTF-8, a Unicode encoding, is now standard on the web precisely because one scheme can carry every language while staying compatible with ASCII for plain English. The same "context decides meaning" idea explains how the same storage can hold a photo, a song or a program: only the software's interpretation differs.
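The mojibake effect is easy to reproduce. In this sketch the text is stored as UTF-8 and then deliberately read back as Latin-1; the choice of Latin-1 as the "wrong" character set is an assumption made for the example.

```python
original = "café"
stored = original.encode("utf-8")     # 5 bytes: é takes two bytes in UTF-8

garbled = stored.decode("latin-1")    # wrong character set: each byte becomes one character
correct = stored.decode("utf-8")      # right character set

print(len(stored))   # 5
print(garbled)       # cafÃ©
print(correct)       # café
```

The stored bytes never change; only the character set used to interpret them does.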
Try this
Q1. State how many bits standard ASCII uses and how many codes that gives. [2 marks]
- Cue. 7 bits, giving 128 codes.
Q2. State one advantage of Unicode over ASCII. [1 mark]
- Cue. It can represent the characters of virtually every writing system (over a million codes), not just English.
Q3. Explain why a binary pattern in memory has no fixed meaning. [1 mark]
- Cue. Its meaning depends on how it is used - the same bits could be a number, a character or an instruction.
Exam-style practice questions
Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
SQA Higher (style), 4 marks
Explain why Unicode is needed in addition to ASCII, and state one disadvantage of Unicode compared with ASCII.
Standard ASCII uses 7 bits, giving only 128 codes, enough for the English alphabet, digits and basic punctuation but not for the many thousands of characters used by the world's languages (for example Chinese, Arabic, Greek) or for emoji and symbols.
Unicode uses many more bits per character (commonly up to 16 or 32 bits, or a variable-length encoding), giving over a million possible codes, so it can represent the characters of virtually every writing system in one consistent scheme.
One disadvantage: because each character can take more bits than ASCII, a Unicode text file uses more storage (and bandwidth) than the same text stored in ASCII.
Markers reward explaining that ASCII's limited codes cannot cover all the world's characters while Unicode can, and a valid disadvantage such as greater storage or memory use.
SQA Higher (style), 3 marks
A pupil claims a computer 'stores letters as letters and numbers as numbers'. Explain why this is wrong, using the idea of how characters and instructions are actually stored.
The claim is wrong because a computer's memory can hold only binary (two-state) values, so everything is stored as patterns of 0s and 1s.
A character such as 'A' is stored as the binary code for its position in a character set: in ASCII, 'A' is code 65, stored as 1000001. The letter itself is not stored, only its numeric code, which software interprets as a character when displaying it.
Program instructions are also stored as binary: each machine instruction is a binary pattern that the processor decodes and executes. So both text and instructions are ultimately binary; whether a pattern means a character, a number or an instruction depends on how it is used.
Markers reward the point that memory holds only binary, that a character is stored as its binary code (for example ASCII 65 for 'A'), and that instructions are also stored as binary.