What are the primitive data types, and how are characters represented in binary?
Primitive data types (integer, real/float, Boolean, character, string) and how text is represented using character sets such as ASCII and Unicode, including the size and range implications of each type.
An OCR H446 answer on primitive data types and text representation: the integer, real, Boolean, character and string types and their storage, and how characters are represented in binary using character sets such as ASCII and Unicode, with their size and range implications.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
OCR wants the primitive data types (integer, real/float, Boolean, character, string), when each is appropriate, and how text is represented in binary using character sets such as ASCII and Unicode, including their size and range implications. Expect a "choose the right type" question and an "ASCII versus Unicode" question.
The answer
The primitive data types
Representing characters: character sets
ASCII and Unicode
Examples in context
A database column's type (INT, REAL, CHAR, VARCHAR, BOOLEAN) is exactly this choice of primitive type, and it constrains what can be stored. Web pages declare a Unicode encoding (UTF-8) so they can display any language and emoji. A Boolean flag needs only a single bit in principle, which makes it the most compact way to record a yes/no state, although many languages store it in a whole byte in practice. OCR links this to number and floating-point representation (how integers and reals are encoded) and to data structures, which are built from these primitives.
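To make the UTF-8 point concrete, here is a minimal Python sketch showing that UTF-8 stores different characters in different numbers of bytes. The helper name `utf8_size` and the sample characters are illustrative choices, not part of the OCR specification.

```python
def utf8_size(ch: str) -> int:
    """Return how many bytes UTF-8 uses to store one character."""
    return len(ch.encode("utf-8"))


# Sample characters written as escapes so the source stays plain ASCII:
# Latin capital A, e-acute, the euro sign and a smiling-face emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    # Print the Unicode code point and the UTF-8 storage size.
    print(f"U+{ord(ch):04X} -> {utf8_size(ch)} byte(s)")
```

Characters in the ASCII range still take one byte in UTF-8, which is why plain English text costs no extra storage, while characters from further up the Unicode range take two, three or four bytes.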
Try this
Q1. State the most appropriate primitive data type for a value that can only be true or false. [1 mark]
- Cue. Boolean.
Q2. State how many characters standard 7-bit ASCII can represent. [1 mark]
- Cue. 128 characters.
Q3. Explain why Unicode was introduced in place of ASCII. [2 marks]
- Cue. ASCII's 128 (or 256) characters cannot represent the many thousands of characters used across the world's languages; Unicode gives a unique code point to characters from virtually all writing systems.
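The character counts in these cues follow from the rule that n bits give 2^n distinct bit patterns. A short Python sketch of that arithmetic, using a hypothetical helper named `distinct_values`:

```python
def distinct_values(bits: int) -> int:
    """Number of different bit patterns (and so different codes) in `bits` bits."""
    return 2 ** bits


# 7 bits: standard ASCII, 8 bits: extended ASCII, 16 bits: one 16-bit code unit.
for bits in (7, 8, 16):
    print(f"{bits} bits -> {distinct_values(bits)} possible codes")
```

Each extra bit doubles the number of characters a fixed-width character set can represent, at the cost of more storage per character.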
Exam-style practice questions
Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
OCR 2019. Explain the difference between the ASCII and Unicode character sets, and explain why Unicode was introduced. [5 marks]
ASCII (up to 2): the standard ASCII character set uses 7 bits to represent 128 characters (extended ASCII uses 8 bits for 256), covering the English alphabet, digits, punctuation and control codes. It is compact but cannot represent the characters of most of the world's languages.
Unicode and why introduced (up to 3): Unicode provides a unique code point for characters from virtually all writing systems, including non-Latin alphabets, accented letters and emoji, and its encodings use more bits per character where needed (UTF-16 uses 16 or 32 bits per character; UTF-8 uses between 8 and 32). It was introduced because ASCII's 128 (or 256) characters could not represent the many thousands of characters used worldwide, so a universal standard was needed for global text interchange. The trade-off is that Unicode text can take more storage per character. Markers reward the bit/character-count contrast and the global-language motivation.
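The contrast in this answer can be checked in a few lines of Python. This is a sketch under stated assumptions: the helper names `to_binary`, `fits_in_ascii` and `encoded_size` are hypothetical, and UTF-32 is used only as one example of a fixed-width Unicode encoding.

```python
def to_binary(ch: str, bits: int = 8) -> str:
    """Return the character's code as a binary string padded to `bits` digits."""
    return format(ord(ch), f"0{bits}b")


def fits_in_ascii(ch: str) -> bool:
    """True if the character's code point is within 7-bit ASCII (0 to 127)."""
    return ord(ch) < 128


def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes needed to store `text` in the given encoding."""
    return len(text.encode(encoding))


print(to_binary("A", 7))                  # 1000001 (code 65)
print(fits_in_ascii("A"))                 # True
print(fits_in_ascii("\u00e9"))            # False: e-acute is outside ASCII
print(encoded_size("hello", "ascii"))     # 5 bytes, one per character
print(encoded_size("hello", "utf-32-le")) # 20 bytes, four per character
```

The last two lines show the storage trade-off: the same five letters take four times the space in a fixed 32-bit encoding, which is the price paid for covering every writing system.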
OCR 2021. State the most appropriate primitive data type for each of the following and justify one choice: a person's age, whether a light is on, a temperature reading of 21.5 degrees, and a single initial letter. [4 marks]
Award marks for correct types (up to 3) plus one justification (up to 1).
Age: integer (a whole number, no fractional part). Light on/off: Boolean (one of two states, true or false). Temperature 21.5: real/float (has a fractional part). Single initial: character (one symbol).
Justification example: a Boolean is used for the light because it has exactly two possible states (on or off), which a single bit can represent, so it is the most storage-efficient and meaningful type. Markers reward the four correct types and a sensible justification for one. A common error is using a real for age or an integer for a value that needs decimals.
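The four choices in this answer could be declared as follows in Python. This is an illustrative sketch only: the variable names and values are made up, and because Python has no separate character type, a one-character string stands in for it here.

```python
# Illustrative values only; the names are not from the exam question.
age: int = 17               # whole number, no fractional part
light_on: bool = True       # exactly two states: True or False
temperature: float = 21.5   # real number with a fractional part
initial: str = "J"          # assumption: a one-character string acts as the character

for name, value in [("age", age), ("light_on", light_on),
                    ("temperature", temperature), ("initial", initial)]:
    # type(value).__name__ gives the name of the type Python actually used.
    print(f"{name}: {value!r} is stored as {type(value).__name__}")
```

In a language with a dedicated character type, such as Java or C, the initial would be declared as `char` instead of a string.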
Related dot points
- Data structures: arrays, records, tuples and lists, the stack and queue abstract data types and their operations, linked lists, trees and graphs, and hash tables, including how each is used and its advantages.
An OCR H446 answer on data structures: arrays, records, tuples and lists, the stack and queue abstract data types with their operations, linked lists, trees, graphs and hash tables, including how each is used and its advantages and disadvantages.
- Mathematical skills for computer science: set theory and set operations, the comparison of binary, denary and hexadecimal magnitudes, simple logic propositions, and the use of these tools to reason about data and algorithms.
An OCR H446 answer on the mathematical skills underpinning computer science: set theory and set operations, comparing magnitudes across binary, denary and hexadecimal, simple logic propositions, and applying these tools to reason about data and algorithms.
- Number systems: binary, denary and hexadecimal conversion, representing negative numbers with sign and magnitude and two's complement, binary addition and subtraction, fixed-point binary fractions, and the use of hexadecimal and bitwise masks.
An OCR H446 answer on number systems: converting between binary, denary and hexadecimal, representing negative numbers with sign and magnitude and two's complement, binary addition and subtraction, fixed-point binary fractions, and the use of hexadecimal and bitwise masks.
- Floating-point representation of real numbers using a mantissa and an exponent (both in two's complement), normalisation of a floating-point number, and the trade-off between range and precision.
An OCR H446 answer on floating-point representation: storing real numbers with a mantissa and an exponent in two's complement, how to normalise a floating-point number, and the trade-off between range and precision when bits are divided between mantissa and exponent.
- Programming techniques: sequence, selection and iteration, recursion, the use of subroutines (procedures and functions) with parameters passed by value and by reference, local and global variable scope, and the features of an integrated development environment (IDE).
An OCR H446 answer on programming techniques: sequence, selection and iteration, recursion, subroutines (procedures and functions) with parameters passed by value or by reference, local and global variable scope, and the features of an integrated development environment.