What are the primitive data types, and how are characters represented in binary?

Primitive data types (integer, real/float, Boolean, character, string) and how text is represented using character sets such as ASCII and Unicode, including the size and range implications of each type.

An OCR H446 answer on primitive data types and text representation: the integer, real, Boolean, character and string types and their storage, and how characters are represented in binary using character sets such as ASCII and Unicode, with their size and range implications.

Generated by Claude Opus 4.8 · 13 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

OCR wants the primitive data types (integer, real/float, Boolean, character, string), when each is appropriate, and how text is represented in binary using character sets such as ASCII and Unicode, including their size and range implications. Expect a "choose the right type" question and an "ASCII versus Unicode" question.

The answer

The primitive data types

Representing characters: character sets

ASCII and Unicode

Examples in context

A database column's type (INT, REAL, CHAR, VARCHAR, BOOLEAN) is this same choice of primitive type, and it constrains what values can be stored. Web pages declare a Unicode encoding (usually UTF-8) so they can display text in any language, including emoji. A Boolean flag needs only a single bit in principle to record a yes/no state, although many languages and databases store it in a whole byte. OCR links this topic to number and floating-point representation (how integers and reals are encoded) and to data structures, which are built from these primitives.
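The size and range implications can be made concrete with a short Python sketch. It assumes two's-complement signed integers and the fixed sizes that Python's `struct` module calls "standard" (selected by the `<` prefix); real languages and databases vary, and the helper names `signed_range` and `unsigned_range` are illustrative only.

```python
import struct


def signed_range(bits):
    """Smallest and largest value of a two's-complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with `bits` bits."""
    return 0, 2 ** bits - 1


# Bytes used by some fixed-size types in struct's standard layout.
# These sizes are an assumption for illustration, not an OCR requirement.
SIZES = {
    "32-bit integer": struct.calcsize("<i"),
    "64-bit real (double)": struct.calcsize("<d"),
    "Boolean": struct.calcsize("<?"),
    "single-byte character": struct.calcsize("<c"),
}

print(signed_range(8))     # (-128, 127)
print(unsigned_range(8))   # (0, 255)
print(SIZES["Boolean"])    # 1 (a whole byte, not a single bit)
```

More bits give a wider range at the cost of more storage, which is the trade-off behind choosing, say, a 16-bit rather than a 32-bit integer column.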

Try this

Q1. State the most appropriate primitive data type for a value that can only be true or false. [1 mark]

  • Cue. Boolean.

Q2. State how many characters standard 7-bit ASCII can represent. [1 mark]

  • Cue. 2^7 = 128 characters.

Q3. Explain why Unicode was introduced in place of ASCII. [2 marks]

  • Cue. ASCII's 128 (or 256) characters cannot represent the many thousands of characters used across the world's languages; Unicode gives a unique code point to characters from virtually all writing systems.
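The cues for Q2 and Q3 can be checked with a few lines of Python. This is a minimal sketch: `ascii_bits` and `fits_in_ascii` are hypothetical helper names, and the sample strings are chosen only for illustration.

```python
def ascii_bits(ch):
    """Return the 7-bit binary pattern for a single ASCII character."""
    code = ord(ch)  # the character's numeric code
    if code > 127:
        raise ValueError(f"{ch!r} is outside 7-bit ASCII")
    return format(code, "07b")


def fits_in_ascii(text):
    """True if every character in `text` has an ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(2 ** 7)                       # 128
print(ord("A"))                     # 65
print(ascii_bits("A"))              # 1000001
print(fits_in_ascii("cafe"))        # True
print(fits_in_ascii("caf\u00e9"))   # False: the accented e has no ASCII code
```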

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR 2019. Explain the difference between the ASCII and Unicode character sets, and explain why Unicode was introduced. [5 marks]

ASCII (up to 2): the standard ASCII character set uses 7 bits to represent 128 characters (extended ASCII uses 8 bits for 256), covering the English alphabet, digits, punctuation and control codes. It is compact but cannot represent the characters of most of the world's languages.

Unicode and why introduced (up to 3): Unicode can use more bits per character than ASCII (its encodings use between 8 and 32 bits per character: UTF-8 uses 1 to 4 bytes, UTF-16 uses 2 or 4 bytes, and UTF-32 always uses 4 bytes) to provide a unique code point for characters from virtually all writing systems, including non-Latin alphabets, accented letters and emoji. It was introduced because ASCII's 128 (or 256) characters could not represent the many thousands of characters used worldwide, so a universal standard was needed for global text interchange. The trade-off is that Unicode text can take more storage per character. Markers reward the bit/character-count contrast and the global-language motivation.
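The storage trade-off can be seen by encoding a few characters in Python. This is a sketch only: the four sample characters and the helper name `bytes_per_char` are illustrative choices, and the little-endian ("-le") codecs are used so that no byte-order mark is added to the count.

```python
def bytes_per_char(ch, encoding):
    """Number of bytes needed to store one character in the given encoding."""
    return len(ch.encode(encoding))


# A, e-acute, the euro sign and an emoji, written as escapes
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]

for ch in SAMPLES:
    sizes = [bytes_per_char(ch, enc) for enc in ENCODINGS]
    print(f"U+{ord(ch):04X}", sizes)

# U+0041 [1, 2, 4]
# U+00E9 [2, 2, 4]
# U+20AC [3, 2, 4]
# U+1F600 [4, 4, 4]
```

An ASCII letter still costs one byte in UTF-8, while characters further into the Unicode range cost more, which is why Unicode text can be larger than the same-length ASCII text.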

OCR 2021. State the most appropriate primitive data type for each of the following and justify one choice: a person's age, whether a light is on, a temperature reading of 21.5 degrees, and a single initial letter. [4 marks]

Award marks for correct types (up to 3) plus one justification (up to 1).

Age: integer (a whole number, no fractional part). Light on/off: Boolean (one of two states, true or false). Temperature 21.5: real/float (has a fractional part). Single initial: character (one symbol).

Justification example: a Boolean is used for the light because it has exactly two possible states (on or off), which a single bit can represent, so it is the most storage-efficient and meaningful type. Markers reward the four correct types and a sensible justification for one. A common error is using a real for age or an integer for a value that needs decimals.
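As a rough illustration, the four values map onto Python's built-in types as follows. The variable names and the sample values for age and initial are invented for this sketch, and Python has no separate character type, so a one-character string stands in for it.

```python
age = 34             # integer: whole number, no fractional part
light_on = True      # Boolean: one of two states
temperature = 21.5   # real/float: has a fractional part
initial = "J"        # character: stored here as a one-character string


def type_name(value):
    """Name the primitive type of a value, using the syllabus terms."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "real"
    if isinstance(value, str):
        return "character" if len(value) == 1 else "string"
    raise TypeError(f"not a primitive type: {value!r}")


for value in (age, light_on, temperature, initial):
    print(repr(value), "->", type_name(value))
```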
