Why is data compressed, and how do lossy and lossless compression, including run-length encoding, work?

The need for compression, the difference between lossy and lossless compression, and how run-length encoding compresses data.

A focused answer to the WJEC GCSE Computer Science Unit 1 content on compression, covering why data is compressed, the difference between lossy and lossless compression and when each is used, and how run-length encoding (RLE) compresses repeated data with a worked example.

Generated by Claude Opus 4.8 · 8 min answer · Updated 2026-06-15

Reviewed by: AI editorial process; not yet individually human-reviewed

What this topic is asking

WJEC wants you to know why data is compressed, the difference between lossy and lossless compression and when each is appropriate, and how run-length encoding works. This is part of the Data representation and data types content in Unit 1 of WJEC GCSE Computer Science (3500).

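Why compress data

Compression reduces the size of a file. Smaller files take up less storage space, use less bandwidth and transmit faster over a network.

Lossy compression

Lossy compression shrinks a file by permanently removing some data, usually detail that people are less likely to notice. The original cannot be perfectly recovered, so it suits streamed music or photos, where a small loss of quality is an acceptable price for much smaller files.

Lossless compression

Lossless compression shrinks a file without discarding any data, so the original can be reconstructed exactly. It is used where every bit matters, such as a text document, a spreadsheet or program code.

Run-length encoding

Run-length encoding (RLE) is a lossless method that records each run of identical values as the value and the number of times it repeats.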

Compressing a row of pixels with run-length encoding

A row of $15$ pixels is stored as RRRRRRGGGGBBBBB (six red, four green, five blue). Show the run-length encoded version and compare the lengths.

Step 1: Identify the runs in order

Reading left to right, there are three runs: six R, then four G, then five B.

Step 2: Write each run as a value and a count

Encode each run as a pair: $(6,\text{R})$, $(4,\text{G})$, $(5,\text{B})$, or written compactly as $6\text{R}\,4\text{G}\,5\text{B}$.

Step 3: Count the stored items

The original stores $15$ separate pixel values. The encoded version stores three value-and-count pairs, which is six items instead of fifteen.

Step 4: State why it compresses

Because the data has long runs of identical pixels, storing each value once with a count is much shorter than listing every pixel, and no data is lost, so it is lossless.
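The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a required exam technique; the function name `rle_encode` and the choice to store each run as a `(count, value)` pair are assumptions made for this example.

```python
def rle_encode(data):
    """Return a list of (count, value) pairs, one per run, in order."""
    runs = []
    for value in data:
        if runs and runs[-1][1] == value:
            # Same value as the current run: extend its count.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # A different value starts a new run of length 1.
            runs.append((1, value))
    return runs


row = "RRRRRRGGGGBBBBB"
pairs = rle_encode(row)
compact = "".join(f"{count}{value}" for count, value in pairs)

print(pairs)    # [(6, 'R'), (4, 'G'), (5, 'B')]
print(compact)  # 6R4G5B
```

The output matches the hand-worked answer: three pairs, in the same left-to-right order as the runs in the original row.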

Choosing a method

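Use lossy compression when smaller files matter more than perfect fidelity and a slight quality drop is acceptable, as with streaming media. Use lossless compression when the data must be recovered exactly, as with documents, code and archives. RLE is only worth using when the data contains long runs of repeated values; where neighbouring values rarely repeat, storing a count with every value can make the data larger. For example, ABCD would become $1\text{A}\,1\text{B}\,1\text{C}\,1\text{D}$, eight items instead of four.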
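The point about RLE needing repeated runs can be checked by counting stored items the same way the worked example does (one count plus one value per run). The sketch below is illustrative only: `rle_item_count` is a hypothetical helper, and the second test string is made up to show data with no repeated neighbours.

```python
from itertools import groupby


def rle_item_count(data):
    """Count the items RLE would store: one count plus one value per run."""
    # groupby yields one group for each run of equal neighbouring values.
    number_of_runs = sum(1 for _ in groupby(data))
    return 2 * number_of_runs


for data in ["RRRRRRGGGGBBBBB", "RGBRGBRGB"]:
    original = len(data)
    encoded = rle_item_count(data)
    verdict = "smaller" if encoded < original else "not smaller"
    print(data, original, encoded, verdict)
```

The first row drops from 15 items to 6, while the second grows from 9 items to 18 because every pixel is its own run of length one.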

Try this

Q1. State one reason why files are compressed before being sent over the internet. [1 mark]

Cue. Smaller files use less bandwidth and transmit faster.

Q2. Compress the sequence AAAABBAAAA using run-length encoding. [2 marks]

Cue. $4\text{A}\,2\text{B}\,4\text{A}$ (or pairs $(4,\text{A})(2,\text{B})(4,\text{A})$).
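One way to check an RLE answer is to decode it and confirm the original sequence comes back unchanged, which is what makes the method lossless. The decoder below is a minimal sketch: the name `rle_decode` is hypothetical, and it assumes the compact count-then-value format with values that are single non-digit characters.

```python
import re


def rle_decode(encoded):
    """Expand compact RLE text such as '4A2B4A' back into the original sequence."""
    # Each run is one or more digits (the count) followed by one non-digit (the value).
    runs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(value * int(count) for count, value in runs)


print(rle_decode("4A2B4A"))  # AAAABBAAAA
```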

Exam-style practice questions

Practice questions written in the style of WJEC exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

WJEC-style Unit 1, 4 marks. Explain the difference between lossy and lossless compression, giving one example of when each would be used.

A Unit 1 compare question. Lossy compression makes a file smaller by permanently removing some data, usually detail that people are less likely to notice, so the original cannot be perfectly recovered (1 mark); it is used for things like streaming music or photos where a small loss of quality is acceptable in return for much smaller files (1 mark). Lossless compression makes a file smaller without losing any data, so the original can be reconstructed exactly (1 mark); it is used where every bit matters, such as compressing a text document, a spreadsheet or program code (1 mark). Markers reward the permanent loss versus exact recovery distinction and a sensible example of each. A common error is to say lossless makes files smaller by deleting data, which describes lossy.
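The "permanent loss" idea can be seen in a deliberately simplified sketch. It is not how real formats such as JPEG or MP3 work; it only assumes a made-up list of brightness values and a hypothetical `lossy_quantise` function that rounds each value down to a multiple of 16.

```python
def lossy_quantise(samples, step=16):
    """Lossy: round each value down to a multiple of `step`, discarding fine detail."""
    return [(s // step) * step for s in samples]


brightness = [200, 201, 203, 198, 50, 52]  # made-up pixel brightness values (0 to 255)
reduced = lossy_quantise(brightness)

print(reduced)                # [192, 192, 192, 192, 48, 48]
print(reduced == brightness)  # False
```

Once 200, 201, 203 and 198 have all become 192, nothing in the reduced data says which was which, so the original cannot be rebuilt. The reduced values are also more repetitive, which is why discarding detail tends to make the data easier to compress.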

WJEC-style Unit 1, 3 marks. A row of pixels is stored as WWWWWBBBBBBWWWW. Show how run-length encoding would compress this row and state why this method saves space here.

A Unit 1 run-length encoding question. Run-length encoding records each run as the value and the number of times it repeats. The row is five W, then six B, then four W, so it compresses to $5\text{W}\,6\text{B}\,4\text{W}$ (or pairs such as $(5,\text{W})(6,\text{B})(4,\text{W})$) (2 marks for the correct encoding). It saves space because the data contains long runs of identical values, so storing the value once with a count is shorter than listing every value (1 mark). Markers reward the value-and-count pairs and the reason about repeated runs. A common error is to record each character separately, which does not compress, or to lose the order of the runs.

Sources & how we know this

WJEC GCSE Computer Science specification (3500) from 2017 — WJEC (2017)