How do we make files smaller, and what is the trade-off?

Understand why data is compressed, and the difference between lossy and lossless compression including run-length encoding and Huffman coding.

A focused answer to AQA GCSE Computer Science 3.3.8, covering why data is compressed and the difference between lossy and lossless compression, including run-length encoding and Huffman coding.

Generated by Claude Opus 4.88 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Why compress data
  3. Lossy compression
  4. Lossless compression
  5. Run-length encoding
  6. Huffman coding
  7. Choosing lossy or lossless
  8. Try this

What this dot point is asking

AQA wants you to explain why data is compressed and to describe the difference between lossy and lossless compression, including the two named lossless methods, run-length encoding and Huffman coding.

Why compress data

Lossy compression

Lossy compression exploits the limits of human perception: MP3 discards frequencies the ear can barely hear and quieter sounds masked by louder ones; JPEG smooths out fine colour variation that the eye does not register. Each time a lossy file is decoded and re-saved in a lossy format, more detail is thrown away, so quality degrades cumulatively (generation loss).

Lossless compression

Run-length encoding

Huffman coding

Because no Huffman code is a prefix of another (the prefix property), the decoder can read a stream of bits and know exactly where each character ends without separators. This is why Huffman coding can shrink text losslessly: replacing a fixed-length code, such as 8 bits per ASCII character, with shorter codes for the most frequent characters and longer codes for rare ones reduces the total bit count while still allowing exact reconstruction.
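The idea can be tried out in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function names (huffman_codes, encode, decode) and the sample text "aaaabbc" are made up for illustration, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a {character: bit string} table (assumes text is non-empty)."""
    freq = Counter(text)
    if len(freq) == 1:  # only one distinct character: give it a 1-bit code
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {char: code so far})
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)   # two least frequent groups
        n2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    """Read bit by bit; the prefix property means the first match is right."""
    lookup = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)

text = "aaaabbc"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), len(text) * 8)  # 10 56
assert decode(bits, codes) == text
```

Here the common letter a gets a 1-bit code and the rarer b and c get 2-bit codes, so the message needs 10 bits instead of 56 at 8 bits per character. A real file would also have to store the code table, which this sketch ignores.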

Choosing lossy or lossless

The choice between lossy and lossless depends on whether losing data matters. For a text document, a spreadsheet or a program, every byte is significant, so only lossless will do; losing data would corrupt the file. For a photograph displayed on a website or music streamed to a phone, a small, imperceptible loss of detail is an acceptable price for a much smaller file, so lossy is chosen. The general principle is to use lossless when the data must be exact, and lossy when the file must be small and a little quality loss is tolerable.

Try this

Q1. State one reason files are compressed. [1 mark]

  • Cue. To use less storage space or to transfer faster over a network.

Q2. State one difference between lossy and lossless compression. [2 marks]

  • Cue. Lossy permanently removes some data so the original cannot be restored; lossless keeps all data so the original can be rebuilt exactly.

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 2020, 4 marks
A black-and-white image row is stored as the pixels WWWWWWBBBWWWW (W = white, B = black). Apply run-length encoding to this row, state the encoded form, and explain when RLE saves space and when it does not.

Group each run of identical pixels and record the value with its count. The row is six W, then three B, then four W, which encodes as 6W3B4W.

The original is 13 symbols; the encoded form is three count-value pairs (6 characters), so it is shorter. RLE saves space when data has long runs of repeated values (large blocks of one colour). It does not save space, and can make a file larger, when values rarely repeat (a noisy photograph), because every single pixel becomes a count of 1 plus its value.

Markers reward the correct encoding 6W3B4W, and a clear statement that RLE helps only with repetition.
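The worked answer can be checked with a short Python sketch. The function names rle_encode and rle_decode are illustrative, not part of any specification, and the decoder assumes each pixel value is a single non-digit character.

```python
from itertools import groupby

def rle_encode(row):
    """Write each run as count then value, e.g. 'WWWB' -> '3W1B'."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(row))

def rle_decode(encoded):
    """Rebuild the row (assumes values are single non-digit characters)."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch              # counts may have more than one digit
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

row = "WWWWWWBBBWWWW"
encoded = rle_encode(row)
print(encoded, len(row), len(encoded))   # 6W3B4W 13 6
assert rle_decode(encoded) == row        # lossless: the row is rebuilt exactly

noisy = "WBWBWBWB"                       # no runs longer than one pixel
print(rle_encode(noisy))                 # 1W1B1W1B1W1B1W1B
```

The second example shows the weakness described above: with no repetition, 8 symbols grow to 16.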

AQA 2023, 5 marks
Explain the difference between lossy and lossless compression. For each, give one suitable use and justify why that type of compression is appropriate.

Lossy compression permanently removes some data, usually detail the eye or ear is unlikely to notice, achieving large size reductions but preventing perfect reconstruction. A suitable use is streaming music (MP3) or photographs (JPEG): the small loss in quality is acceptable because the files must be small enough to stream or share quickly.

Lossless compression reduces size with no loss of data, so the original is rebuilt exactly. A suitable use is a text document, spreadsheet or program file (ZIP), where losing even one character or byte would corrupt the file or change its meaning.

Markers reward a correct definition of each, a sensible matched use, and a justification that links the use to the trade-off (acceptable quality loss versus exact recovery).
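The claim that lossless compression rebuilds the original exactly can be seen with a short round trip. This sketch uses zlib from Python's standard library, which implements DEFLATE, the method most ZIP files use; the sample data is invented for illustration.

```python
import zlib

# Illustrative sample: repetitive text compresses well
original = b"every byte matters " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # original size, then the smaller compressed size
assert restored == original            # byte-for-byte identical
```

A lossy format cannot pass this kind of check, because the discarded detail is gone for good.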
