How do we shrink data, keep it secret, and verify it without storing the original?

Compression (lossy and lossless, run length encoding and dictionary coding), encryption (symmetric and asymmetric) and hashing, including their characteristics, differences and appropriate uses.

An OCR H446 answer on compression, encryption and hashing: lossy versus lossless compression with run length encoding and dictionary coding, symmetric versus asymmetric encryption, and how hashing works, with the characteristics, differences and appropriate uses of each.

Generated by Claude Opus 4.813 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

OCR wants the difference between lossy and lossless compression with the techniques of run length encoding and dictionary coding, the difference between symmetric and asymmetric encryption, and how hashing works, all with appropriate uses. Expect "explain the difference" and "justify a use" questions.

The answer

Lossy and lossless compression

Symmetric and asymmetric encryption

Hashing

Examples in context

HTTPS uses asymmetric encryption to agree a symmetric session key, then switches to symmetric encryption for the fast bulk transfer. Streaming services use lossy codecs to fit video through a limited-bandwidth connection, while a software download uses lossless ZIP so the program arrives byte-perfect. Websites store salted password hashes so a stolen database does not directly reveal the passwords. Download pages publish a file's hash so you can check that the copy you received has not been corrupted or tampered with. OCR links this topic to network security and to databases (hashing for indexing).
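The HTTPS pattern above (asymmetric to share a key, symmetric for the data) can be sketched with toy numbers. This is only an illustration under loud assumptions: the tiny primes, the one-byte XOR "cipher" and the names `xor_cipher` and `demo` are all hypothetical teaching devices, not how real HTTPS is implemented, and nothing here is secure.

```python
import secrets

# Toy RSA key pair built from tiny primes (illustration only; real keys
# are thousands of bits long and are generated by a vetted library).
P, Q = 61, 53
N = P * Q                  # public modulus, 3233
PHI = (P - 1) * (Q - 1)    # 3120
E = 17                     # public exponent: anyone may know (E, N)
D = pow(E, -1, PHI)        # private exponent, 2753 (needs Python 3.8+)

def xor_cipher(data: bytes, key: int) -> bytes:
    """Toy symmetric cipher: the SAME key both encrypts and decrypts."""
    return bytes(b ^ key for b in data)

def demo(message: bytes):
    # 1. The client invents a random symmetric session key (1..255).
    session_key = secrets.randbelow(255) + 1
    # 2. Asymmetric step: wrap the session key with the server's PUBLIC key.
    wrapped = pow(session_key, E, N)
    # 3. Only the holder of the PRIVATE key can unwrap it.
    unwrapped = pow(wrapped, D, N)
    # 4. Symmetric step: the bulk data uses the shared session key.
    ciphertext = xor_cipher(message, session_key)
    plaintext = xor_cipher(ciphertext, unwrapped)
    return session_key, unwrapped, ciphertext, plaintext
```

Calling `demo(b"hello")` returns a recovered key equal to the original session key and a plaintext equal to the message, even though the session key itself only ever travelled in wrapped form.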

Try this

Q1. State one type of file for which lossless compression is essential and explain why. [2 marks]

  • Cue. A text document or program file: losing any data would corrupt the content or stop the program running, so the exact original must be recoverable.

Q2. Explain the key distribution problem that symmetric encryption faces. [2 marks]

  • Cue. Both parties need the same secret key, but sharing it over an insecure channel risks interception; asymmetric encryption solves this by using a freely shareable public key.

Q3. State why a website stores a hash of a password rather than the password itself. [1 mark]

  • Cue. Hashing is one-way, so a stolen database does not directly reveal the passwords; logins are checked by comparing hashes.

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR 2019. Explain the difference between lossy and lossless compression, and state an appropriate use for each, justifying your choices. [6 marks]

Award up to three marks for each compression type with its use.

Lossless compression reduces file size while allowing the exact original to be reconstructed; no data is permanently discarded (techniques include run length encoding and dictionary coding). Appropriate use: text documents, program code, spreadsheets or a ZIP archive, where losing any data would corrupt the file.
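The two lossless techniques named here can be sketched in a few lines. This is a minimal illustration, not a production format: the function names are hypothetical, the run length encoder stores (character, count) pairs, and the dictionary coder assumes a simple word-level dictionary split on single spaces.

```python
from itertools import groupby

def rle_encode(text):
    """Run length encoding: store each run as (character, run length)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the exact original by repeating each character."""
    return "".join(ch * count for ch, count in pairs)

def dict_encode(text):
    """Dictionary coding: replace each word with its index in a dictionary."""
    dictionary = []   # index -> word, in order of first appearance
    index = {}        # word -> index, for fast lookup
    codes = []
    for word in text.split(" "):
        if word not in index:
            index[word] = len(dictionary)
            dictionary.append(word)
        codes.append(index[word])
    return dictionary, codes

def dict_decode(dictionary, codes):
    """Look every index back up to recover the exact original text."""
    return " ".join(dictionary[c] for c in codes)

print(rle_encode("AAAABBBCCD"))
# [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(dict_encode("the cat and the dog and the bird"))
# (['the', 'cat', 'and', 'dog', 'bird'], [0, 1, 2, 0, 3, 2, 0, 4])
```

In both cases decoding returns exactly the input, which is the defining property of lossless compression. Run length encoding only saves space when the data contains long runs; dictionary coding only saves space when entries repeat.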

Lossy compression reduces file size by permanently removing data the human eye or ear is unlikely to notice, so the original cannot be perfectly restored, but the compression ratio is typically much higher than lossless compression achieves. Appropriate use: photos, music or streamed video (for example JPEG, MP3), where a small quality loss is acceptable in exchange for a much smaller file.

Markers reward the reversible-versus-irreversible distinction plus a justified use for each (exactness needed -> lossless; bandwidth/size matters and small quality loss acceptable -> lossy).
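The irreversible side of that distinction can be seen in a deliberately simplified sketch. It assumes plain quantisation (rounding each sample into a coarse bucket) as a stand-in for lossy compression; real codecs such as JPEG and MP3 are far more elaborate, and the function names and sample values are invented for illustration.

```python
def quantise(samples, step):
    """Lossy step: keep only which step-sized bucket each sample is in."""
    return [s // step for s in samples]

def reconstruct(levels, step):
    """Best-effort restore using each bucket's midpoint.
    The detail thrown away by quantise() cannot be recovered."""
    return [level * step + step // 2 for level in levels]

samples = [12, 13, 15, 200, 203, 207]
levels = quantise(samples, 8)
restored = reconstruct(levels, 8)

print(levels)    # [1, 1, 1, 25, 25, 25]
print(restored)  # [12, 12, 12, 204, 204, 204]
```

The six original values collapse to two distinct levels, which is much cheaper to store, but the restored list is only close to the original, not identical to it.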

OCR 2021. A website stores user passwords. Explain why the passwords should be hashed rather than encrypted, and explain how hashing differs from encryption. [6 marks]

Hashing versus encryption (up to 4 marks): encryption is a reversible, two-way process in which ciphertext can be decrypted back to plaintext with the key, whereas hashing is a one-way process that maps an input of any size to a fixed-length digest and cannot (feasibly) be reversed. A good hash function is deterministic (the same input always gives the same hash), produces a fixed-length output and makes collisions extremely unlikely.
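The hash properties listed above can be checked with Python's standard `hashlib` module. This is a small sketch using SHA-256 as one example of a hash function; the helper names `sha256_hex` and `verify_download` are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()

def verify_download(data: bytes, published_hash: str) -> bool:
    """Integrity check: re-hash the received data and compare digests."""
    return sha256_hex(data) == published_hash

short = sha256_hex(b"a")
long_ = sha256_hex(b"a" * 1_000_000)

print(len(short), len(long_))                         # 64 64 (fixed length)
print(sha256_hex(b"hello") == sha256_hex(b"hello"))   # True  (deterministic)
print(sha256_hex(b"hello") == sha256_hex(b"hellp"))   # False (tiny change, different digest)
```

Note that there is no decode function anywhere in the sketch: unlike decryption, nothing turns the digest back into the input.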

Why hash passwords (up to 2 marks): the system never needs the original password back; it only needs to check a login by hashing the entered password and comparing digests. Storing hashes means that if the database is stolen, the actual passwords are not directly exposed, whereas encrypted passwords could be decrypted if the key were also compromised. Markers reward the one-way versus two-way distinction plus the security argument that the original is never needed.
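The login check described here can be sketched with the standard library. This is one possible approach, assuming a salted PBKDF2-HMAC-SHA256 hash; the iteration count and the function names `hash_password` and `check_login` are illustrative choices, not a recommendation for a real system.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this sketch

def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is made for new passwords."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def check_login(attempt, salt, stored_digest):
    """Hash the attempt with the stored salt and compare the digests.
    The original password is never stored or recovered."""
    _, digest = hash_password(attempt, salt)
    return hmac.compare_digest(digest, stored_digest)

# The database keeps only the salt and the digest.
salt, stored = hash_password("correct horse")
print(check_login("correct horse", salt, stored))  # True
print(check_login("wrong guess", salt, stored))    # False
```

Because each user gets a different salt, two users with the same password end up with different stored digests, which is why the salted hashes mentioned earlier are harder to attack in bulk.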
