How do we shrink data, keep it secret, and verify it without storing the original?

Compression (lossy and lossless, run length encoding and dictionary coding), encryption (symmetric and asymmetric) and hashing, including their characteristics, differences and appropriate uses.

An OCR H446 answer on compression, encryption and hashing: lossy versus lossless compression with run length encoding and dictionary coding, symmetric versus asymmetric encryption, and how hashing works, with the characteristics, differences and appropriate uses of each.

Generated by Claude Opus 4.8 · 13 min answer · Updated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed

Quick answer

Compression reduces file size for storage and transmission. Lossless compression lets the exact original be reconstructed, using techniques such as run length encoding (storing runs of identical values as a value and a count) and dictionary coding (replacing repeated sequences with shorter codes from a dictionary); it suits text and program files. Lossy compression permanently discards data the eye or ear is unlikely to notice, giving much smaller files at some quality loss; it suits images, audio and video. Encryption keeps data secret. Symmetric encryption uses one shared key for both encrypting and decrypting (fast, but the key must be shared securely). Asymmetric encryption uses a public key to encrypt and a matching private key to decrypt (solves key distribution, slower). Hashing is a one-way function that turns input into a fixed-length digest that cannot feasibly be reversed; it is used for passwords and integrity checks.

What this dot point is asking

OCR wants the difference between lossy and lossless compression with the techniques of run length encoding and dictionary coding, the difference between symmetric and asymmetric encryption, and how hashing works, all with appropriate uses. Expect "explain the difference" and "justify a use" questions.

The answer

Lossy and lossless compression
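
Dictionary coding replaces repeated sequences with shorter codes from a dictionary. The sketch below is one simple way to picture it, assuming whole words are the repeated sequences and list positions are the codes; the function names `dictionary_encode` and `dictionary_decode` are made up for illustration, and real formats work on byte patterns rather than words.

```python
def dictionary_encode(text):
    """Replace each word with its index in a dictionary of unique words."""
    dictionary = []   # index -> word
    positions = {}    # word -> index, for fast lookup
    codes = []
    for word in text.split(" "):
        if word not in positions:
            positions[word] = len(dictionary)
            dictionary.append(word)
        codes.append(positions[word])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the exact original text by looking each code up."""
    return " ".join(dictionary[code] for code in codes)


text = "the cat sat on the mat and the dog sat on the cat"
dictionary, codes = dictionary_encode(text)
restored = dictionary_decode(dictionary, codes)

print(dictionary)        # ['the', 'cat', 'sat', 'on', 'mat', 'and', 'dog']
print(codes)             # [0, 1, 2, 3, 0, 4, 5, 0, 6, 2, 3, 0, 1]
print(restored == text)  # True: nothing was lost
```

The dictionary has to be stored or sent with the codes, so the saving only appears when sequences repeat often enough to outweigh that overhead.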

Symmetric and asymmetric encryption
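
The contrast between one shared key and a public/private pair can be seen in a toy sketch. Neither part is secure: the symmetric half uses a repeating XOR as a stand-in for a real cipher, and the asymmetric half is textbook RSA with tiny primes. The key `b"k3y"`, the primes 61 and 53 and the exponent 17 are illustrative assumptions only.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Symmetric: the SAME shared key encrypts and decrypts.
shared_key = b"k3y"
ciphertext = xor_cipher(b"MEET AT NOON", shared_key)
plaintext = xor_cipher(ciphertext, shared_key)
print(plaintext)  # b'MEET AT NOON'

# Asymmetric: textbook RSA with tiny primes (insecure, for illustration).
p, q = 61, 53
n = p * q                           # part of both keys
e = 17                              # public exponent: (e, n) can be shared freely
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, kept secret (Python 3.8+)

message = 65
encrypted = pow(message, e, n)      # anyone can encrypt with the public key
decrypted = pow(encrypted, d, n)    # only the private key holder can decrypt
print(decrypted == message)         # True
```

Notice that the symmetric half needs both parties to already hold `shared_key`, whereas in the asymmetric half only `d` has to stay secret.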

Hashing
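
The properties of a hash function (deterministic, fixed-length output, very different digests for different inputs) can be checked with a short sketch. SHA-256 from Python's standard `hashlib` is used here as an example hash function; the helper name `digest` is an assumption for illustration.

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 digest of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


a = digest("password")
b = digest("password")
c = digest("Password")          # one character changed
long_digest = digest("x" * 10_000)

print(a == b)                    # True: same input always gives the same digest
print(a == c)                    # False: a small change gives a different digest
print(len(a), len(long_digest))  # 64 64: output length is fixed whatever the input size
```

There is no matching "undigest" function: the only practical way to find an input for a given digest is to guess inputs and hash them.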

Run length encode a row of pixels

A monochrome image row is W W W W W B B W W W W W W where W is white and B is black. Apply run length encoding and find the compression, assuming each symbol and each count takes one unit.

Step 1: Identify the runs

Reading left to right: five W, then two B, then six W.

Step 2: Encode each run as a count and a value

Five W -> 5W, two B -> 2B, six W -> 6W. The encoded form is 5W 2B 6W.

Step 3: Compare sizes

The original has 13 symbols. The encoded form has 3 pairs = 6 units (three values plus three counts).

Step 4: State the result

Run length encoding reduced 13 units to 6, a saving of 7 out of 13 or roughly 54 percent, and it is lossless because the exact original row can be rebuilt from the pairs. RLE works here because of the long identical runs; a row alternating W B W B would not compress, because every run of length one still costs two units, so the encoded form would be larger than the original.
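
The four steps can be reproduced in a few lines of Python. This is a minimal sketch using the same assumption as the worked example (each symbol and each count costs one unit); the function names `rle_encode` and `rle_decode` are illustrative.

```python
from itertools import groupby


def rle_encode(symbols):
    """Turn a sequence into (count, value) pairs, one pair per run."""
    return [(len(list(run)), value) for value, run in groupby(symbols)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in pairs for _ in range(count)]


row = "W W W W W B B W W W W W W".split()
pairs = rle_encode(row)
print(pairs)                      # [(5, 'W'), (2, 'B'), (6, 'W')]

encoded_units = 2 * len(pairs)    # one unit per count plus one per value
reduction = 1 - encoded_units / len(row)
print(len(row), encoded_units)    # 13 6
print(round(reduction * 100))     # 54

print(rle_decode(pairs) == row)   # True: lossless

alternating = "W B W B".split()
print(2 * len(rle_encode(alternating)))  # 8 units for 4 symbols: RLE made it bigger
```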

Examples in context

HTTPS uses asymmetric encryption to agree a symmetric session key, then symmetric encryption for the fast bulk transfer. Streaming services use lossy codecs to fit video down a limited connection, while a software download uses lossless ZIP so the program is byte-perfect. Websites store salted password hashes so a stolen database does not reveal the passwords. Download pages publish a file's hash so you can verify the copy you received was not tampered with. OCR links this to network security and to databases (hashing for indexing).
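
Checking a download against a published hash might look like the sketch below. It assumes the publisher uses SHA-256 and that the file is already in memory as bytes; the sample contents and the helper names `sha256_hex` and `matches_published_hash` are invented for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_hash(data: bytes, published_hex: str) -> bool:
    """True if the received data hashes to the value the publisher listed."""
    return hmac.compare_digest(sha256_hex(data), published_hex.lower())


original = b"installer contents v1.0"
published = sha256_hex(original)   # what the download page would display

tampered = b"installer contents v1.0 plus malware"

print(matches_published_hash(original, published))  # True
print(matches_published_hash(tampered, published))  # False
```

The check only proves the file matches the published digest, so the digest itself has to come from a source you trust.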

Try this

Q1. State one type of file for which lossless compression is essential and explain why. [2 marks]

Cue. A text document or program file: losing any data would corrupt the content or stop the program running, so the exact original must be recoverable.

Q2. Explain the key distribution problem that symmetric encryption faces. [2 marks]

Cue. Both parties need the same secret key, but sharing it over an insecure channel risks interception; asymmetric encryption solves this by using a freely shareable public key.

Q3. State why a website stores a hash of a password rather than the password itself. [1 mark]

Cue. Hashing is one-way, so a stolen database does not directly reveal the passwords; logins are checked by comparing hashes.

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR 2019 (6 marks). Explain the difference between lossy and lossless compression, and state an appropriate use for each, justifying your choices.

Worked answer

Award up to three marks for each compression type with its use.

Lossless compression reduces file size while allowing the exact original to be reconstructed; no data is permanently discarded (techniques include run length encoding and dictionary coding). Appropriate use: text documents, program code, spreadsheets or a ZIP archive, where losing any data would corrupt the file.

Lossy compression reduces file size by permanently removing data the human eye or ear is unlikely to notice, so the original cannot be perfectly restored, but the compression ratio is much higher. Appropriate use: photos on web pages and streamed music or video (for example JPEG images and MP3 audio), where a small quality loss is acceptable in exchange for a much smaller file.

Markers reward the reversible-versus-irreversible distinction plus a justified use for each (exactness needed -> lossless; bandwidth/size matters and small quality loss acceptable -> lossy).
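
The reversible-versus-irreversible distinction can be demonstrated directly. In this sketch `zlib` from the Python standard library stands in for a lossless compressor, and dropping the low four bits of each sample stands in for a lossy step; the sample values are made up, and real lossy codecs such as JPEG or MP3 are far more sophisticated.

```python
import zlib

# Made-up 8-bit samples with some repetition.
samples = bytes([10, 11, 12, 13, 200, 201, 202, 203] * 50)

# Lossless: decompressing gives back exactly the original bytes.
compressed = zlib.compress(samples)
restored = zlib.decompress(compressed)
print(restored == samples)             # True
print(len(compressed) < len(samples))  # True: smaller, yet fully recoverable

# Lossy (crude stand-in): keep only the top 4 bits of each sample.
quantised = bytes(s & 0xF0 for s in samples)
print(quantised == samples)            # False: detail has been discarded
print(sorted(set(samples)))            # [10, 11, 12, 13, 200, 201, 202, 203]
print(sorted(set(quantised)))          # [0, 192]
```

Once 10, 11, 12 and 13 have all become 0 there is no way to tell which one each sample used to be, which is exactly why lossy compression cannot be used for text or program files.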

OCR 2021 (6 marks). A website stores user passwords. Explain why the passwords should be hashed rather than encrypted, and explain how hashing differs from encryption.

Worked answer

Hashing versus encryption (up to 4): encryption is a reversible two-way process, in which ciphertext can be decrypted back to plaintext with the key, whereas hashing is a one-way process that maps input to a fixed-length digest and cannot (feasibly) be reversed. A good hash function is deterministic (the same input always gives the same hash), produces a fixed-length output and makes collisions extremely unlikely.

Why hash passwords (up to 2): the system never needs the original password back; it only needs to check a login by hashing the entered password and comparing digests. Storing hashes means that if the database is stolen, the actual passwords are not directly exposed, whereas reversible encryption could be decrypted if the key were also compromised. Markers reward one-way versus two-way plus the security argument for not needing the original.
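
The login check described above could be sketched as follows. This is one possible approach, not a prescribed one: it assumes a random per-user salt and the PBKDF2 function from Python's `hashlib`, and the iteration count and the names `hash_password`, `register` and `check_login` are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen for illustration


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a fixed-length digest from the password and salt (one-way)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(password: str):
    """Return the (salt, digest) pair to store; the password itself is not kept."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def check_login(entered: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the entered password the same way and compare the digests."""
    return hmac.compare_digest(hash_password(entered, salt), stored_digest)


salt, stored = register("correct horse")
print(check_login("correct horse", salt, stored))  # True
print(check_login("wrong guess", salt, stored))    # False
```

The database holds only `salt` and `stored`, so the system never needs to recover the original password, which is the point the mark scheme rewards.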

Sources & how we know this

OCR AS and A Level Computer Science (H046, H446) specification — OCR (2015)