How do we shrink data, keep it secret, and verify it without storing the original?
Compression (lossy and lossless, run length encoding and dictionary coding), encryption (symmetric and asymmetric) and hashing, including their characteristics, differences and appropriate uses.
An OCR H446 answer on compression, encryption and hashing: lossy versus lossless compression with run length encoding and dictionary coding, symmetric versus asymmetric encryption, and how hashing works, with the characteristics, differences and appropriate uses of each.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
OCR wants you to distinguish lossy from lossless compression (including the techniques of run length encoding and dictionary coding), distinguish symmetric from asymmetric encryption, and explain how hashing works, giving an appropriate use for each. Expect "explain the difference" and "justify a use" questions.
The answer
Lossy and lossless compression
Symmetric and asymmetric encryption
Hashing
Examples in context
HTTPS uses asymmetric encryption to agree a symmetric session key, then symmetric encryption for the fast bulk transfer. Streaming services use lossy codecs to fit video down a limited connection, while a software download uses lossless ZIP so the program is byte-perfect. Websites store salted password hashes so a stolen database does not reveal the passwords. Download pages publish a file's hash so you can verify the copy you received was not tampered with. OCR links this to network security and to databases (hashing for indexing).
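The HTTPS pattern described above (asymmetric encryption to share a session key, then symmetric encryption for the bulk data) can be sketched in a few lines. This is a toy illustration only: the tiny primes, the key value 1234 and the helper names `rsa_encrypt`, `rsa_decrypt` and `keystream_xor` are all assumptions made for the example, the XOR keystream is not a real cipher, and real TLS differs considerably in detail.

```python
import hashlib

# Toy RSA key pair built from tiny primes (illustrative values, trivially breakable).
P, Q = 61, 53
N = P * Q                            # modulus, shared by both keys
E = 17                               # public exponent: (E, N) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def rsa_encrypt(m, e=E, n=N):
    """Anyone holding the public key can encrypt a small number m < n."""
    return pow(m, e, n)


def rsa_decrypt(c, d=D, n=N):
    """Only the holder of the private exponent can reverse the encryption."""
    return pow(c, d, n)


def keystream_xor(data, session_key):
    """Toy symmetric cipher: XOR the data with a keystream derived from the key.

    Applying it twice with the same key returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


# Client picks a session key and wraps it with the server's public key.
session_key = 1234
wrapped = rsa_encrypt(session_key)

# Server unwraps it with the private key; both sides now share the same key.
server_key = rsa_decrypt(wrapped)

# Bulk data then uses the fast symmetric step in both directions.
ciphertext = keystream_xor(b"GET /account", session_key)
plaintext = keystream_xor(ciphertext, server_key)
```

The slow asymmetric step runs once on a small value, while the symmetric step handles every byte of traffic. That is the trade-off the paragraph above describes.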
Try this
Q1. State one type of file for which lossless compression is essential and explain why. [2 marks]
- Cue. A text document or program file: losing any data would corrupt the content or stop the program running, so the exact original must be recoverable.
Q2. Explain the key distribution problem that symmetric encryption faces. [2 marks]
- Cue. Both parties need the same secret key, but sharing it over an insecure channel risks interception. Asymmetric encryption avoids this: the public key can be shared freely because only the matching private key can decrypt.
Q3. State why a website stores a hash of a password rather than the password itself. [1 mark]
- Cue. Hashing is one-way, so a stolen database does not directly reveal the passwords; logins are checked by comparing hashes.
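The idea behind Q3 can be sketched with Python's standard library. This is a minimal sketch, not a production design: the function names `hash_password` and `check_login`, the 16-byte salt and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def check_login(entered, salt, stored_digest, iterations=100_000):
    """Hash the entered password with the stored salt and compare the digests."""
    _, candidate = hash_password(entered, salt, iterations)
    return hmac.compare_digest(candidate, stored_digest)


# The database keeps only the salt and the digest, never the password itself.
salt, stored = hash_password("correct horse")
login_ok = check_login("correct horse", salt, stored)
login_bad = check_login("wrong guess", salt, stored)
```

Because each user gets a different salt, two users with the same password end up with different stored digests, so a stolen table does not reveal which accounts share a password.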
Exam-style practice questions
Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
OCR 2019 [6 marks]. Explain the difference between lossy and lossless compression, and state an appropriate use for each, justifying your choices.
Award up to three marks for each compression type with its use.
Lossless compression reduces file size while allowing the exact original to be reconstructed; no data is permanently discarded (techniques include run length encoding and dictionary coding). Appropriate use: text documents, program code, spreadsheets or a ZIP archive, where losing any data would corrupt the file.
Lossy compression reduces file size by permanently removing data the human eye or ear is unlikely to notice, so the original cannot be perfectly restored, but the compression ratio is much higher. Appropriate use: photos, music or streamed video (for example JPEG images and MP3 audio), where a small quality loss is acceptable in exchange for a much smaller file.
Markers reward the reversible-versus-irreversible distinction plus a justified use for each (exactness needed -> lossless; bandwidth/size matters and small quality loss acceptable -> lossy).
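The reversible-versus-irreversible distinction in this answer can be seen in a small sketch. The function names are made up for the example, the dictionary coder works on whole words only, and the quantisation step is a crude stand-in for what real lossy codecs do, so treat this as an illustration rather than how ZIP, JPEG or MP3 actually work.

```python
def rle_encode(text):
    """Run length encoding: store each run as a (character, count) pair."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((ch, 1))               # start a new run
    return runs


def rle_decode(runs):
    """Rebuild the exact original string from the runs."""
    return "".join(ch * count for ch, count in runs)


def dict_encode(text):
    """Dictionary coding: replace each word with its index in a table of unique words."""
    dictionary = []
    indexes = []
    for word in text.split(" "):
        if word not in dictionary:
            dictionary.append(word)
        indexes.append(dictionary.index(word))
    return dictionary, indexes


def dict_decode(dictionary, indexes):
    """Look each index up again to recover the exact original text."""
    return " ".join(dictionary[i] for i in indexes)


def lossy_quantise(samples, step):
    """Lossy: round each sample to the nearest multiple of step, discarding detail."""
    return [round(s / step) * step for s in samples]


runs = rle_encode("AAAABBBCCD")
words, codes = dict_encode("the cat sat on the mat the end")
coarse = lossy_quantise([12, 17, 23, 28], 10)
```

Both lossless decoders return exactly what went in, whereas nothing can turn the quantised list back into the original samples: 17 and 23 both became 20, so the difference between them is gone for good.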
OCR 2021 [6 marks]. A website stores user passwords. Explain why the passwords should be hashed rather than encrypted, and explain how hashing differs from encryption.
Hashing versus encryption (up to 4): encryption is a reversible two-way process, in which ciphertext can be decrypted back to plaintext with the key, whereas hashing is a one-way process that maps input to a fixed-length digest and cannot (feasibly) be reversed. A good hash function is deterministic (same input gives same hash), produces a fixed-length output and makes collisions extremely unlikely.
Why hash passwords (up to 2): the system never needs the original password back; it only needs to check a login by hashing the entered password and comparing digests. Storing hashes means that if the database is stolen, the actual passwords are not directly exposed, whereas reversible encryption could be decrypted if the key were also compromised. Markers reward one-way versus two-way plus the security argument for not needing the original.
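The properties named in this answer can be checked directly. The sketch below uses SHA-256 from Python's `hashlib` for the hash. The `xor_cipher` function is a deliberately simple toy, assumed here purely to show that encryption is reversible with the key, and is not a secure cipher.

```python
import hashlib


def sha256_hex(data):
    """One-way: returns a fixed-length hex digest of the input bytes."""
    return hashlib.sha256(data).hexdigest()


def xor_cipher(data, key):
    """Toy two-way cipher: applying it again with the same key undoes it."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Deterministic: the same input always gives the same digest.
same = sha256_hex(b"password123") == sha256_hex(b"password123")

# Fixed length: one byte or ten thousand bytes, the digest is 64 hex characters.
short_len = len(sha256_hex(b"a"))
long_len = len(sha256_hex(b"a" * 10_000))

# A one-character change gives a different digest.
changed = sha256_hex(b"password123") != sha256_hex(b"password124")

# Encryption is reversible: the key turns the ciphertext back into the plaintext.
secret = xor_cipher(b"meet at noon", b"key")
recovered = xor_cipher(secret, b"key")
```

There is no `sha256_unhex` to write: the digest does not contain enough information to rebuild the input, which is why a login check re-hashes the entered password instead of decrypting the stored value.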