How does a hash table give near-instant lookup, and how are collisions handled?
Understand a hash table, the role of a hashing algorithm, how a key maps to an index, the meaning of a collision, and collision-resolution methods such as rehashing and chaining.
A focused answer to AQA A-Level Computer Science 4.2.6, covering hash tables, hashing algorithms, mapping a key to an index, collisions, load factor, and collision-resolution methods including rehashing and chaining.
Reviewed by: AI editorial process; not yet individually human-reviewed
Have a quick question? Jump to the Q&A page
What this dot point is asking
AQA wants you to explain how a hash table stores and retrieves data using a hashing algorithm, define a collision, and describe collision-resolution methods such as rehashing (open addressing) and chaining.
How a hash table works
A good hash function is fast to compute, deterministic (the same key always gives the same index), distributes keys evenly across the table, and uses the whole table to minimise clustering. Poor functions that map many keys to the same region cause frequent collisions and destroy the speed advantage, so the quality of the hash function is central to performance.
Collisions
The load factor is the ratio of stored items to slots. A low load factor (a table much larger than the number of items) means few collisions and fast operations, but wastes memory; a high load factor saves memory but slows operations as more probing or longer chains are needed. Balancing this is why tables are resized.
Resolving collisions
When the load factor gets too high, the table is made larger and all items are re-inserted (resized and rehashed) to keep operations fast, because the indices depend on the table size and must be recomputed. Open addressing keeps everything in one array (good cache behaviour) but suffers clustering, where probed items pile up near a busy slot; chaining avoids clustering and can exceed the slot count, at the cost of extra memory and pointer following.
Hash tables are the standard way to implement the dictionary abstract data type, and the two topics are closely linked in the specification. The dictionary describes the behaviour (store and retrieve values by unique key); the hash table is the mechanism that makes that behaviour fast. This is also why hash tables suit any problem that needs near-instant membership testing or look-up: spell checkers store a dictionary of valid words and hash each typed word to check it in constant time, caches store recent results keyed by their inputs, and database indexes use hashing to jump straight to a record.
A limitation worth stating in an evaluation answer is that hashing destroys order. Because an item's position is determined by its hash value, the items are scattered across the array with no relationship to their keys' natural order, so a hash table cannot efficiently list its contents in sorted order or answer range queries (all keys between two bounds). When ordered output or range queries matter, a binary search tree is preferred even though its look-up is the slightly slower . Recognising this trade-off, fast unordered look-up versus slower ordered access, is exactly the kind of comparison examiners ask for.
Exam-style practice questions
Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AQA 20194 marksA hash table has 10 slots (indices 0 to 9). The hash function is index = key MOD 10. The keys 23, 45, 33 are inserted in that order, with collisions resolved by linear probing. State the slot each key is stored in and explain any probing that occurs.Show worked answer β
Key 23: , slot 3 is free, so 23 goes in slot 3.
Key 45: , slot 5 is free, so 45 goes in slot 5.
Key 33: , but slot 3 is occupied by 23, a collision. Linear probing checks the next slot, index 4, which is free, so 33 is stored in slot 4.
Markers reward computing each hash, identifying the collision for key 33, and resolving it by probing to the next free slot (index 4).
AQA 20214 marksExplain what is meant by a collision in a hash table, and compare rehashing (open addressing) with chaining as methods of resolving collisions.Show worked answer β
A collision occurs when two different keys are hashed to the same array index, so they cannot both occupy that slot directly.
Rehashing (open addressing) keeps all items in the array itself: when the target slot is taken, the algorithm probes for another slot (for example the next one, with linear probing) and stores the item there; retrieval probes the same way until the key or an empty slot is found. Chaining instead stores a pointer to a linked list at each index, and all items hashing there are appended to that list; retrieval hashes to the index then searches the short list.
A key difference is that open addressing can fill the array and suffers clustering, while chaining can hold more items than slots but uses extra memory for the lists.
Markers reward the definition of a collision and a correct contrast of the two resolution methods.
Related dot points
- Understand a dictionary as an abstract data type of key-value pairs, its operations, how it is typically implemented using a hash table, and when a dictionary is appropriate.
A focused answer to AQA A-Level Computer Science 4.2.7, covering the dictionary abstract data type of key-value pairs, its operations, its typical implementation using a hash table, and when to use a dictionary.
- Understand arrays (one, two and three dimensional), records and fields, and the difference between static and dynamic data structures.
A focused answer to AQA A-Level Computer Science 4.2.1, covering one, two and three dimensional arrays, records and fields, indexing, and the difference between static and dynamic data structures.
- Understand trees as a connected, undirected graph with no cycles, the terms root, child, parent, leaf and subtree, and the structure and use of a binary search tree.
A focused answer to AQA A-Level Computer Science 4.2.5, covering trees as connected acyclic graphs, the terminology root, child, parent and leaf, binary trees, and the structure and use of a binary search tree.
- Understand the linear search and binary search algorithms, how each works, their requirements, and their time complexity.
A focused answer to AQA A-Level Computer Science 4.3.2, covering the linear search and binary search algorithms, how each works, the requirement that binary search needs sorted data, and their time complexity.
- Understand graphs, vertices and edges, directed, undirected and weighted graphs, and how a graph is represented as an adjacency matrix or an adjacency list.
A focused answer to AQA A-Level Computer Science 4.2.4, covering graphs, vertices and edges, directed, undirected and weighted graphs, and the adjacency matrix and adjacency list representations with their trade-offs.
Sources & how we know this
- AQA A-level Computer Science (7517) specification β AQA (2015)