What is a relational database and why do we normalise to third normal form?
Understand relational databases, primary and foreign keys, the problems of data redundancy, and normalisation to first, second and third normal form.
A focused answer to AQA A-Level Computer Science 4.10.2, covering relational databases, primary and foreign keys, the problems of data redundancy, and normalisation to first, second and third normal form.
Reviewed by: AI editorial process; not yet individually human-reviewed
Have a quick question? Jump to the Q&A page
Jump to a section
What this dot point is asking
AQA wants you to describe a relational database with primary and foreign keys, explain the problems caused by data redundancy, and normalise data to first, second and third normal form.
Relational databases and keys
The foreign key is the mechanism that turns the conceptual relationships from the ER model into a working database. A one-to-many relationship is implemented by placing a foreign key on the "many" side pointing to the primary key on the "one" side, so each Order row carries the CustomerID of the customer who placed it. Following these key links is exactly what an SQL join does when it recombines the data, which is why keys and joins are so closely tied.
The problem of redundancy
The three anomalies are best understood through a single bad example: a sales table that stores the customer's address on every order line. Changing the address means editing every line (update anomaly), you cannot record a new customer until they place an order (insertion anomaly), and deleting their last order erases their address entirely (deletion anomaly). Splitting customer data into its own table fixes all three at once, which is the motivation for normalising.
Normalisation to third normal form
A common summary is that in 3NF every non-key attribute depends on "the key, the whole key, and nothing but the key". The three forms build on one another in order, and 2NF only bites when the primary key is composite (made of more than one field), because partial dependency means depending on part of that composite key. Normalisation reduces redundancy, removes anomalies and keeps data consistent, at the cost of needing joins to recombine the data when it is queried, which is a deliberate trade-off favouring integrity over query simplicity.
Exam-style practice questions
Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AQA 20196 marksA single table stores Orders with the fields OrderID, CustomerName, CustomerAddress and a list of ProductNames for each order. Explain why this table is not in first normal form, then describe how you would normalise it towards third normal form.Show worked answer →
The table is not in first normal form because the ProductNames field holds a repeating group (a list of products) rather than a single atomic value. First normal form requires each field to hold one value and the table to have a primary key.
To reach 1NF, remove the repeating group: create a separate row (or table) so each order-product combination is one record, giving an OrderLine entity with OrderID and ProductName, with a primary key.
To reach 2NF, remove partial dependencies: if a composite key is used, any attribute depending on only part of it (for example a product's price depending only on the product, not the whole order line) is moved to its own Product table.
To reach 3NF, remove transitive dependencies: CustomerName and CustomerAddress depend on the customer, not directly on the order, so they are moved to a Customer table, linked by a CustomerID foreign key in Order. Now every non-key attribute depends only on its table's primary key.
Markers reward identifying the repeating group (not 1NF), splitting it out, removing partial dependencies for 2NF, and moving customer details out to remove the transitive dependency for 3NF.
AQA 20214 marksExplain the problems caused by data redundancy in an unnormalised database and how normalisation addresses them.Show worked answer →
Data redundancy means the same information is stored repeatedly. This wastes storage and causes anomalies. An update anomaly arises because a value (such as a customer's address) stored in many rows must be changed in every one, and missing any leaves the data inconsistent. An insertion anomaly arises when you cannot add one piece of data without also supplying unrelated data. A deletion anomaly arises when removing a record accidentally loses other information that was only stored there.
Normalisation organises the data into related tables so each fact is stored once, removing the repetition and therefore the anomalies, with foreign keys linking the tables so the data can be recombined.
Markers reward naming and explaining the update, insertion and deletion anomalies and stating that normalisation stores each fact once to remove them.
Related dot points
- Understand conceptual data modelling, entities, attributes and relationships, entity relationship (ER) diagrams, and the degrees of relationship (one-to-one, one-to-many, many-to-many).
A focused answer to AQA A-Level Computer Science 4.10.1, covering conceptual data modelling, entities, attributes and relationships, entity relationship diagrams, and the degrees of relationship including resolving many-to-many.
- Understand SQL for retrieving data with SELECT, FROM, WHERE and ORDER BY, joining tables, and modifying data with INSERT, UPDATE and DELETE.
A focused answer to AQA A-Level Computer Science 4.10.3 and 4.10.4, covering SQL for retrieving data with SELECT, FROM, WHERE and ORDER BY, joining tables, and modifying data with INSERT, UPDATE and DELETE.
- Understand transaction processing, the ACID properties, record locking and the deadlock that can result, redundancy and database recovery.
A focused answer to AQA A-Level Computer Science 4.10.5, covering transaction processing, the ACID properties, record locking and deadlock, redundancy, and database recovery from failure.
- Understand arrays (one, two and three dimensional), records and fields, and the difference between static and dynamic data structures.
A focused answer to AQA A-Level Computer Science 4.2.1, covering one, two and three dimensional arrays, records and fields, indexing, and the difference between static and dynamic data structures.
- Understand the client-server and peer-to-peer models, web technologies including HTML, CSS and JavaScript, the role of web servers, and the use of APIs and thin versus thick clients.
A focused answer to AQA A-Level Computer Science 4.8.6, covering the client-server and peer-to-peer models, web technologies including HTML, CSS and JavaScript, the role of web servers, and thin versus thick clients.
Sources & how we know this
- AQA A-level Computer Science (7517) specification — AQA (2015)