How does normalisation remove the redundancy and update anomalies from a flat table?
Normalise data from unnormalised form to first, second and third normal form, using functional dependency and primary, foreign and composite keys to remove redundancy and anomalies.
A focused answer to the SQA Advanced Higher Computing Science content on normalisation, covering unnormalised form, first, second and third normal form, functional dependency, and how normalisation removes redundancy and update anomalies.
Reviewed by: AI editorial process; not yet individually human-reviewed
Have a quick question? Jump to the Q&A page
Jump to a section
What this dot point is asking
The SQA wants you to normalise data: take an unnormalised collection of attributes and work it through first, second and third normal form, using functional dependencies and keys, so that the final tables are free of redundancy and update anomalies. You must be able to state the rule for each form and apply it to a worked example.
Why normalise
A single flat table that crams everything together suffers from anomalies. Data redundancy stores the same fact repeatedly; an update anomaly arises when a duplicated fact is changed in some copies but not others, leaving the data inconsistent; an insertion anomaly stops you recording one fact without another being present; a deletion anomaly loses a fact you wanted to keep when you delete a row for another reason. Normalisation cures all of these by splitting the data into well-chosen tables where each fact lives in exactly one place.
First normal form (1NF)
A table is in 1NF when it has no repeating groups, every attribute holds a single atomic value, and there is a primary key identifying each row. You reach 1NF by separating any repeating group into its own rows. For example, an order that listed several products in one record is split so each product appears in its own row, identified by a composite key of order and product.
Second normal form (2NF)
A table is in 2NF when it is in 1NF and has no partial-key dependencies: every non-key attribute depends on the whole primary key, not just part of a composite key. This step only does work when the key is composite. Attributes that depend on only one part of the key are removed to a table keyed by that part.
Third normal form (3NF)
A table is in 3NF when it is in 2NF and has no transitive dependencies: no non-key attribute depends on another non-key attribute. Such an attribute is moved to a table keyed by the attribute it really depends on. 3NF is the target for Advanced Higher: at 3NF every non-key attribute depends on the key, the whole key, and nothing but the key.
Try this
Q1. State the rule that a table must satisfy to be in third normal form, beyond being in 2NF. [1 mark]
- Cue. No transitive dependencies: no non-key attribute depends on another non-key attribute.
Q2. Name the anomaly where changing a duplicated fact in some rows but not others leaves the data inconsistent. [1 mark]
- Cue. An update anomaly.
Exam-style practice questions
Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AH style: define 2NF3 marksState the conditions a table must satisfy to be in second normal form (2NF).Show worked answer →
The table must already be in first normal form: no repeating groups and a primary key identifying each row (1 mark). It must have no partial-key dependencies: every non-key attribute depends on the whole primary key, not on just part of a composite key (1 mark). So 2NF only does work beyond 1NF when the primary key is composite; attributes depending on only one part of that key are moved to their own table (1 mark). Markers want 1NF as a precondition plus the removal of partial dependencies.
AH style: why normalise4 marksExplain two problems that normalisation removes from an unnormalised table, giving an example of each.Show worked answer →
Data redundancy: the same fact is stored many times, wasting space, for example a customer's address repeated on every one of their orders (1 mark for problem, 1 for example). Update anomaly: because a fact is duplicated, changing it in one place but not all leaves the data inconsistent, for example updating a price on one order row but not the others (1 mark for problem, 1 for example). Other valid answers are insertion and deletion anomalies. Normalisation removes these by storing each fact once in the correct table.
Related dot points
- Analyse data requirements and design a relational database using entities, attributes and relationships, entity-relationship diagrams, a data dictionary, and primary, foreign and composite keys.
A focused answer to the SQA Advanced Higher Computing Science content on database analysis and design, covering entities, attributes and relationships, entity-relationship diagrams, the data dictionary, and primary, foreign and composite keys.
- Implement and query a relational database in SQL, using data definition (CREATE, ALTER) and data manipulation (SELECT, INSERT, UPDATE, DELETE) with joins, aggregate functions, GROUP BY and HAVING, subqueries, wildcards and computed columns.
A focused answer to the SQA Advanced Higher Computing Science content on SQL, covering data definition and data manipulation, multi-table joins, aggregate functions with GROUP BY and HAVING, subqueries, wildcards and computed columns.
- Test SQL queries against expected results, maintain referential integrity, and evaluate a database solution for fitness for purpose and data integrity.
A focused answer to the SQA Advanced Higher Computing Science content on database testing and evaluation, covering testing SQL queries against expected results, referential integrity, entity integrity and evaluating a database for fitness for purpose.
- Describe and implement the data structures used in Advanced Higher: one- and two-dimensional arrays, records, sequential files, linked lists, stacks and queues, including their operations.
A focused answer to the SQA Advanced Higher Computing Science content on data structures, covering one- and two-dimensional arrays, records, sequential files, linked lists, stacks and queues with their core operations.