
How is large-scale data organised in normalised relational databases, and how do big data and data warehousing extend this?

Describe the organisation and structure of data: relational databases, normalisation to third normal form, SQL, and big data and data warehousing.

A focused answer to WJEC A-Level Computer Science Unit 4 organisation of data, covering relational databases and keys, normalisation to first, second and third normal form, SQL, and big data and data warehousing.

Generated by Claude Opus 4.813 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

WJEC wants you to describe how large-scale data is organised: relational databases and keys, normalisation to third normal form (1NF, 2NF, 3NF), SQL for querying and editing, and the modern extensions of big data and data warehousing. This extends the AS database topic to cover the formal normal forms and the contemporary data landscape. Expect a normalisation-to-3NF question and a big-data question, both of which reward precise definitions.

The answer

Relational databases and keys

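A relational database stores data in tables. A primary key uniquely identifies each record in a table, and a foreign key in one table refers to the primary key of another. The relational model's strength is that one fact is stored in one place and relationships are made explicit through keys, which is what normalisation formalises.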
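
As a minimal sketch of keys in practice, the Python below uses the standard-library sqlite3 module with an in-memory database. The Customer and Orders tables and their fields are invented for illustration and are not part of the WJEC specification. It shows a primary key identifying each record and a foreign key rejecting an order that points at a customer who does not exist.

```python
import sqlite3

# Hypothetical schema for illustration: Customer and Orders are invented names.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on

conn.execute("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,  -- primary key: uniquely identifies each record
        Name       TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),  -- foreign key
        OrderDate  TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO Customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO Orders VALUES (100, 1, '2024-01-15')")

# An order that references a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (101, 99, '2024-01-16')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The customer's name is stored once in Customer; every order refers to it through CustomerID rather than repeating it.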

Normalisation to third normal form

In a table that has reached 3NF, every non-key field depends on the key, the whole key and nothing but the key. This removes the insert, update and delete anomalies that redundancy causes.
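
One way to see what "depends on" means is to test sample rows for a dependency between two fields. The sketch below is illustrative only: the determines function and the order-line data are hypothetical, and sample data can only disprove a dependency, never prove one, since real dependencies come from the meaning of the data.

```python
def determines(rows, a, b):
    """Return True if, in these sample rows, each value of field a
    always appears with the same value of field b."""
    seen = {}
    for row in rows:
        # Remember the first b seen for this a; any later mismatch breaks the dependency.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True

# Hypothetical order-line rows; OrderLineID is the primary key.
order_lines = [
    {"OrderLineID": 1, "ProductID": "P1", "ProductName": "Kettle", "Quantity": 2},
    {"OrderLineID": 2, "ProductID": "P2", "ProductName": "Toaster", "Quantity": 1},
    {"OrderLineID": 3, "ProductID": "P1", "ProductName": "Kettle", "Quantity": 5},
]

# ProductName is fixed by ProductID, a non-key field: a transitive dependency, so not 3NF.
print(determines(order_lines, "ProductID", "ProductName"))  # True
# Quantity is not fixed by ProductID; it belongs to the order line itself.
print(determines(order_lines, "ProductID", "Quantity"))     # False
```

Here ProductName would move to a separate table keyed by ProductID, while Quantity stays with the order line.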

SQL

SQL queries data with SELECT ... FROM ... WHERE, combines tables with JOIN, and edits data with INSERT, UPDATE and DELETE. Aggregate functions and clauses such as GROUP BY and ORDER BY summarise and sort results. SQL is declarative: you state the result wanted, not how to fetch it.
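
The statements named above can be tried out with a short sketch, again using Python's sqlite3 module and an in-memory database. The Product and OrderLine tables and their contents are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and data for illustration.
conn.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT, Category TEXT);
    CREATE TABLE OrderLine (
        OrderLineID INTEGER PRIMARY KEY,
        ProductID   INTEGER REFERENCES Product(ProductID),
        Quantity    INTEGER
    );
    INSERT INTO Product VALUES (1, 'Kettle', 'Kitchen'), (2, 'Toaster', 'Kitchen'), (3, 'Lamp', 'Lighting');
    INSERT INTO OrderLine VALUES (1, 1, 2), (2, 2, 1), (3, 1, 5), (4, 3, 4);
""")

# SELECT ... FROM ... WHERE, combining two tables with JOIN.
kitchen = conn.execute("""
    SELECT Product.Name, OrderLine.Quantity
    FROM OrderLine
    JOIN Product ON Product.ProductID = OrderLine.ProductID
    WHERE Product.Category = 'Kitchen'
    ORDER BY OrderLine.OrderLineID
""").fetchall()

# The aggregate function SUM with GROUP BY summarises; ORDER BY sorts.
totals = conn.execute("""
    SELECT Product.Name, SUM(OrderLine.Quantity) AS Total
    FROM OrderLine
    JOIN Product ON Product.ProductID = OrderLine.ProductID
    GROUP BY Product.Name
    ORDER BY Total DESC
""").fetchall()

print(kitchen)  # [('Kettle', 2), ('Toaster', 1), ('Kettle', 5)]
print(totals)   # [('Kettle', 7), ('Lamp', 4), ('Toaster', 1)]

# Editing data with UPDATE and DELETE (INSERT was used above).
conn.execute("UPDATE Product SET Name = 'Desk lamp' WHERE ProductID = 3")
conn.execute("DELETE FROM OrderLine WHERE OrderLineID = 2")
```

Each query states which rows and columns are wanted; the database engine decides how to fetch them, which is what declarative means here.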

Big data and data warehousing

Big data is data too large, fast-changing or varied (volume, velocity, variety) for traditional tools, often requiring distribution across many machines. Data warehousing brings data together from many sources into one store optimised for analysis and reporting.

Examples in context

Example 1. A retailer reaching 3NF
A retailer's order table repeats the product name and category against every order line. Normalising to 3NF splits products into their own table keyed by ProductID, so each product's details are stored once and an order line just references the ProductID. Updating a product name then touches one row, not thousands. That is the practical payoff of 3NF.
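
The difference in update cost can be measured with a small sketch. The table names, the single product and the figure of 1000 order lines are all assumptions chosen for illustration; the cursor's rowcount reports how many rows each UPDATE changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalised: the product name is repeated on every order line.
    CREATE TABLE FlatOrderLine (
        OrderLineID INTEGER PRIMARY KEY, ProductID INTEGER, ProductName TEXT, Quantity INTEGER
    );
    -- Normalised: the product name is stored once and referenced by ProductID.
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE OrderLine (
        OrderLineID INTEGER PRIMARY KEY,
        ProductID   INTEGER REFERENCES Product(ProductID),
        Quantity    INTEGER
    );
    INSERT INTO Product VALUES (1, 'Kettle');
""")

# 1000 hypothetical order lines, all for product 1.
lines = [(i, 1, 1) for i in range(1, 1001)]
conn.executemany("INSERT INTO FlatOrderLine VALUES (?, ?, 'Kettle', ?)", lines)
conn.executemany("INSERT INTO OrderLine VALUES (?, ?, ?)", lines)

# Rename the product in each design and count the rows changed.
flat_rows = conn.execute(
    "UPDATE FlatOrderLine SET ProductName = 'Electric kettle' WHERE ProductID = 1"
).rowcount
normalised_rows = conn.execute(
    "UPDATE Product SET ProductName = 'Electric kettle' WHERE ProductID = 1"
).rowcount

print(flat_rows, normalised_rows)  # 1000 1
```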
Example 2. Sensor data as big data
A network of thousands of sensors streams readings every second, producing high-volume, high-velocity, semi-structured data. A relational database on a single server may not keep up with this rate and volume, so the data is distributed across many machines and processed with big-data tools. This shows why big data is a distinct problem rather than just a bigger version of a normal database.
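
The idea of distributing the data can be sketched in a single Python process. This is only an illustration: the four partitions stand in for separate machines, the readings are invented, and real big-data tools add networking, fault tolerance and much more. Each partition summarises its own share, and the small summaries are then combined.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # stand-in for separate machines; an assumption for illustration

# Hypothetical readings as (sensor_id, value) pairs.
readings = [(sensor_id, float(sensor_id % 10)) for sensor_id in range(1000)]

# Route each reading to a partition by its sensor ID.
partitions = defaultdict(list)
for sensor_id, value in readings:
    partitions[sensor_id % NUM_PARTITIONS].append(value)

# Each partition summarises only its own share of the data.
partials = [(sum(values), len(values)) for values in partitions.values()]

# Combine the small partial results into the overall mean.
total = sum(s for s, _ in partials)
count = sum(n for _, n in partials)
mean = total / count

print(count, mean)  # 1000 4.5
```

No single partition ever holds all the readings, yet the combined answer matches what one large table would give.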
Example 3. A data warehouse for analysis
A company copies sales, stock and customer data nightly from its operational databases into a data warehouse structured for fast analytical queries. Reports run against the warehouse without slowing the live systems. This illustrates the purpose of data warehousing: consolidating data for analysis separately from day-to-day transaction processing.
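
A nightly load of this kind might look like the sketch below. Everything named here is hypothetical: the two in-memory SQLite databases stand in for operational systems, a third stands in for the warehouse, and nightly_load is an invented helper. Real warehouses use dedicated loading tools and far richer structures.

```python
import sqlite3

# Two hypothetical operational databases.
sales_db = sqlite3.connect(":memory:")
sales_db.executescript("""
    CREATE TABLE Sale (SaleID INTEGER PRIMARY KEY, ProductID INTEGER, Quantity INTEGER);
    INSERT INTO Sale VALUES (1, 1, 2), (2, 2, 1), (3, 1, 5);
""")
stock_db = sqlite3.connect(":memory:")
stock_db.executescript("""
    CREATE TABLE Stock (ProductID INTEGER PRIMARY KEY, Name TEXT, InStock INTEGER);
    INSERT INTO Stock VALUES (1, 'Kettle', 40), (2, 'Toaster', 12);
""")

# The warehouse holds a consolidated table shaped for reporting.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("""
    CREATE TABLE ProductSummary (
        ProductID INTEGER PRIMARY KEY, Name TEXT, UnitsSold INTEGER, InStock INTEGER
    )
""")

def nightly_load():
    """Copy and combine data from both sources into the warehouse."""
    sold = dict(sales_db.execute(
        "SELECT ProductID, SUM(Quantity) FROM Sale GROUP BY ProductID"))
    rows = [(pid, name, sold.get(pid, 0), in_stock)
            for pid, name, in_stock in stock_db.execute(
                "SELECT ProductID, Name, InStock FROM Stock")]
    warehouse.execute("DELETE FROM ProductSummary")  # replace the previous night's copy
    warehouse.executemany("INSERT INTO ProductSummary VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()

nightly_load()

# Reports query the warehouse, not the live systems.
report = warehouse.execute(
    "SELECT Name, UnitsSold, InStock FROM ProductSummary ORDER BY UnitsSold DESC"
).fetchall()
print(report)  # [('Kettle', 7, 40), ('Toaster', 1, 12)]
```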

Try this

Q1. State the condition that a table in third normal form must satisfy beyond being in second normal form. [1 mark]

  • Cue. No non-key field depends on another non-key field (no transitive dependency); every non-key field depends only on the primary key.

Q2. State two of the characteristics commonly used to define big data. [2 marks]

  • Cue. Any two of volume (huge size), velocity (high rate of arrival) and variety (many formats, including unstructured data).

Exam-style practice questions

Practice questions written in the style of WJEC exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

WJEC 2020. State the conditions a relational database table must meet to be in first, second and third normal form. [6 marks]

State each normal form's condition in order, each building on the previous.

First normal form (1NF): the table contains no repeating groups and each field holds a single (atomic) value, with a primary key identifying each record.

Second normal form (2NF): the table is in 1NF and every non-key field depends on the whole primary key, not just part of it (this removes partial dependencies, relevant where the key is composite).

Third normal form (3NF): the table is in 2NF and there are no non-key fields that depend on other non-key fields (no transitive dependencies); every non-key field depends only on the primary key.

Markers reward 1NF (atomic values, no repeating groups, a primary key), 2NF (1NF plus no partial dependency on part of a composite key), and 3NF (2NF plus no transitive dependency between non-key fields).
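
The first two steps can be sketched in plain Python. The order data and the to_first_normal_form helper are hypothetical, chosen only to illustrate the definitions above: a repeating group is flattened into atomic rows (1NF), then the field that depends on only part of the composite key is moved out (2NF).

```python
# Hypothetical unnormalised records: each order holds a repeating group of items.
unnormalised = [
    {"OrderID": 1, "Customer": "Ana", "Items": [("Kettle", 2), ("Toaster", 1)]},
    {"OrderID": 2, "Customer": "Ben", "Items": [("Lamp", 4)]},
]

def to_first_normal_form(orders):
    """Flatten the repeating group so every field holds a single value."""
    rows = []
    for order in orders:
        for product, quantity in order["Items"]:
            rows.append({
                "OrderID": order["OrderID"],
                "Customer": order["Customer"],
                "Product": product,
                "Quantity": quantity,
            })
    return rows

# 1NF: the primary key is now the composite (OrderID, Product).
rows_1nf = to_first_normal_form(unnormalised)

# 2NF: Customer depends on OrderID alone (a partial dependency), so it moves to its own table.
orders = {row["OrderID"]: row["Customer"] for row in rows_1nf}
order_lines = [(row["OrderID"], row["Product"], row["Quantity"]) for row in rows_1nf]

print(orders)       # {1: 'Ana', 2: 'Ben'}
print(order_lines)  # [(1, 'Kettle', 2), (1, 'Toaster', 1), (2, 'Lamp', 4)]
```

Quantity stays in the order-line table because it depends on the whole composite key, not on OrderID or Product alone.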

WJEC 2022. Explain what is meant by big data, and state two reasons traditional relational databases can struggle to handle it. [4 marks]

Define big data, then give two genuine challenges for relational databases.

Big data refers to data sets so large, fast-changing or varied that traditional tools struggle to store, process and analyse them. It is often characterised by volume (huge size), velocity (high rate of arrival) and variety (many formats, including unstructured data).

Reasons relational databases struggle: first, the sheer volume can exceed what a single relational system can store and query efficiently, so the data must be distributed across many machines. Second, much big data is unstructured or semi-structured (text, images, sensor streams) and does not fit neatly into fixed relational tables.

Markers reward a definition referencing volume, velocity or variety, and two valid challenges such as scale beyond a single system or unstructured data not fitting relational tables.
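
The second challenge, variety, can be illustrated with a short sketch. The three JSON records are invented for this example. Forcing them into one fixed table needs a column for every field that ever appears, which leaves many cells empty, and the list-valued tags field is not atomic.

```python
import json

# Hypothetical semi-structured records: each one has a different shape.
raw = """
{"sensor": "t1", "temperature": 21.5}
{"sensor": "c7", "image_url": "frame-0001.jpg", "tags": ["car", "night"]}
{"sensor": "t2", "temperature": 19.0, "humidity": 0.61}
"""
records = [json.loads(line) for line in raw.strip().splitlines()]

# A single fixed table would need a column for every field that appears anywhere.
columns = sorted({field for record in records for field in record})

# Count the cells that would be left empty in that table.
empty_cells = sum(1 for record in records for column in columns if column not in record)

print(columns)      # ['humidity', 'image_url', 'sensor', 'tags', 'temperature']
print(empty_cells)  # 7
```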
