EnglandComputer ScienceSyllabus dot point

What are regular expressions, and how do regular and context-free languages differ?

Understand regular expressions and regular languages, the link between regular expressions and finite state machines, the limits of regular languages, and context-free languages described by a BNF grammar.

A focused answer to AQA A-Level Computer Science 4.4.3 and 4.4.4, covering regular expressions and regular languages, their link to finite state machines, the limits of regular languages, and context-free languages described using BNF.

Generated by Claude Opus 4.89 min answerUpdated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed

Have a quick question? Jump to the Q&A page

Jump to a section

What this dot point is asking
Regular expressions and regular languages
Link to finite state machines
Limits, and context-free languages

What this dot point is asking

AQA wants you to write and read regular expressions, link them to finite state machines, recognise what regular languages cannot describe, and read a context-free grammar written in Backus-Naur Form (BNF).

Regular expressions and regular languages

For example, (0|1)+ describes any non-empty binary string, and ab*c describes an a, then any number of bs, then a c. Building these up from the three operators is a common exam skill: alternation gives a choice, concatenation puts pieces in sequence, and the repetition operators control how many times a piece may appear. Reading an unfamiliar expression is the same process in reverse, identifying which operator binds where.

Link to finite state machines

This equivalence ties the topic to program translation. In the lexical analysis stage of compilation, the tokens of a language (identifiers, numbers, operators) are each described by a regular expression, and the lexer that recognises them is essentially a finite state machine. Because the two formalisms are interchangeable, a language designer can specify tokens with readable regular expressions and a tool can automatically build the equivalent finite state machine that does the recognising.

Limits, and context-free languages

BNF defines a grammar with rules of the form non-terminal then ::= then a definition, where <...> denotes a non-terminal and | separates alternatives. A rule can refer to itself (recursion), which is what gives BNF the power to describe nesting.

<digit>  ::= 0|1|2|3|4|5|6|7|8|9
<number> ::= <digit> | <digit><number>

The recursion in <number> (defined partly in terms of itself) is exactly what lets it generate numbers of any length, and the same recursion lets a grammar describe arbitrarily deep nesting such as brackets within brackets. This is why the syntax analysis stage of a compiler uses a context-free grammar rather than regular expressions: program structure is full of nesting (blocks inside blocks, expressions inside expressions) that regular languages cannot capture, which mirrors the FSM-versus-pushdown step in the computation hierarchy.

Build a regular expression and read a BNF rule

Target: write a regular expression for a simple identifier, then interpret a recursive BNF rule.

step 1: Define the identifier informally

Suppose an identifier is a single letter (a to z) followed by zero or more letters or digits.

step 2: Write the regular expression

Using alternation for the character sets and the star for repetition: a letter (a|b|...|z) then zero or more letters-or-digits (a|...|z|0|...|9)*. This is a regular language, recognisable by a finite state machine.

step 3: Read a recursive BNF rule

Consider <list> ::= <item> | <item>,<list>. The first alternative is a single item; the second is an item, a comma, then another list (the rule refers to itself).

step 4: Interpret the recursion

The recursion means <list> can be one item, or two items separated by a comma, or three, and so on without limit. This unbounded, self-referential structure is something a regular expression cannot express, which is why a context-free grammar in BNF is needed for it.

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 20194 marksWrite a regular expression that matches strings consisting of one or more letter a followed by exactly one letter b, then explain the link between regular expressions and finite state machines.

Show worked answer →

A suitable regular expression is a+b (one or more a, then a single b). Equivalently aa*b expresses the same set.

Regular expressions and finite state machines have equal power: any language that can be described by a regular expression can be recognised by a finite state machine, and any finite state machine can be written as an equivalent regular expression. The set of languages they both describe is exactly the regular languages. This equivalence is why regular expressions are used to define the tokens that a lexical analyser, which is effectively a finite state machine, recognises.

Markers reward a correct regular expression using + (or a a* ) for one or more a then a single b, and the statement that regular expressions and FSMs describe exactly the same class of languages (the regular languages).

AQA 20214 marksExplain why a regular language cannot describe matched, nested structures such as balanced brackets, and state how a context-free grammar in BNF can describe such structures.

Show worked answer →

A regular language is recognised by a finite state machine, which has only a finite number of states and therefore a bounded amount of memory. To check balanced brackets it would need to count an unbounded number of open brackets so it can match the same number of close brackets, but a finite state machine cannot count without limit because it has finitely many states. So balanced brackets are not a regular language.

A context-free grammar, written in Backus-Naur Form (BNF), can describe such structures because its rules allow recursion: a non-terminal can be defined in terms of itself, for example a bracketed expression defined as an open bracket, an expression, then a close bracket. This recursion lets the grammar generate arbitrarily deep nesting, which a regular expression cannot.

Markers reward the finite-memory or cannot-count argument for the limit of regular languages and the use of recursion in a BNF rule to handle unbounded nesting.

Related dot points

Sources & how we know this

AQA A-level Computer Science (7517) specification — AQA (2015)