How do you analyse a two-player zero-sum game, find a stable solution, and handle games with no saddle point?
Two-player zero-sum games, the pay-off matrix, play-safe strategies, saddle points and stable solutions, dominance to reduce a game, and mixed strategies including conversion to linear programming.
A focused answer to the AQA A-Level Further Mathematics game theory content, covering two-player zero-sum games, the pay-off matrix, play-safe strategies, saddle points and stable solutions, dominance to reduce a game, and mixed strategies including conversion to a linear programming problem.
Reviewed by: AI editorial process; not yet individually human-reviewed
Have a quick question? Jump to the Q&A page
Jump to a section
What this dot point is asking
AQA wants you to analyse a two-player zero-sum game from its pay-off matrix, find each player's play-safe strategy, identify a saddle point and stable solution when one exists, reduce a game using dominance, and find optimal mixed strategies, including converting a larger game into a linear programming problem.
Pay-off matrix and play-safe
Saddle points and stability
When the maximin equals the minimax, neither player can do better by changing alone, so the game is stable and the solution is a pair of pure strategies.
Dominance and mixed strategies
Before solving, reduce the game using dominance. A row is dominated if another row is at least as good entry by entry (never worse for the row player), and a dominated row can be deleted because the player would never choose it. Similarly a column is dominated if another column is never worse for the column player (that is, has smaller or equal entries). Repeatedly removing dominated rows and columns can shrink a large game to a one that is solvable by hand.
If no saddle point exists after reduction, each player uses a mixed strategy, choosing each option with a fixed probability so that the opponent cannot exploit a predictable pattern.
For larger games that do not reduce to , convert to a linear programming problem: add a constant to every pay-off so that all entries are positive (which does not change the optimal strategies, only shifts the value), then set up variables for the strategy probabilities and maximise the game value subject to the constraints that each pure-column expected pay-off is at least the value.
The reason the constant is needed is that the linear programming formulation treats the game value as a variable to be maximised, and the standard simplex method assumes a non-negative objective. Shifting every entry by a positive constant makes the value provably positive, so dividing through by to scale the probability variables is safe (you cannot divide by zero or by a negative). After solving, subtract to recover the true value of the original game. The strategy probabilities themselves are unaffected by the shift, because adding a constant to every outcome changes what each player wins by the same amount whatever they do, so it cannot change which mix is best.
Why mixed strategies work
A mixed strategy defends against an opponent who could otherwise exploit a predictable choice. If the row player always picked the row with the largest single entry, the column player would simply respond with the column that punishes that row hardest. By randomising in the right proportions, the row player makes every column the opponent might choose yield the same expected pay-off, so the opponent has no profitable reply and cannot push the outcome below the game value. This is the meaning of the equalising condition used in the worked solution: the optimal is the one that leaves the column player indifferent. The same logic, applied to the columns, gives the column player's optimal mix, and the two values agree at the game value , which is the central guarantee of the minimax theorem for zero-sum games.
A useful check on a solution is that the probabilities lie strictly between and . If solving for gives a value outside , the game in fact has a saddle point and one of the pure strategies is optimal after all, so you should return to the maximin and minimax comparison rather than forcing a mixed solution.
The standard exam sequence is therefore: write the pay-off matrix from the row player's viewpoint, check for a saddle point by comparing maximin and minimax, reduce by dominance if no saddle point is found, then either solve the resulting game by equating expected pay-offs or set up a linear program for a larger one. Keep the constant you add for the linear program in mind so you can subtract it back to recover the true game value at the end.
Exam-style practice questions
Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AQA 20206 marksA two-player zero-sum game has pay-off matrix (to the row player) . Show there is no stable solution, then find the row player's optimal mixed strategy and the value of the game.Show worked answer β
Check for a saddle point. Row minimums are and , so the maximin is . Column maximums are and , so the minimax is . Since there is no saddle point, so no stable (pure) solution exists.
Let the row player play row 1 with probability and row 2 with probability . Equate expected pay-offs against each column.
If the column player picks column 1: .
If the column player picks column 2: .
Set equal: , so and .
The value is . So the row player plays row 1 with probability and row 2 with probability , for a game value of .
Markers reward the saddle-point check, equating the two expected pay-offs, solving for , and the game value.
AQA 20224 marksExplain how the play-safe strategies of the two players are found in a zero-sum game, and state the condition under which the game has a stable solution.Show worked answer β
The row player examines the minimum entry in each row (the worst that can happen for that choice) and selects the row whose minimum is largest. This maximin strategy guarantees the best secured outcome.
The column player examines the maximum entry in each column (the worst, since pay-offs are losses to the column player) and selects the column whose maximum is smallest. This is the minimax strategy.
The game has a stable solution (a saddle point) when the maximin equals the minimax. At that entry neither player can improve by changing alone, so it is the value of the game.
Markers reward the maximin description, the minimax description, and the maximin equals minimax condition for stability.
Related dot points
- Formulating linear programming problems, the feasible region and graphical solution, the vertex (objective line) method, slack variables, and the simplex algorithm for maximisation.
A focused answer to the AQA A-Level Further Mathematics linear programming content, covering formulating problems, the feasible region and graphical solution, the vertex and objective line methods, slack variables, and the simplex algorithm for maximisation.
- Graph terminology including vertices, edges, degree, paths and cycles, special graphs such as trees and complete graphs, the adjacency matrix representation, and Eulerian and Hamiltonian graphs.
A focused answer to the AQA A-Level Further Mathematics graphs and networks content, covering graph terminology such as vertices, edges, degree, paths and cycles, special graphs such as trees and complete graphs, the adjacency matrix representation, and Eulerian and Hamiltonian graphs.
- Kruskal's and Prim's algorithms for minimum spanning trees, Dijkstra's algorithm for the shortest path, and the route inspection and travelling salesperson problems.
A focused answer to the AQA A-Level Further Mathematics network algorithms content, covering Kruskal's and Prim's algorithms for minimum spanning trees, Dijkstra's algorithm for the shortest path, and the route inspection and travelling salesperson problems.
- Activity networks, forward and backward passes to find earliest and latest times, the critical path, float, and resource scheduling with Gantt charts.
A focused answer to the AQA A-Level Further Mathematics critical path analysis content, covering activity networks, forward and backward passes to find earliest and latest times, the critical path, float, and resource scheduling with Gantt charts.
Sources & how we know this
- AQA A-level Further Mathematics (7367) specification β AQA (2017)