Skip to main content
EnglandFurther MathsSyllabus dot point

How do you analyse a two-player zero-sum game, find a stable solution, and handle games with no saddle point?

Two-player zero-sum games, the pay-off matrix, play-safe strategies, saddle points and stable solutions, dominance to reduce a game, and mixed strategies including conversion to linear programming.

A focused answer to the AQA A-Level Further Mathematics game theory content, covering two-player zero-sum games, the pay-off matrix, play-safe strategies, saddle points and stable solutions, dominance to reduce a game, and mixed strategies including conversion to a linear programming problem.

Generated by Claude Opus 4.811 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Have a quick question? Jump to the Q&A page

Jump to a section
  1. What this dot point is asking
  2. Pay-off matrix and play-safe
  3. Saddle points and stability
  4. Dominance and mixed strategies
  5. Why mixed strategies work

What this dot point is asking

AQA wants you to analyse a two-player zero-sum game from its pay-off matrix, find each player's play-safe strategy, identify a saddle point and stable solution when one exists, reduce a game using dominance, and find optimal mixed strategies, including converting a larger game into a linear programming problem.

Pay-off matrix and play-safe

Saddle points and stability

When the maximin equals the minimax, neither player can do better by changing alone, so the game is stable and the solution is a pair of pure strategies.

Dominance and mixed strategies

Before solving, reduce the game using dominance. A row is dominated if another row is at least as good entry by entry (never worse for the row player), and a dominated row can be deleted because the player would never choose it. Similarly a column is dominated if another column is never worse for the column player (that is, has smaller or equal entries). Repeatedly removing dominated rows and columns can shrink a large game to a 2Γ—22\times 2 one that is solvable by hand.

If no saddle point exists after reduction, each player uses a mixed strategy, choosing each option with a fixed probability so that the opponent cannot exploit a predictable pattern.

For larger games that do not reduce to 2Γ—22\times 2, convert to a linear programming problem: add a constant to every pay-off so that all entries are positive (which does not change the optimal strategies, only shifts the value), then set up variables for the strategy probabilities and maximise the game value subject to the constraints that each pure-column expected pay-off is at least the value.

The reason the constant is needed is that the linear programming formulation treats the game value VV as a variable to be maximised, and the standard simplex method assumes a non-negative objective. Shifting every entry by a positive constant kk makes the value provably positive, so dividing through by VV to scale the probability variables is safe (you cannot divide by zero or by a negative). After solving, subtract kk to recover the true value of the original game. The strategy probabilities themselves are unaffected by the shift, because adding a constant to every outcome changes what each player wins by the same amount whatever they do, so it cannot change which mix is best.

Why mixed strategies work

A mixed strategy defends against an opponent who could otherwise exploit a predictable choice. If the row player always picked the row with the largest single entry, the column player would simply respond with the column that punishes that row hardest. By randomising in the right proportions, the row player makes every column the opponent might choose yield the same expected pay-off, so the opponent has no profitable reply and cannot push the outcome below the game value. This is the meaning of the equalising condition E1=E2E_1 = E_2 used in the worked solution: the optimal pp is the one that leaves the column player indifferent. The same logic, applied to the columns, gives the column player's optimal mix, and the two values agree at the game value VV, which is the central guarantee of the minimax theorem for zero-sum games.

A useful check on a 2Γ—22 \times 2 solution is that the probabilities lie strictly between 00 and 11. If solving for pp gives a value outside [0,1][0, 1], the game in fact has a saddle point and one of the pure strategies is optimal after all, so you should return to the maximin and minimax comparison rather than forcing a mixed solution.

The standard exam sequence is therefore: write the pay-off matrix from the row player's viewpoint, check for a saddle point by comparing maximin and minimax, reduce by dominance if no saddle point is found, then either solve the resulting 2Γ—22\times 2 game by equating expected pay-offs or set up a linear program for a larger one. Keep the constant you add for the linear program in mind so you can subtract it back to recover the true game value at the end.

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 20206 marksA two-player zero-sum game has pay-off matrix (to the row player) (2βˆ’1βˆ’34)\begin{pmatrix} 2 & -1 \\ -3 & 4 \end{pmatrix}. Show there is no stable solution, then find the row player's optimal mixed strategy and the value of the game.
Show worked answer β†’

Check for a saddle point. Row minimums are βˆ’1-1 and βˆ’3-3, so the maximin is βˆ’1-1. Column maximums are 22 and 44, so the minimax is 22. Since βˆ’1β‰ 2-1 \neq 2 there is no saddle point, so no stable (pure) solution exists.

Let the row player play row 1 with probability pp and row 2 with probability 1βˆ’p1 - p. Equate expected pay-offs against each column.

If the column player picks column 1: E1=2pβˆ’3(1βˆ’p)=5pβˆ’3E_1 = 2p - 3(1 - p) = 5p - 3.
If the column player picks column 2: E2=βˆ’p+4(1βˆ’p)=4βˆ’5pE_2 = -p + 4(1 - p) = 4 - 5p.

Set equal: 5pβˆ’3=4βˆ’5p5p - 3 = 4 - 5p, so 10p=710p = 7 and p=0.7p = 0.7.

The value is E1=5(0.7)βˆ’3=0.5E_1 = 5(0.7) - 3 = 0.5. So the row player plays row 1 with probability 0.70.7 and row 2 with probability 0.30.3, for a game value of 0.50.5.

Markers reward the saddle-point check, equating the two expected pay-offs, solving for pp, and the game value.

AQA 20224 marksExplain how the play-safe strategies of the two players are found in a zero-sum game, and state the condition under which the game has a stable solution.
Show worked answer β†’

The row player examines the minimum entry in each row (the worst that can happen for that choice) and selects the row whose minimum is largest. This maximin strategy guarantees the best secured outcome.

The column player examines the maximum entry in each column (the worst, since pay-offs are losses to the column player) and selects the column whose maximum is smallest. This is the minimax strategy.

The game has a stable solution (a saddle point) when the maximin equals the minimax. At that entry neither player can improve by changing alone, so it is the value of the game.

Markers reward the maximin description, the minimax description, and the maximin equals minimax condition for stability.

Related dot points

Sources & how we know this