What does correlation tell you, and why does it not prove causation?

Vocabulary of correlation (positive, negative, zero, causation, association, interpolation, extrapolation); describing correlation by inspection as strong or weak; correlation does not imply causation; spurious correlation.

A focused answer to Edexcel GCSE Statistics on correlation, covering the vocabulary of correlation, describing correlation by inspection as strong or weak and positive or negative, why correlation does not imply causation, spurious correlation, and interpolation versus extrapolation.

Generated by Claude Opus 4.8 · 9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. The vocabulary of correlation
  3. Strength of correlation by inspection
  4. Correlation does not imply causation
  5. Spurious correlation
  6. Interpolation and extrapolation

What this dot point is asking

Edexcel codes 2e.01 to 2e.03 require you to know and apply the vocabulary of correlation (positive, negative, zero, causation, association, interpolation, extrapolation), to describe correlation by inspection as strong or weak, and to understand that correlation does not imply causation, including being aware of spurious correlation. These ideas are the foundation for the line of best fit and the rank and product moment coefficients on the other pages of this module.

The vocabulary of correlation

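On a scatter diagram the explanatory (independent) variable is plotted on the x-axis and the response (dependent) variable on the y-axis. The direction of the cloud of points tells you the type of correlation: a cloud that slopes uphill from left to right shows positive correlation (as one variable increases, so does the other), a cloud that slopes downhill shows negative correlation (as one increases, the other decreases), and a shapeless cloud shows zero correlation.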

Strength of correlation by inspection

Edexcel asks you to judge strength by inspection (no calculation needed at this stage). The closer the points lie to a straight line, the stronger the correlation:

  • Strong correlation: points lie close to a clear straight line.
  • Weak correlation: points show a trend but are widely scattered.
  • Zero / no correlation: points show no straight-line pattern.

You combine strength and direction, for example "strong negative correlation" or "weak positive correlation", and always describe it in the context of the variables.
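
To connect judging by inspection with a numerical measure, here is a minimal Python sketch. It computes the product moment correlation coefficient for two made-up data sets and turns the result into a phrase. The function names, the data and the cut-offs 0.7 and 0.2 are assumptions for illustration only; Edexcel does not set numerical thresholds for "strong" or "weak" at this stage.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, strong_cutoff=0.7, none_cutoff=0.2):
    """Turn r into a phrase. The cut-offs are illustrative assumptions."""
    if abs(r) < none_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

# Made-up data: car age (years) against value (thousands of pounds).
car_age = [1, 2, 3, 4, 5, 6, 7, 8]
car_value = [18, 16, 15, 12, 11, 9, 8, 5]

# Made-up data: a trend is visible but the points are widely scattered.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [5, 2, 7, 3, 8, 4, 6, 9]

print(describe(pearson_r(car_age, car_value)))  # strong negative correlation
print(describe(pearson_r(hours, score)))        # weak positive correlation
```

In an exam answer you would still add the context, for example "strong negative correlation: as the age of the car increases, its value decreases".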

Correlation does not imply causation

This is one of the most heavily tested ideas in the whole qualification. Even strong correlation does not prove that one variable causes the other. There are three possibilities whenever two variables correlate:

  1. One genuinely causes the other (age of a car causing its value to fall).
  2. A third factor causes both (hot weather causing both ice cream sales and swimming).
  3. The link is coincidental.

Edexcel expects you to state explicitly that correlation does not imply causation and, where relevant, to suggest a plausible third factor.
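
The third-factor case can be sketched in Python. In this hedged illustration, ice cream sales and swimmer numbers are each built from temperature plus a small day-to-day variation; neither is calculated from the other. All numbers and names are invented for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical daily temperatures (degrees C): the hidden third factor.
temperature = [14, 16, 18, 20, 22, 24, 26, 28, 30, 32]

# Invented day-to-day variation unrelated to temperature.
sales_variation = [3, -4, 2, -1, 4, -3, 1, -2, 3, -3]
swim_variation = [-2, 3, 1, -3, 2, 2, -1, 3, -2, -3]

# Each variable depends on temperature only, never on the other variable.
ice_cream_sales = [5 * t + v for t, v in zip(temperature, sales_variation)]
swimmers = [3 * t + v for t, v in zip(temperature, swim_variation)]

r = pearson_r(ice_cream_sales, swimmers)
print(f"r between ice cream sales and swimmers: {r:.3f}")
```

The coefficient comes out close to 1 even though, by construction, ice cream sales have no effect on swimming. The strong correlation is produced entirely by the shared dependence on temperature.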

Spurious correlation

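A spurious correlation is one in which two variables correlate although neither causes the other, usually because a hidden third variable drives both or because the link is coincidental. A classic example is that the number of storks and the number of babies across towns may correlate, simply because larger towns have more of both. Recognising spurious correlation, and naming the hidden third variable, is exactly what extended-response questions reward.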
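
One way to check whether town size explains the storks and babies link is to compare counts with rates per 1000 residents. The Python sketch below does this with invented figures for six hypothetical towns; the numbers are chosen purely to illustrate the idea and are not real data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented figures for six hypothetical towns.
population_thousands = [5, 10, 20, 40, 80, 160]
storks = [10, 15, 50, 60, 200, 320]
babies = [60, 100, 200, 480, 960, 1600]

# Raw counts: bigger towns have more of both.
r_counts = pearson_r(storks, babies)

# Rates per 1000 residents remove the effect of town size.
stork_rate = [s / p for s, p in zip(storks, population_thousands)]
baby_rate = [b / p for b, p in zip(babies, population_thousands)]
r_rates = pearson_r(stork_rate, baby_rate)

print(f"r for raw counts: {r_counts:.3f}")
print(f"r for rates per 1000 residents: {r_rates:.3f}")
```

With these made-up figures the raw counts correlate strongly, while the rates show no correlation. That is the pattern you would expect when the hidden third variable (town size) is responsible for the original link.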

Interpolation and extrapolation

Interpolation is estimating a value within the range of the data, which is generally reliable. Extrapolation is estimating beyond the range, which is risky because the pattern may not continue. Edexcel expects you to be cautious about extrapolation when using a line of best fit to make predictions.
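
The difference can be illustrated with a short Python sketch. It fits a least-squares line as a stand-in for a line of best fit (at GCSE the line is usually drawn by eye through the mean point) and labels each prediction as interpolation or extrapolation. The data and function names are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least-squares gradient and intercept; the line passes through the mean point."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

def predict(xs, ys, x_new):
    """Return the estimate and whether it lies inside the range of the data."""
    gradient, intercept = fit_line(xs, ys)
    inside = min(xs) <= x_new <= max(xs)
    kind = "interpolation" if inside else "extrapolation"
    return gradient * x_new + intercept, kind

# Made-up data: car age (years) against value (thousands of pounds).
car_age = [1, 2, 3, 4, 5, 6, 7, 8]
car_value = [18, 16, 15, 12, 11, 9, 8, 5]

for age in (4.5, 15):
    estimate, kind = predict(car_age, car_value, age)
    print(f"age {age}: estimated value {estimate:.2f} thousand pounds ({kind})")
```

For an age of 4.5 years the estimate lies inside the data and is sensible. For 15 years the same line gives a negative value, which is impossible for a car and shows why extrapolated predictions should be treated with caution.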

Exam-style practice questions

Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Edexcel 1ST0 2019 (3 marks)
A scatter diagram shows the ice cream sales and the number of people swimming at a beach on 20 days. The points show strong positive correlation. (a) Describe the correlation. (b) A student says 'buying ice cream makes people swim'. Explain why this conclusion is not justified.

(a) Strong positive correlation: as ice cream sales increase, the number of people swimming also increases, and the points lie close to a straight line.

(b) Correlation does not imply causation. Both ice cream sales and swimming are likely caused by a third factor (hot weather), so the relationship is an association, not evidence that one causes the other.

Markers reward describing the correlation (strong, positive), stating that correlation does not imply causation, and identifying a likely third factor.

Edexcel 1ST0 2021 (4 marks)
For each pair of variables, state the type of correlation you would expect (positive, negative or none) and say whether any correlation is likely to be causal. (a) A car's age and its value. (b) The number of storks and the number of babies born in a set of towns.

(a) Negative correlation: as a car gets older its value falls. This is likely causal, because age and wear directly reduce a car's worth.

(b) Positive correlation may appear, but it is spurious: more storks and more babies are both linked to town size (bigger towns have more of both), so there is no causal link.

Markers reward the correct correlation type for each, a valid comment on causation for (a), and recognising spurious correlation with a third factor for (b).
