Scatter diagrams and correlation - CCEA GCSE Statistics guide to correlation, the line of best fit and Spearman's rank

A CCEA GCSE Statistics guide to scatter diagrams and correlation: types and strength of correlation, the line of best fit, interpolation versus extrapolation, correlation versus causation, and calculating and interpreting Spearman's rank correlation coefficient.

Generated by Claude Opus 4.8 · 11 min read · 2260 · Unit 2

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. Scatter diagrams and correlation
  2. The line of best fit
  3. Correlation and causation
  4. Spearman's rank correlation coefficient
  5. How CCEA examines correlation

Scatter diagrams and correlation are how CCEA GCSE Statistics investigates the relationship between two variables. This guide covers describing correlation, the line of best fit, the causation trap and Spearman's rank correlation coefficient.

Scatter diagrams and correlation

A scatter diagram plots each pair of values as a point, with the explanatory variable on the horizontal axis. The pattern reveals the type of correlation: positive when the points rise from left to right, negative when they fall, and none when there is no clear pattern. Correlation is also described by its strength: strong when the points lie close to a straight line, weak when they are loosely scattered around it. A good description gives both the type and the strength in context, such as "strong positive correlation between height and arm span".

The line of best fit

A line of best fit is a straight line that follows the trend of the points, with roughly as many points above it as below, and it should pass through the mean point (the mean of the x-values, the mean of the y-values). It is used to estimate values. Interpolation, estimating within the range of the data, is generally reliable; extrapolation, estimating beyond the range, is unreliable because the trend may not continue. The stronger the correlation, the more reliable the estimate, so even an interpolated estimate should be treated with caution when the correlation is weak.
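As a rough illustration of the two checks described above, the Python sketch below finds the mean point that the line should pass through and labels an estimate as interpolation or extrapolation. The data values and the function names `mean_point` and `estimate_kind` are made up for this example; in the exam the line itself is still drawn by eye.

```python
from statistics import mean


def mean_point(xs, ys):
    """Return the mean point (mean of x, mean of y) for paired data."""
    return mean(xs), mean(ys)


def estimate_kind(x, xs):
    """Label an estimate at x as interpolation (inside the data range)
    or extrapolation (outside it)."""
    return "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"


# Hypothetical data: heights and arm spans in cm (not real measurements).
heights = [150, 155, 160, 165, 170]
arm_spans = [148, 156, 159, 167, 170]

print(mean_point(heights, arm_spans))   # the line should pass through this point
print(estimate_kind(162, heights))      # 162 cm lies inside 150-170
print(estimate_kind(190, heights))      # 190 cm lies outside 150-170
```

An estimate labelled as extrapolation is not automatically wrong, but it relies on the trend continuing beyond the data, which the diagram cannot confirm.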

Correlation and causation

A scatter diagram can show that two variables are correlated, but it does not prove that one causes the other. The link may be caused by a third factor that affects both, or it may be coincidental. CCEA examines this point directly, so a careful answer describes the correlation, refuses to claim causation from correlation alone, and where possible names a plausible third factor that explains the association.

Spearman's rank correlation coefficient

Spearman's rank correlation coefficient measures how well two rankings agree. Rank each set of data, find the difference in ranks, d, for each item, square the differences and total them, then apply the formula r_s = 1 - 6Σd² / (n(n² - 1)), where n is the number of items ranked. The coefficient runs from minus one for perfect disagreement, through zero for no association, to plus one for perfect agreement, so values near the extremes show strong agreement or disagreement. If the data are given as raw values, you must convert them to ranks first; tied values share the mean of the ranks they would otherwise occupy.
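The steps above can be sketched in Python as a way of checking a hand calculation. This is a minimal sketch, not an official CCEA method: the judges' scores are invented for illustration, and the function names `average_ranks` and `spearman` are assumptions of this example. It ranks from 1 for the smallest value; ranking from the largest instead gives the same coefficient as long as both sets are ranked in the same direction.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward.
    Tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1            # first rank position taken by v
        count = ordered.count(v)                # number of items tied at v
        ranks.append(first + (count - 1) / 2)   # mean of the tied positions
    return ranks


def spearman(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical scores from two judges for the same five entries.
judge_a = [8, 6, 9, 5, 7]
judge_b = [7, 5, 9, 6, 8]

# Ranks: A -> 4, 2, 5, 1, 3 and B -> 3, 1, 5, 2, 4
# d = 1, 1, 0, -1, -1 so sum of d squared = 4 and n = 5
# r_s = 1 - (6 * 4) / (5 * 24) = 1 - 24/120 = 0.8
print(round(spearman(judge_a, judge_b), 3))
```

A value of 0.8 would be interpreted in context as strong agreement between the two judges' rankings.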

How CCEA examines correlation

CCEA rewards describing correlation by type and strength, drawing and using a line of best fit, recognising the limits of extrapolation, the correlation-versus-causation reasoning point, and calculating and interpreting Spearman's rank. Bivariate analysis also runs through the Unit 2 case study on real data. Use the dot point for specification-level detail and worked CCEA-style questions, then test yourself with the quiz.
