
Edexcel GCSE Statistics: Scatter diagrams and correlation (correlation, regression, Spearman's rank and PMCC)

A deep-dive Edexcel GCSE Statistics guide to scatter diagrams and correlation. Covers correlation and causation, lines of best fit and regression, and Spearman's rank and PMCC, with the calculations and exam patterns Edexcel repeats.

Generated by Claude Opus 4.8, 13 min read, 1ST0 Topic 2e

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic demands
  2. Correlation and causation
  3. Lines of best fit and regression
  4. Spearman's rank and PMCC
  5. How this topic is examined
  6. Check your knowledge

What this topic demands

Scatter diagrams show the relationship between two variables (bivariate data). Edexcel tests whether you can describe correlation, resist the trap of assuming causation, fit and use a line of best fit or regression line, calculate Spearman's rank correlation coefficient, and interpret both it and the PMCC. The headline skills are drawing the line through the double mean point, interpreting its gradient, and calculating Spearman's rank at Higher tier.

This guide covers the three dot-point pages on scatter diagrams and correlation, then sets out the exam patterns Edexcel repeats.

Correlation and causation

Correlation and causation covers the vocabulary (positive, negative, zero, association, causation, interpolation, extrapolation), describing correlation by inspection as strong or weak, and the crucial idea that correlation does not imply causation. You should recognise spurious correlation and name a likely third factor, and treat interpolation (estimating within the range of the data) as generally reliable but extrapolation (estimating beyond that range) as risky.

Lines of best fit and regression

Lines of best fit and regression covers drawing the line of best fit through the double mean point $(\bar{x}, \bar{y})$, and (at Higher tier) the regression line $y = a + bx$. You interpret the gradient as a rate of change and the intercept as the value of $y$ when $x = 0$, and use the line to predict, with care over extrapolation.
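As a rough illustration of the regression idea, the sketch below finds the double mean point and a least-squares line $y = a + bx$ for a small made-up data set. The data values and the function name `regression_line` are assumptions for illustration only, not taken from an Edexcel paper; in the exam you would normally draw the line by eye through the double mean point or be given the equation.

```python
from statistics import mean

def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of products of deviations and sum of squared x deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient: change in y per unit of x
    a = y_bar - b * x_bar    # intercept: forces the line through the mean point
    return a, b

# Hypothetical data: revision hours (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [42, 50, 55, 63, 70]

x_bar, y_bar = mean(hours), mean(scores)
a, b = regression_line(hours, scores)

print(f"double mean point: ({x_bar}, {y_bar})")
print(f"y = {a:.1f} + {b:.1f}x")  # y = 35.3 + 6.9x
```

Here the gradient of about 6.9 would be read as "each extra hour of revision is associated with roughly 6.9 more marks", and the intercept of about 35.3 as the predicted score for zero hours of revision.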

Spearman's rank and PMCC

Spearman's rank and PMCC (Higher tier) covers calculating Spearman's rank correlation coefficient with $r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$, interpreting both $r_s$ and the PMCC (which is only interpreted, not calculated), and the distinction between rank correlation (monotonic) and product moment correlation (linear).
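The Spearman's rank calculation can be sketched in a few lines: rank each variable, find the differences $d$ between paired ranks, then apply the formula. The judges' scores and the function name `spearman_rank` below are hypothetical, and the sketch assumes there are no tied values.

```python
def spearman_rank(xs, ys):
    """Spearman's rank via r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    Assumes no tied values in either list.
    """
    def ranks(values):
        # Rank 1 goes to the smallest value
        order = sorted(values)
        return [order.index(v) + 1 for v in values]

    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical scores given to six competitors by two judges
judge_a = [7.5, 8.2, 6.1, 9.0, 5.4, 7.9]
judge_b = [6.8, 8.5, 7.0, 9.1, 5.0, 6.5]
r_judges = spearman_rank(judge_a, judge_b)
print(round(r_judges, 3))  # 0.771

# A curved but always-increasing relationship still has perfect rank agreement
xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]
r_cubic = spearman_rank(xs, cubes)
print(r_cubic)  # 1.0
```

The second example shows the monotonic versus linear distinction: the points do not lie on a straight line, yet the ranks agree exactly, so $r_s = 1$.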

How this topic is examined

A typical Edexcel profile:

  • Correlation. Describing strength and direction, and explaining why it is not causation.
  • Line of best fit. Drawing through the double mean point, forming $y = a + bx$, and interpreting the gradient.
  • Prediction. Using the line, with comment on interpolation versus extrapolation.
  • Coefficients. Calculating Spearman's rank and interpreting it and the PMCC.
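For the prediction pattern, the comment examiners look for depends on whether the $x$ value lies inside the range of the data. A minimal sketch of that check is below; the line $y = 35.3 + 6.9x$, the data range of 1 to 5, and the function name `predict` are all assumptions made up for illustration.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y from y = a + b*x and label the estimate by data range."""
    y = a + b * x
    # Inside the observed x range is interpolation; outside is extrapolation
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind

# Hypothetical fitted line and observed range of x values
a, b = 35.3, 6.9
x_min, x_max = 1, 5

y_in, kind_in = predict(a, b, 3, x_min, x_max)
y_out, kind_out = predict(a, b, 10, x_min, x_max)

print(f"x = 3:  y = {y_in:.1f} ({kind_in})")    # x = 3:  y = 56.0 (interpolation)
print(f"x = 10: y = {y_out:.1f} ({kind_out})")  # x = 10: y = 104.3 (extrapolation)
```

The estimate at $x = 10$ is the one to treat with caution, since nothing in the data shows that the trend continues that far.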

Check your knowledge

Attempt these under timed conditions, then check against the solutions.

  1. A scatter diagram of revision hours and test scores rises from lower left to upper right. What correlation is this? (1 mark)
  2. Does strong correlation prove causation? (1 mark)
  3. Which point does the line of best fit always pass through? (1 mark)
  4. In $y = a + bx$, what does $b$ represent? (1 mark)
  5. For Spearman's rank with $\sum d^2 = 10$ and $n = 5$, calculate $r_s$. (2 marks)
  6. What does a PMCC of $+1$ mean? (1 mark)
  7. Estimating a value beyond the range of the data is called what? (1 mark)
  8. If Spearman's rank is higher than the PMCC, what does the relationship look like? (1 mark)
