Edexcel GCSE Statistics Scatter diagrams and correlation: correlation, regression, Spearman's rank and PMCC
A deep-dive Edexcel GCSE Statistics guide to scatter diagrams and correlation. Covers correlation and causation, lines of best fit and regression, and Spearman's rank and PMCC, with the calculations and exam patterns Edexcel repeats.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic demands
Scatter diagrams show the relationship between two variables (bivariate data). Edexcel tests whether you can describe correlation, resist the trap of assuming causation, fit and use a line of best fit or regression line, and calculate and interpret the correlation coefficients. The headline skills are drawing the line through the double mean point, interpreting its gradient, and the Spearman's rank calculation at Higher tier.
This guide covers the three dot-point pages on scatter diagrams and correlation, then sets out the exam patterns Edexcel repeats.
Correlation and causation
Correlation and causation covers the vocabulary (positive, negative, zero, association, causation, interpolation, extrapolation), describing correlation by inspection as strong or weak, and the crucial idea that correlation does not imply causation. You should recognise spurious correlation and name a likely third factor, and treat interpolation (estimating within the range of the data) as generally reliable but extrapolation (estimating beyond it) as risky.
Lines of best fit and regression
Lines of best fit and regression covers drawing the line of best fit through the double mean point $(\bar{x}, \bar{y})$, and (at Higher tier) the regression line $y = a + bx$. You interpret the gradient as a rate of change and the intercept as the value of $y$ when $x = 0$, and use the line to predict, with care over extrapolation.
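To make the double mean point and the gradient concrete, here is a minimal Python sketch. It assumes the line is written as y = a + bx and uses the standard least-squares formulas to find a and b. In an exam the equation is normally read from a drawn line or given to you, so the calculation is included only to make the sketch runnable. The function name and the revision data are invented for illustration.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (a, b, double_mean_point) for the line y = a + b*x.

    Uses the standard least-squares formulas; this is an illustrative
    sketch, not a method the specification asks you to carry out by hand.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of products of deviations, and sum of squared x deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient: change in y per unit increase in x
    a = y_bar - b * x_bar    # intercept: value of y when x = 0
    return a, b, (x_bar, y_bar)


# Invented data: hours of revision and test score for five students
hours = [1, 2, 3, 4, 5]
scores = [42, 50, 55, 65, 68]

a, b, (x_bar, y_bar) = least_squares_line(hours, scores)
print(f"double mean point: ({x_bar}, {y_bar})")
print(f"line: y = {a:.1f} + {b:.1f}x")
# The line passes through the double mean point
print(abs((a + b * x_bar) - y_bar) < 1e-9)
```

With this invented data the gradient of 6.7 would be interpreted in context as "each extra hour of revision is associated with about 6.7 more marks", which is the style of interpretation the exam asks for.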
Spearman's rank and PMCC
Spearman's rank and PMCC (Higher tier) covers calculating Spearman's rank correlation coefficient with $r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$, interpreting both $r_s$ and the PMCC (which is only interpreted, not calculated), and the distinction between rank correlation (monotonic) and product moment correlation (linear).
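The Spearman's rank calculation follows a fixed routine: rank each variable, find the difference d between each pair of ranks, square and sum the differences, then substitute into the formula 1 - 6Σd² / (n(n² - 1)). The sketch below walks through those steps in Python. It assumes there are no tied values, and the two judges' scores are invented for illustration.

```python
def spearman_rank(xs, ys):
    """Spearman's rank correlation coefficient, assuming no tied values."""

    def ranks(values):
        # Rank 1 goes to the largest value; valid only when there are no ties
        order = sorted(values, reverse=True)
        return [order.index(v) + 1 for v in values]

    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    # d is the difference between paired ranks; sum the squares of d
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Invented data: two judges scoring the same five performances
judge_a = [8, 6, 9, 5, 7]
judge_b = [7, 5, 9, 6, 8]

r_s = spearman_rank(judge_a, judge_b)
print(round(r_s, 3))  # 0.8
```

Here the ranks are 2, 4, 1, 5, 3 and 3, 5, 1, 4, 2, so Σd² = 4 and the coefficient is 1 - 24/120 = 0.8, which indicates strong positive agreement between the judges.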
How this topic is examined
A typical Edexcel profile:
- Correlation. Describing strength and direction, and explaining why it is not causation.
- Line of best fit. Drawing through the double mean point, forming its equation, and interpreting the gradient.
- Prediction. Using the line, with comment on interpolation versus extrapolation.
- Coefficients. Calculating Spearman's rank and interpreting it and the PMCC.
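The prediction pattern can be sketched in a few lines of Python: substitute the x value into the line, then check whether x lies inside the range of the observed data. The function name, the line y = 36 + 7x and the data range of 1 to 5 hours are all hypothetical values chosen for illustration.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y from the line y = a + b*x and label the estimate.

    x_min and x_max are the smallest and largest x values in the data.
    """
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return a + b * x, kind


# Hypothetical line y = 36 + 7x fitted to data with x from 1 to 5 hours
inside = predict(36, 7, 3, 1, 5)
outside = predict(36, 7, 10, 1, 5)
print(inside)   # (57, 'interpolation')
print(outside)  # (106, 'extrapolation')
```

If the score were a percentage, the second prediction of 106 would be impossible, which shows why an estimate made outside the data range should be flagged as unreliable in a written answer.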
Check your knowledge
Attempt these under timed conditions, then check against the solutions.
- A scatter diagram of revision hours and test scores rises from lower left to upper right. What correlation is this? (1 mark)
- Does strong correlation prove causation? (1 mark)
- Which point does the line of best fit always pass through? (1 mark)
- In , what does represent? (1 mark)
- For Spearman's rank with and , calculate . (2 marks)
- What does a PMCC of mean? (1 mark)
- Estimating a value beyond the range of the data is called what? (1 mark)
- If Spearman's rank is higher than the PMCC, what does the relationship look like? (1 mark)
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification, Pearson Edexcel (2017)