How do you interpret correlation on a scatter graph, draw a line of best fit, and use it to make predictions?
Plot and interpret scatter graphs; describe correlation; draw a line of best fit; use it to estimate values; and understand interpolation, extrapolation and the difference between correlation and causation.
A focused answer to the Eduqas GCSE Mathematics statistics content on scatter graphs and correlation, covering plotting and interpreting scatter graphs, describing correlation, the line of best fit, interpolation and extrapolation, and correlation versus causation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
The Eduqas statistics content asks you to plot and interpret scatter graphs, describe the correlation between two variables, draw a line of best fit, use it to estimate values, and understand interpolation, extrapolation and the crucial distinction between correlation and causation. Scatter graphs reveal whether two quantities are related, which is a core idea in science and the social sciences. The topic appears at both tiers. Line-of-best-fit estimates and correlation-versus-causation reasoning are dependable questions, and the latter is exactly the AO2 reasoning Eduqas rewards.
Plotting and interpreting scatter graphs
A scatter graph plots one variable against another as points, with no line joining them, to reveal any relationship.
The pattern is read by eye: points clustered along a rising or falling line show a relationship; points scattered randomly show none. The closer the points lie to a straight line, the stronger the correlation.
Describing correlation
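Correlation has both a direction and a strength. The direction is positive when one variable tends to increase as the other increases, and negative when one tends to decrease as the other increases. The strength is strong when the points lie close to a straight line and weak when they are widely spread around it.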
So height against shoe size shows positive correlation, while car age against value shows negative correlation. A full description gives both the strength and the direction (for example "strong negative correlation"), and a good answer interprets it in context.
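The direction and strength described above can also be put on a numerical scale. The sketch below is a minimal illustration, not part of the GCSE method, where correlation is judged by eye: it computes the product-moment correlation coefficient r (a value between -1 and 1) and turns it into a phrase. The data values and the 0.7 and 0.3 cut-offs are assumptions chosen for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Turn r into a phrase. The cut-offs are illustrative assumptions, not exam rules."""
    if abs(r) < weak:
        return "no clear correlation"
    strength = "strong" if abs(r) >= strong else "moderate"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Invented data: car age in years against value in thousands of pounds.
age = [1, 2, 3, 5, 6, 8]
value = [14, 12, 11, 8, 6, 3]

print(describe(pearson_r(age, value)))  # prints "strong negative correlation"
```

A value of r near -1 or 1 corresponds to points lying close to a straight line, which matches the "by eye" description of a strong correlation.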
The line of best fit and predictions
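A line of best fit is a single straight line, drawn with a ruler, that follows the trend of the points, with roughly as many points above it as below it. It does not have to pass through the origin or through any particular point.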
To predict, read from a known value on one axis up to the line and across to the other axis. The reliability depends on where you read.
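In the exam the line is drawn by eye, but the same idea can be sketched numerically. The example below swaps in the least-squares method, which is an assumption here rather than the method the specification asks for, to find a gradient and intercept and then make a prediction. The data are invented, and `best_fit_line` and `predict` are hypothetical helper names.

```python
def best_fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line always passes through the mean point (mean_x, mean_y).
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict(x, gradient, intercept):
    """Read the line at x, like going up to the line and across to the other axis."""
    return gradient * x + intercept


# Invented data: hours studied against test mark.
hours = [1, 2, 3, 4, 5]
marks = [35, 45, 50, 60, 70]

m, c = best_fit_line(hours, marks)
print(m, c)                 # prints 8.5 26.5
print(predict(3.5, m, c))   # prints 56.25
```

Here 3.5 hours lies inside the plotted range of 1 to 5 hours, so the estimate of about 56 marks is the kind of reading the line supports.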
Interpolation, extrapolation and causation
Two ideas govern how far you can trust a prediction, and one warns against a tempting error.
Reading a value within the range of the data is interpolation, which is reliable because the line is supported by the plotted points there. Reading beyond the range is extrapolation, which is unreliable because the trend may not continue. Finally, correlation does not prove causation: two variables can be correlated because a third factor influences both, so a correlation between ice-cream sales and sunburn does not mean ice cream causes sunburn (hot weather drives both). Eduqas tests this distinction directly, so never claim one variable causes another from correlation alone.
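The interpolation and extrapolation rule above comes down to a range check on the plotted values. The sketch below is a minimal illustration with invented car ages; `estimate_type` is a hypothetical helper name.

```python
def estimate_type(x, xs):
    """Label a reading at x as interpolation or extrapolation from the plotted x-values."""
    if min(xs) <= x <= max(xs):
        return "interpolation"   # inside the data range: supported by plotted points
    return "extrapolation"       # outside the data range: the trend may not continue


# Invented data: ages in years of the cars plotted on the scatter graph.
ages = [1, 2, 3, 5, 6, 8]

for query in (4, 20):
    print(query, estimate_type(query, ages))
# prints:
# 4 interpolation
# 20 extrapolation
```

The check says nothing about causation: a reliable interpolated estimate still does not show that one variable causes the other.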
Exam-style practice questions
Practice questions written in the style of WJEC Eduqas exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Eduqas 2019, 3 marks (Foundation, Component 2, calculator)
A scatter graph plots the number of hours studied against the test mark for 12 students. The points rise from bottom left to top right in a fairly tight band. Describe the correlation and explain what it means in context.
Worked answer:
Rising from bottom left to top right means positive correlation, and a tight band means the correlation is strong.
So there is a strong positive correlation between hours studied and test mark.
In context, this means that students who studied for longer tended to score higher marks.
Markers award a mark for "positive", a mark for "strong", and a mark for the contextual interpretation. Describing only the direction without the strength, or giving an interpretation with no mention of the variables, loses marks.
Eduqas 2022, 4 marks (Higher, Component 2, calculator)
On a scatter graph of car age against value with a line of best fit, a student uses the line to estimate the value of a 4-year-old car (within the data range) and a 20-year-old car (well beyond it). Explain which estimate is reliable and why, using the terms interpolation and extrapolation.
Worked answer:
The 4-year-old estimate is interpolation, reading within the range of the data, so it is reliable because the line of best fit is supported by the plotted points there.
The 20-year-old estimate is extrapolation, reading beyond the data, so it is unreliable because the trend may not continue outside the range that was measured.
Markers give marks for correctly labelling interpolation and extrapolation, for stating that interpolation is reliable, and for explaining why extrapolation is not. Mixing up the two terms is the common error.
Related dot points
- Identify types of data (qualitative and quantitative, discrete and continuous); understand populations and samples; use random and stratified sampling; and recognise sources of bias.
A focused answer to the Eduqas GCSE Mathematics statistics content on sampling and data, covering qualitative and quantitative data, discrete and continuous data, populations and samples, random and stratified sampling, and sources of bias.
- Calculate the mean, median, mode and range; find the mean from a frequency table and an estimated mean from grouped data; and compare distributions using an average and the range (and quartiles at Higher tier).
A focused answer to the Eduqas GCSE Mathematics statistics content on averages and spread, covering the mean median mode and range, the mean from a frequency table, the estimated mean from grouped data, and comparing distributions with quartiles at Higher tier.
- Draw and interpret statistical charts including bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots and histograms with unequal class widths (histograms at Higher tier).
A focused answer to the Eduqas GCSE Mathematics statistics content on charts and graphs, covering bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots, and histograms with unequal class widths and frequency density at Higher tier.
- Use the equation y = mx + c to find the gradient and intercept; find the equation of a line through given points; and identify parallel and perpendicular lines (perpendicular at Higher tier).
A focused answer to the Eduqas GCSE Mathematics algebra content on straight line graphs, covering gradient and intercept from y equals mx plus c, finding a line through two points, and parallel and perpendicular lines.
- Solve problems involving direct and inverse proportion, including using the unitary method and forming proportion equations of the form y = kx or y = k/x with a constant of proportionality k (Higher tier).
A focused answer to the Eduqas GCSE Mathematics ratio content on direct and inverse proportion, covering the unitary method, forming proportion equations with a constant of proportionality, and proportion to powers at Higher tier.
Sources & how we know this
- WJEC Eduqas GCSE (9-1) Mathematics specification (C300) — WJEC Eduqas (2015)