How do you plot a scatter graph, describe correlation and use a line of best fit to make predictions?
Plot and interpret scatter graphs, describe the type and strength of correlation, draw a line of best fit and use it to estimate values, and understand that correlation does not imply causation.
A focused answer to the WJEC GCSE Mathematics statistics content on scatter graphs, covering plotting bivariate data, describing positive, negative and no correlation, drawing and using a line of best fit for predictions, and the limits of interpolation and extrapolation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
WJEC asks you to plot bivariate data on a scatter graph, to describe the correlation between the two variables, to draw a line of best fit and use it to estimate values, and to understand that correlation does not prove that one variable causes the other. The exam rewards describing correlation precisely (type and strength), reading predictions off the line, and judging when such a prediction is reliable. It is examined on both components and connects statistics to the straight-line graphs of algebra.
Plotting and describing correlation
A scatter graph reveals whether two quantities are related.
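Positive correlation means that as one variable increases the other tends to increase; negative correlation means that as one increases the other tends to decrease; no correlation means the points show no clear pattern. The closer the points lie to a straight line, the stronger the correlation. So height against shoe size usually shows positive correlation, while a car's age against its value shows negative correlation. Describe both the type and the strength in the exam.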
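As an illustration only, the type and strength of correlation can be put into numbers. The sketch below computes the correlation coefficient r, which goes beyond what the scatter graph question itself asks for, and the cut-offs of 0.3 and 0.7 are assumptions chosen for this example, not WJEC definitions. The data and the names `pearson_r` and `describe_correlation` are made up for the sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data (between -1 and 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, weak_cutoff=0.3, strong_cutoff=0.7):
    """Turn r into a type-and-strength description.

    The cut-off values are illustrative assumptions, not official boundaries.
    """
    if abs(r) < weak_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    kind = "positive" if r > 0 else "negative"
    return f"{strength} {kind} correlation"


# Made-up data: car age in years against value in thousands of pounds
car_age = [1, 2, 3, 4, 5, 6]
car_value = [18, 16, 13, 11, 10, 7]

print(describe_correlation(pearson_r(car_age, car_value)))
# prints: strong negative correlation
```

In the exam you judge this by eye from the graph: the direction of the band of points gives the type, and how tightly the points cluster gives the strength.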
The line of best fit
A line of best fit summarises the trend with a straight line.
Draw a single straight line that follows the trend with roughly as many points above it as below, passing through the middle of the data (it often passes near the mean point). It need not pass through the origin or any particular point. The line lets you estimate the value of y for a given x (or vice versa): read up from the x-value to the line and across to the y-axis.
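In the exam the line of best fit is drawn by eye with a ruler. As a rough check on what a sensible line looks like, the sketch below uses the least-squares method, which is a different technique from drawing by eye but always produces a line through the mean point. The data and the function name `best_fit_line` are made up for this example.

```python
def best_fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c.

    This is a calculated stand-in for the line drawn by eye; it always
    passes through the mean point (mean of x, mean of y).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Made-up data: hours of revision against test score
hours = [1, 2, 3, 4, 5]
score = [42, 50, 55, 63, 70]

m, c = best_fit_line(hours, score)

# Estimate y for a given x: "read up to the line and across"
estimated_score = m * 3.5 + c

# Estimate x for a given y: "read across to the line and down"
estimated_hours = (60 - c) / m

print(f"y = {m:.1f}x + {c:.1f}")
print(f"Estimated score for 3.5 hours: about {estimated_score:.0f}")
print(f"Estimated hours for a score of 60: about {estimated_hours:.1f}")
```

A line drawn by eye will not match these figures exactly, which is why mark schemes accept a range of values for readings taken from it.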
Making and judging predictions
A prediction's reliability depends on where it lies.
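Interpolation is estimating within the range of the plotted data and is generally reliable when the correlation is strong. Extrapolation is estimating outside that range and is unreliable, because the trend may not continue beyond the data collected. This is a favourite "explain why" mark: a prediction far beyond the data is unreliable because it is extrapolation.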
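The check behind that explanation is simply whether the value you are predicting from lies inside the range of the data. A minimal sketch, using made-up data and a hypothetical helper named `classify_prediction`:

```python
def classify_prediction(x_value, xs):
    """Say whether predicting at x_value is interpolation or extrapolation.

    Interpolation: x_value lies within the range of the plotted x-values.
    Extrapolation: x_value lies outside that range.
    """
    if min(xs) <= x_value <= max(xs):
        return "interpolation"
    return "extrapolation"


# Made-up data: hours of revision recorded for a class
hours = [1, 2, 3, 4, 5]

print(classify_prediction(3.5, hours))  # prints: interpolation
print(classify_prediction(12, hours))   # prints: extrapolation
```

Being inside the range is necessary but not sufficient for a reliable estimate: with weak correlation even an interpolated value is only a rough guide.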
Correlation is not causation
A relationship in the data does not prove cause and effect. Both variables may be influenced by a third factor, or the pattern may be a coincidence.
Why this matters
Scatter graphs are reliably examined for a few high-value skills: describing correlation in context, drawing and using a line of best fit, and judging a prediction's reliability with the right vocabulary (interpolation, extrapolation, causation). These are AO2 and AO3 reasoning marks, and the topic links statistics to the algebra of straight-line graphs, since the line of best fit is a trend line. Precise wording, not just "positive" but "positive correlation, so more X is associated with more Y", is what earns full marks.
Exam-style practice questions
Practice questions written in the style of WJEC exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
WJEC 2018, 2 marks. A scatter graph plots hours of revision against test score for a class, and the points rise from bottom-left to top-right in a clear band. Describe the correlation and what it means. (Foundation and Higher, Unit 2, calculator.)
Points rising from bottom-left to top-right show positive correlation, and a clear band means it is strong.
In context: students who revised for longer tended to score more highly.
Markers award a mark for "positive correlation" and a mark for interpreting it in context (more revision is associated with higher scores). Simply writing "positive" without the meaning may not earn the second mark.
WJEC 2021, 3 marks. On a scatter graph with a line of best fit, a student reads off the predicted value of y for an x-value far beyond all the plotted data. Estimate the value as instructed, then state why this estimate may be unreliable. (Foundation and Higher, Unit 2, calculator.)
Read up from the given x-value to the line of best fit, then across to the y-axis to estimate the value (the exact figure depends on the line drawn).
The estimate may be unreliable because the value lies outside the range of the data, so this is extrapolation; the pattern is not known to continue beyond the data collected.
Markers give a mark for using the line of best fit to read off the estimate and a mark for explaining that extrapolation outside the data range is unreliable. Saying only "it might be wrong" without naming extrapolation loses the reasoning mark.
Related dot points
- Understand populations and samples, use random and other sampling methods and recognise bias, design data collection including questionnaires, and classify data as qualitative or quantitative and discrete or continuous.
A focused answer to the WJEC GCSE Mathematics statistics content on sampling and data, covering populations and samples, random and other sampling methods, sources of bias, designing questionnaires and classifying data as qualitative or quantitative and discrete or continuous.
- Calculate and interpret the mean, median, mode and range for lists and frequency tables, estimate the mean and identify the modal class from grouped data, and compare distributions using an average and a measure of spread.
A focused answer to the WJEC GCSE Mathematics statistics content on averages and spread, covering the mean, median, mode and range for lists and frequency tables, estimating the mean and modal class from grouped data, and comparing distributions.
- Construct and interpret bar charts, pictograms, vertical line graphs, pie charts and frequency diagrams, and at Higher tier draw and interpret histograms using frequency density for unequal class widths.
A focused answer to the WJEC GCSE Mathematics statistics content on charts and graphs, covering bar charts, pictograms, vertical line graphs, pie charts and frequency diagrams, and histograms with frequency density for unequal class widths at Higher tier.
- Construct and interpret a cumulative frequency curve to estimate the median, quartiles and interquartile range, and draw and compare box plots from five-number summaries (Higher tier).
A focused answer to the WJEC GCSE Mathematics statistics content on cumulative frequency and box plots, covering constructing and reading cumulative frequency curves, estimating the median and quartiles, finding the interquartile range, and drawing and comparing box plots at Higher tier.
Sources & how we know this
- WJEC GCSE Mathematics specification (3300) — WJEC (2015)
- WJEC GCSE Mathematics specification PDF (3300) — WJEC (2015)