How do you interpret correlation on a scatter graph, draw a line of best fit, and use it to make predictions?
Plot and interpret scatter graphs; describe correlation; draw a line of best fit; use it to estimate values; and understand interpolation, extrapolation and the difference between correlation and causation.
A focused answer to the OCR GCSE Mathematics statistics content on scatter graphs and correlation, covering positive, negative and no correlation, drawing a line of best fit, making predictions, and the difference between correlation and causation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
OCR reference S6 covers scatter graphs and correlation: plotting paired data, describing correlation, drawing a line of best fit, using it to make predictions, and understanding the limits of those predictions and the difference between correlation and causation. Scatter graphs are how bivariate data (two variables measured together) is displayed and interpreted. This content appears on both the Foundation and Higher tiers and is a reliable source of AO2 and AO3 marks, because it rewards clear interpretation and reasoning.
Plotting and describing correlation
A scatter graph reveals the relationship between two variables.
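Positive correlation means that as one variable increases, the other tends to increase; negative correlation means that as one increases, the other tends to decrease; no correlation means the points show no clear pattern. So height against shoe size usually shows positive correlation, while the age of a car against its value shows negative correlation. Describing correlation fully means naming the direction (positive or negative) and, ideally, the strength, and then stating what it means in context, which is the part that earns the interpretation marks.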
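As a rough numerical companion to describing correlation by eye, the sketch below computes Pearson's correlation coefficient and turns it into a direction-and-strength description. Pearson's coefficient is not part of the method described here, the cutoffs 0.3 and 0.7 are arbitrary assumptions rather than OCR rules, and the car data is made up purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data (between -1 and 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(xs, ys, weak_cutoff=0.3, strong_cutoff=0.7):
    """Name the strength and direction; the cutoffs are illustrative assumptions."""
    r = pearson_r(xs, ys)
    if abs(r) < weak_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Made-up data: age of a car (years) against its value (thousands of pounds).
car_age = [1, 2, 3, 5, 7, 9]
car_value = [18, 16, 13, 10, 6, 4]

print(describe_correlation(car_age, car_value))
```

In an exam answer the description still needs its context sentence, for example that older cars tend to be worth less.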
The line of best fit
A line of best fit summarises the trend with a straight line.
A line of best fit is drawn to pass as close as possible to all the points, with roughly equal numbers above and below, and it should pass through the mean point of the data. It does not have to pass through the origin or through any of the plotted points. Once drawn, it is used to estimate: to predict the y-value for a given x, find the x-value on the horizontal axis, go up to the line, and read across to the vertical axis (and the reverse to predict an x from a y).
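A line of best fit at GCSE is drawn by eye, but the idea can be sketched in code with a least-squares line, which is a swapped-in technique rather than the drawing method itself. The sketch shows that this line passes through the mean point and how reading "up and across" corresponds to substituting into the line. The hours and scores are made-up values and the function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return gradient, intercept and mean point of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # Choosing the intercept this way forces the line through the mean point.
    intercept = mean_y - gradient * mean_x
    return gradient, intercept, (mean_x, mean_y)


# Made-up data: hours studied against test score.
hours = [1, 2, 3, 4, 5]
scores = [35, 45, 50, 60, 70]

gradient, intercept, (mean_x, mean_y) = least_squares_line(hours, scores)


def predict_y(x):
    """Go up from x to the line, then read across to the vertical axis."""
    return gradient * x + intercept


def predict_x(y):
    """The reverse reading: go across from y to the line, then read down."""
    return (y - intercept) / gradient


print(f"mean point: ({mean_x}, {mean_y})")
print(f"line: y = {gradient}x + {intercept}")
print(f"estimated score for 3.5 hours: {predict_y(3.5)}")
print(f"estimated hours for a score of 60: {predict_x(60):.2f}")
```

A hand-drawn line will rarely match the least-squares line exactly, which is why exam mark schemes accept a range of estimates.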
Predictions, interpolation and extrapolation
Predictions from a line of best fit have limits.
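Interpolation means predicting within the range of the data; extrapolation means predicting beyond it. So if the data covers a particular range of study hours, estimating the score for a number of hours inside that range (interpolation) is reasonable, but estimating for a number of hours outside it (extrapolation) is unsafe, because the pattern may break down. OCR rewards noting that an out-of-range prediction is unreliable, which is a common follow-up question.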
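The range check behind that judgement can be sketched as a small function. The study hours below are hypothetical values chosen for illustration, not figures from any OCR question, and the function name is an assumption.

```python
def classify_prediction(x, observed_xs):
    """Label a prediction by whether x lies inside the observed data range."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if lowest <= x <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical data range: students studied between 1 and 5 hours.
hours_studied = [1, 2, 3, 4, 5]

for hours in (3.5, 12):
    kind = classify_prediction(hours, hours_studied)
    reliability = "reasonable" if kind == "interpolation" else "unreliable"
    print(f"{hours} hours: {kind} ({reliability})")
```

The check only says whether a value is in range; even an interpolated value is still an estimate.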
Correlation versus causation
A correlation does not prove one variable causes the other.
Two variables can be correlated without one causing the other, often because a third factor influences both. Ice cream sales and drowning incidents are correlated, but neither causes the other; hot weather drives both. OCR sets questions where a claimed causal link must be challenged, and the marks reward explaining that correlation only shows the variables change together and that a third variable, or the reverse direction, may be the real explanation.
Exam-style practice questions
Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
OCR 2019 (4 marks)
A scatter graph shows the hours studied and the test score for a group of students, with a positive correlation. (a) Describe the correlation and what it means. (b) A line of best fit is drawn. Explain how to use it to estimate the score of a student who studied a given number of hours. (Foundation, Paper 1, calculator.)
(a) There is positive correlation: as the hours studied increase, the test score tends to increase too. In context, students who study more tend to score higher.
(b) Find the given number of hours on the horizontal axis, go up to the line of best fit, then across to the vertical axis to read off the score. The value read off is an estimate, not an exact score.
Markers award marks for naming positive correlation, for the contextual meaning, for the read-up-and-across method, and for noting it is an estimate. Saying "they are linked" without naming the direction is too vague.
OCR 2021 (3 marks)
A scatter graph shows ice cream sales and temperature, with strong positive correlation. A student claims that buying more ice cream causes the temperature to rise. Explain why this claim is wrong, using the idea of correlation and causation. (Higher, Paper 4, calculator.)
Correlation means two variables change together, but it does not prove that one causes the other.
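Here the causation runs in the opposite direction to the claim: higher temperatures lead to more ice cream sales; ice cream sales do not cause the temperature to change. In other situations a third variable can drive both quantities, so a correlation on its own never establishes which variable, if either, is the cause.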
Markers give a mark for stating that correlation is not causation, a mark for explaining the real direction (temperature drives sales), and a mark for identifying that a third variable can be responsible. Simply repeating "correlation is not causation" without the explanation does not score fully.
Related dot points
- Identify types of data (qualitative and quantitative, discrete and continuous); understand populations and samples; use random and stratified sampling; and recognise sources of bias.
A focused answer to the OCR GCSE Mathematics statistics content on sampling and data, covering types of data, populations and samples, random and stratified sampling, and recognising bias in data collection.
- Calculate the mean, median, mode and range; find the mean from a frequency table and an estimated mean from grouped data; and compare distributions using an average and the range (and quartiles at Higher tier).
A focused answer to the OCR GCSE Mathematics statistics content on averages and spread, covering the mean, median, mode and range, the mean from frequency tables, the estimated mean from grouped data, and comparing distributions.
- Draw and interpret statistical charts including bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots and histograms with unequal class widths (histograms at Higher tier).
A focused answer to the OCR GCSE Mathematics statistics content on statistical charts and graphs, covering bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots, and histograms with frequency density at Higher tier.
- Use the equation y = mx + c to find the gradient and intercept; find the equation of a line through given points; and identify parallel and perpendicular lines (perpendicular at Higher tier).
A focused answer to the OCR GCSE Mathematics algebra content on straight line graphs, covering gradient and intercept, the equation of a line through points, and parallel and perpendicular lines.
- Use relative frequency (experimental probability) to estimate probabilities from data, understand how more trials improve the estimate, and calculate expected numbers of outcomes.
A focused answer to the OCR GCSE Mathematics probability content on relative frequency and expected outcomes, covering experimental probability, the effect of more trials, fairness, and calculating expected numbers of outcomes.
Sources & how we know this
- OCR GCSE (9-1) Mathematics (J560) specification — OCR (2015)