How do you describe and use the relationship between two variables with correlation and a regression line?

Analyse bivariate data: draw and interpret scatter graphs, describe correlation, find and use a regression line, and understand interpolation and extrapolation.

A CCEA GCSE Further Mathematics answer on bivariate analysis, covering scatter graphs and correlation, the line of best fit and regression, using the line to predict, and the limits of extrapolation in the Statistics unit.

Generated by Claude Opus 4.8 · 12 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Scatter graphs and correlation
  3. The regression line
  4. Predicting with the line
  5. Interpolation, extrapolation and causation
  6. Why this matters

What this dot point is asking

Bivariate data is data on two variables measured together, and CCEA GCSE Further Mathematics studies the relationship between them. You must draw and interpret scatter graphs, describe the type and strength of correlation, find and use a regression line to make predictions, and understand the difference between interpolation (safe) and extrapolation (unreliable). This applies statistical reasoning to paired measurements and rewards careful interpretation.

Scatter graphs and correlation

A scatter graph shows each pair of values as a point, and the pattern of points reveals whether the two variables move together. The direction and tightness of the pattern describe the correlation.

Describing correlation in words, naming both its direction and its strength in the context of the variables, is a frequently examined skill.
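As a rough numerical companion to the description in words, the sketch below computes the product-moment correlation coefficient r for a small set of paired values. This is an illustration only: the text above does not name r or require it, the data are made up, and the function names are hypothetical. A value of r near +1 or -1 corresponds to a tight pattern, and the sign gives the direction.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def direction(r):
    """Name the direction of the correlation from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Made-up illustrative data: revision hours and test scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
print(direction(r), round(r, 3))
```

A written answer should still put the result in context, for example "strong positive correlation: students who revised for longer tended to score higher".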

The regression line

When there is correlation, a regression line (line of best fit) captures the trend as a straight line through the data. Its gradient and intercept have meanings in the context of the variables.

The gradient is usually the more meaningful figure, telling you the rate at which $y$ changes with $x$, while the intercept can be less useful if $x = 0$ is far from the data.
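The text does not say how the line is obtained, so the sketch below assumes the usual least-squares method for a regression line of y on x. The data and the function name fit_line are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares regression line of y on x.

    Returns (gradient, intercept) for y = intercept + gradient * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line always passes through the mean point
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

# Made-up illustrative data: revision hours and test scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

m, c = fit_line(hours, scores)
print(f"y = {c:.1f} + {m:.1f}x")
```

For this data the gradient says the predicted score rises by about 4.5 marks per extra hour of revision. The intercept is the predicted score at zero hours, which lies just outside the data here and so should be read with caution.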

Predicting with the line

The regression line lets you estimate one variable from the other by substituting into the equation. Whether the estimate is trustworthy depends on whether you stay within the range of the data.
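A minimal sketch of this idea: substitute into the equation, then check whether the x-value lies inside the range of the data used to build the line. The line y = 46.9 + 4.5x and the data range of 1 to 5 are hypothetical values chosen for illustration, as is the function name predict.

```python
def predict(x, gradient, intercept, x_min, x_max):
    """Estimate y from the regression line and label the prediction.

    x_min and x_max are the smallest and largest x-values in the data.
    """
    estimate = intercept + gradient * x
    if x_min <= x <= x_max:
        kind = "interpolation"   # inside the data range
    else:
        kind = "extrapolation"   # outside the data range, treat with caution
    return estimate, kind

# Hypothetical line and data range
gradient, intercept = 4.5, 46.9
x_min, x_max = 1, 5

for x in (3, 40):
    estimate, kind = predict(x, gradient, intercept, x_min, x_max)
    print(x, round(estimate, 1), kind)
```

The arithmetic works the same way for both inputs. Only the range check distinguishes an estimate supported by the data from one that is not.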

Interpolation, extrapolation and causation

Two cautions matter for interpretation. Interpolation, predicting within the range of the data, is generally safe because the trend is supported by evidence there; extrapolation, predicting beyond the range, is unreliable because the relationship may change. Separately, correlation does not prove causation: two variables can move together because both depend on a third factor, so a strong correlation alone is not evidence that one causes the other. Stating these limits clearly is exactly the reasoning CCEA rewards.

Why this matters

Bivariate analysis applies statistical thinking to relationships rather than single variables, and it is the data-handling counterpart to the algebra of straight lines from the Pure unit: the regression line is just $y = mx + c$ fitted to data. The emphasis on interpreting the gradient, judging reliability and resisting the causation fallacy develops the critical reasoning that statistics is meant to teach, which carries weight across the unit.

Exam-style practice questions

Practice questions written in the style of CCEA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

CCEA Unit 3 (style), 4 marks
A regression line of $y$ on $x$ is $y = 3.2 + 1.5x$. Use it to estimate $y$ when $x = 8$, and interpret the gradient.

Substitute $x = 8$: $y = 3.2 + 1.5(8) = 3.2 + 12 = 15.2$.

The gradient $1.5$ means that for each increase of $1$ in $x$, the predicted $y$ increases by $1.5$.

Marks are for the substitution, the predicted value, and a correct interpretation of the gradient in context. A common error is to interpret the intercept $3.2$ as the gradient.

CCEA Unit 3 (style), 4 marks
A scatter graph of revision hours against test score shows strong positive correlation. A student suggests the line can predict the score for $40$ hours, well beyond the data range of $0$ to $10$ hours. Comment on this, and describe the correlation.

The correlation is strong and positive: as revision hours increase, the test score tends to increase.

Predicting at $40$ hours is extrapolation, far outside the range $0$ to $10$ used to build the line. The relationship may not continue to hold beyond the data, so the prediction is unreliable.

Marks are for describing the correlation and for explaining that extrapolation beyond the data range is unreliable. Saying the prediction is fine because the correlation is strong is the usual mistake.
