How do you measure and model the linear relationship between two variables?
Analyse bivariate data using scatter plots, the sums of squares and products, the product-moment correlation coefficient, and the least-squares regression line, and assess the model with residual plots and the limitations of extrapolation.
A focused answer to the SQA Advanced Higher Statistics bivariate data content: scatter plots, the sums of squares Sxx, Syy and Sxy, the product-moment correlation coefficient, the least-squares regression line, prediction, residual plots and the dangers of extrapolation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Bivariate analysis studies how two variables move together. The SQA wants you to picture the relationship with a scatter plot, to quantify its strength with the product-moment correlation coefficient, to fit a least-squares regression line, to use it for prediction, and crucially to judge the fit using residual plots and to recognise where prediction is unsafe.
Scatter plots and the sums of squares
A scatter plot shows the form (linear or not), direction (positive or negative) and strength of a relationship at a glance, and it should always come first.
The sums of squares $S_{xx}$ and $S_{yy}$ and the sum of products $S_{xy}$ are the raw material for both correlation and regression, so computing them carefully is the first calculation in any bivariate question.
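As a rough illustration, the three sums can be computed directly from their usual definitions as sums of squared deviations and of products of deviations from the means. The function name `sums_of_squares` and the data below are hypothetical, chosen only to keep the arithmetic easy to follow; they are not taken from any SQA paper.

```python
def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) for paired data, using deviations from the means."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxx = sum of (x - x_bar)^2, Syy = sum of (y - y_bar)^2
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # Sxy = sum of (x - x_bar)(y - y_bar)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Hypothetical data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sxx, syy, sxy = sums_of_squares(xs, ys)
print(f"Sxx = {sxx:.1f}, Syy = {syy:.1f}, Sxy = {sxy:.1f}")
# prints "Sxx = 10.0, Syy = 6.0, Sxy = 6.0"
```

In an exam the same quantities are normally found from the summary totals on a calculator; the deviation form above is just the easiest version to check by hand.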
The product-moment correlation coefficient
The correlation coefficient measures the strength and direction of a linear relationship.
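For reference, the coefficient is usually written in terms of the sums of squares and products as follows. This is the standard textbook form; the exact layout in the SQA data booklet may differ slightly.

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad -1 \le r \le 1
```

Values of $r$ near $1$ or $-1$ indicate a strong positive or negative linear relationship, while values near $0$ indicate little linear association.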
The least-squares regression line
The regression line of $y$ on $x$ is the straight line that minimises the sum of the squared vertical residuals.
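A minimal sketch of the fit, assuming the line is written as $y = a + bx$ with gradient $b = S_{xy}/S_{xx}$ and intercept $a = \bar{y} - b\bar{x}$. The labels `a` and `b`, the function names and the summary statistics are all assumptions made for illustration.

```python
def regression_line(x_bar, y_bar, sxx, sxy):
    """Return (a, b) for the least-squares line y = a + b*x of y on x."""
    if sxx == 0:
        raise ValueError("Sxx is zero: every x value is the same, so no line can be fitted")
    b = sxy / sxx            # gradient
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


def predict(a, b, x):
    """Predicted y at a given x; only trustworthy inside the range of the data."""
    return a + b * x


# Hypothetical summary statistics for illustration only
a, b = regression_line(x_bar=3, y_bar=4, sxx=10, sxy=6)
print(f"y = {a:.2f} + {b:.2f}x")                        # prints "y = 2.20 + 0.60x"
print(f"prediction at x = 4: {predict(a, b, 4):.2f}")   # prints "prediction at x = 4: 4.60"
```

The guard on `sxx` reflects the fact that a line of $y$ on $x$ cannot be fitted when all the $x$ values coincide.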
Residual plots and limitations
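A residual is the vertical gap between an observed point and the line, $e_i = y_i - \hat{y}_i$. Plotting residuals against $x$ (or against the fitted values $\hat{y}$) checks whether the straight-line model is appropriate. A patternless scatter about zero supports the linear model; a systematic shape such as a curve does not.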
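The residuals themselves are simple to compute once a line has been fitted. The sketch below uses hypothetical data and the line $y = 2.2 + 0.6x$, which is the least-squares line for that data; the function name `residuals` is likewise an assumption.

```python
def residuals(xs, ys, a, b):
    """Return observed minus fitted values for the line y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares line of y on x for this data

res = residuals(xs, ys, a, b)
for x, e in zip(xs, res):
    print(f"x = {x}: residual = {e:+.2f}")
```

Residuals from a least-squares line always sum to zero (up to rounding), so it is the pattern in the plot, not the total, that signals whether a straight line is appropriate.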
Try this
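Q1. Given the values of $S_{xx}$, $S_{yy}$ and $S_{xy}$, find $r$. [2 marks]
- Cue. Use $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$; the result indicates a strong positive linear relationship.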
Q2. A residual plot of a fitted line shows a clear U-shape. State what this tells you. [1 mark]
- Cue. The relationship is not linear, so a straight-line model is inappropriate and a curved model should be considered.
Exam-style practice questions
Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AH style: correlation (4 marks)
For a sample of $n$ pairs with given values of $S_{xx}$, $S_{yy}$ and $S_{xy}$, calculate the product-moment correlation coefficient and describe the relationship.
Product-moment correlation: $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$ (1 mark).
Evaluate the denominator $\sqrt{S_{xx} S_{yy}}$ (1 mark), then divide to obtain $r$ (1 mark).
Since $r$ is close to $1$, there is a strong positive linear relationship: as $x$ increases, $y$ tends to increase (1 mark). Markers reward the formula, the computed denominator, the value of $r$ and a correct interpretation.
AH style: regression (4 marks)
Using the given summary statistics, find the least-squares regression line of $y$ on $x$ and predict $y$ at the given value of $x$.
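Gradient: $b = \dfrac{S_{xy}}{S_{xx}}$ (1 mark).
Intercept from $a = \bar{y} - b\bar{x}$ (1 mark).
Line: $y = a + bx$ with the computed values of $a$ and $b$ (1 mark).
Prediction: substitute the given value of $x$ into the line to obtain $\hat{y}$ (1 mark). Markers reward the gradient, the intercept, the equation and a correct prediction within the data range.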
Related dot points
- Calculate and interpret measures of location and dispersion, including the mean, median, quartiles, interquartile range, variance and standard deviation, and use stem-and-leaf plots, boxplots and measures of skewness to describe the shape of a distribution.
A focused answer to the SQA Advanced Higher Statistics exploratory data analysis content: the mean, median and quartiles, the interquartile range, variance and standard deviation, stem-and-leaf plots and boxplots, outlier rules, and how to describe the shape and skewness of a distribution.
- Apply the addition and multiplication laws of probability, calculate conditional probabilities and use tree diagrams, the total probability rule and Bayes' theorem, and test events for independence and mutual exclusivity.
A focused answer to the SQA Advanced Higher Statistics probability content: the addition and multiplication laws, conditional probability, independence and mutual exclusivity, tree diagrams, the total probability rule and Bayes' theorem for reversing a conditional probability.
- Calculate point estimates of a population mean and variance, construct and interpret confidence intervals for a population mean using the normal and Student's t-distributions, and construct a confidence interval for a population proportion.
A focused answer to the SQA Advanced Higher Statistics estimation content: point estimates of the population mean and variance, confidence intervals for a mean using the normal distribution and Student's t-distribution, the role of degrees of freedom, and confidence intervals for a population proportion.
- Carry out the chi-squared goodness-of-fit test and the chi-squared test for association in a contingency table, computing expected frequencies, the chi-squared statistic and degrees of freedom, and interpreting the result against the assumptions.
A focused answer to the SQA Advanced Higher Statistics chi-squared content: the goodness-of-fit test and the test for association in a contingency table, computing expected frequencies, the chi-squared statistic and degrees of freedom, the minimum expected frequency rule, and interpreting the outcome.
Sources & how we know this
- SQA Advanced Higher Statistics Course Specification (C803 77), SQA (2023)
- SQA Advanced Higher Statistics Data Booklet, SQA (2019)