

How do you summarise, display and compare data, and identify outliers and correlation?

Measures of central tendency and spread, histograms, box plots and cumulative frequency, identifying outliers, comparing distributions, and correlation and the regression line.

A focused answer to the OCR A-Level Mathematics A data presentation content, covering the mean, median and mode, range, interquartile range, variance and standard deviation, histograms, box plots and cumulative frequency, identifying outliers, comparing distributions, and interpreting correlation and the regression line.

Generated by Claude Opus 4.8 | 12 min answer | Updated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed



What this dot point is asking

OCR wants you to calculate and interpret measures of central tendency (mean, median, mode) and spread (range, interquartile range, variance, standard deviation), draw and read histograms (with frequency density), box plots and cumulative frequency graphs, identify outliers by a stated rule, compare two distributions, and interpret correlation and the least-squares regression line, including the dangers of extrapolation.

The answer

Averages and spread

The mean uses every value but is sensitive to outliers; the median is the middle value and is resistant to outliers; the mode is the most common value. The standard deviation measures typical distance from the mean, while the interquartile range measures the spread of the middle half and ignores extremes.
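The contrast between sensitive and resistant measures can be seen numerically. The sketch below is illustrative only: the marks are invented, the helper names `quartiles` and `summarise` are hypothetical, and the quartiles use the "median of each half" convention, which is an assumption (other conventions give slightly different values).

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the 'median of each half' convention.

    Assumption: for an odd number of values the middle value is left out of
    both halves. Other conventions give slightly different quartiles.
    """
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

def summarise(values):
    """Collect the measures of location and spread discussed above."""
    q1, median, q3 = quartiles(values)
    return {
        "mean": statistics.mean(values),
        "median": median,
        "mode": statistics.mode(values),
        "sd": statistics.pstdev(values),  # divides by n, not n - 1
        "iqr": q3 - q1,
    }

# Hypothetical marks, invented for illustration; 40 is a deliberate outlier.
marks = [3, 5, 5, 6, 7, 8, 9, 40]
with_outlier = summarise(marks)
without_outlier = summarise(marks[:-1])

for name in ("mean", "median", "mode", "sd", "iqr"):
    print(name, with_outlier[name], without_outlier[name])
```

In this made-up data set, dropping the single extreme value moves the mean from about 10.4 to about 6.1, while the median only moves from 6.5 to 6 and the interquartile range from 3.5 to 3.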

Histograms and frequency density

A histogram shows grouped continuous data with area proportional to frequency, so the vertical axis is frequency density, not frequency.
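As a minimal sketch of the idea, frequency density is frequency divided by class width, so that bar area (width times height) recovers the frequency. The class boundaries and frequencies below are hypothetical, chosen only to show unequal widths.

```python
def frequency_density(frequency, class_width):
    """Height of a histogram bar: frequency per unit of class width."""
    return frequency / class_width

# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 20), (10, 15, 30), (15, 30, 45)]

densities = [frequency_density(f, upper - lower) for lower, upper, f in classes]

# Area of each bar = class width x frequency density, which gives back the frequency.
areas = [(upper - lower) * d for (lower, upper, _), d in zip(classes, densities)]

print(densities)
print(areas)
```

Note that the class with the largest frequency (45) does not have the tallest bar here, because it is also the widest class.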

Outliers

An outlier is a value far from the rest. Two common rules are: more than $1.5 \times \text{IQR}$ beyond a quartile, or more than two standard deviations from the mean. The question states which rule to use.
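Both rules reduce to a pair of boundaries and a comparison. The sketch below uses hypothetical function names (`iqr_fences`, `sd_fences`, `is_outlier`) and treats a value exactly on a boundary as not an outlier, which is an assumption; follow the wording of the question if it says otherwise.

```python
def iqr_fences(q1, q3, k=1.5):
    """Boundaries for the rule 'more than k x IQR beyond a quartile'."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def sd_fences(mean, sd, k=2):
    """Boundaries for the rule 'more than k standard deviations from the mean'."""
    return mean - k * sd, mean + k * sd

def is_outlier(value, fences):
    """True if the value lies strictly outside the boundaries."""
    low, high = fences
    return value < low or value > high

# Illustrative numbers: quartiles 6 and 9; mean 50 with standard deviation 4.
print(iqr_fences(6, 9))
print(sd_fences(50, 4))
print(is_outlier(14, iqr_fences(6, 9)))
print(is_outlier(57, sd_fences(50, 4)))
```

With quartiles 6 and 9 the boundaries are 1.5 and 13.5; with mean 50 and standard deviation 4 they are 42 and 58.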

Box plots and skewness

A box plot shows the minimum, the three quartiles and the maximum. Comparing the median's position within the box describes skewness: a median nearer $Q_1$ indicates positive skew, nearer $Q_3$ indicates negative skew.
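The rule of thumb for skewness compares the two halves of the box. A rough sketch, with the hypothetical helper name `describe_skew`; real data rarely gives exactly equal gaps, so in practice "roughly symmetrical" is a judgement rather than an exact test.

```python
def describe_skew(q1, median, q3):
    """Classify skewness from where the median sits inside the box."""
    lower_gap = median - q1   # width of the lower part of the box
    upper_gap = q3 - median   # width of the upper part of the box
    if lower_gap < upper_gap:
        return "positive skew"    # median nearer Q1
    if lower_gap > upper_gap:
        return "negative skew"    # median nearer Q3
    return "roughly symmetrical"

print(describe_skew(6, 6.5, 9))
print(describe_skew(6, 8.5, 9))
print(describe_skew(6, 7.5, 9))
```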

Examples in context

Comparing two distributions

To compare data sets, always compare a measure of location and a measure of spread, and state the comparison in context. For example: "the median mark of class A (62) is higher than that of class B (55), and class A's smaller interquartile range (10 versus 18) shows its marks were more consistent."

Correlation and regression

Correlation measures how closely two variables follow a linear relationship; the product moment correlation coefficient $r$ runs from $-1$ to $1$. The regression line of $y$ on $x$ is the best-fit line for predicting $y$ from $x$. Use it only within the range of the data: predicting outside that range (extrapolation) is unreliable, and correlation alone does not prove that one variable causes the other.
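For a concrete picture, the sketch below computes $r$ and the least-squares line $y = a + bx$ from the standard summary quantities $S_{xx}$, $S_{yy}$ and $S_{xy}$, and flags any prediction made outside the observed range of $x$. The data and the function names (`regression_and_pmcc`, `predict`) are hypothetical; in an exam you would normally obtain these values from a calculator.

```python
import math

def regression_and_pmcc(xs, ys):
    """Return (a, b, r) for the regression line y = a + b x of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx                      # gradient
    a = y_bar - b * x_bar                # line passes through (x_bar, y_bar)
    r = s_xy / math.sqrt(s_xx * s_yy)    # product moment correlation coefficient
    return a, b, r

def predict(a, b, x, xs):
    """Return (predicted y, True if x lies outside the data range)."""
    is_extrapolation = not (min(xs) <= x <= max(xs))
    return a + b * x, is_extrapolation

# Hypothetical paired data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = regression_and_pmcc(xs, ys)
print(a, b, r)
print(predict(a, b, 4, xs))    # inside the data range: interpolation
print(predict(a, b, 10, xs))   # outside the data range: flagged as extrapolation
```

For this invented data the line is approximately $y = 2.2 + 0.6x$ with $r \approx 0.77$. The prediction at $x = 10$ is still computed, but the flag marks it as an extrapolation that should not be trusted.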

Try this

Q1. A class width is $5$ and the frequency is $30$. Find the frequency density. [1 mark]

Cue. $\dfrac{30}{5} = 6$.

Q2. Data has mean $50$ and standard deviation $4$. Using the two-standard-deviation rule, find the upper outlier boundary. [2 marks]

Cue. $50 + 2(4) = 58$.

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR 2019, 6 marks. A data set of $50$ values has $\sum x = 1500$ and $\sum x^2 = 47\,000$. Find the mean and the standard deviation. An outlier is defined as more than two standard deviations from the mean; determine whether a value of $58$ is an outlier.


Mean $\bar{x} = \dfrac{\sum x}{n} = \dfrac{1500}{50} = 30$ (M1, A1).

Variance $= \dfrac{\sum x^2}{n} - \bar{x}^2 = \dfrac{47\,000}{50} - 30^2 = 940 - 900 = 40$ (M1), so standard deviation $\sigma = \sqrt{40} \approx 6.32$ (A1).

Two standard deviations is $2\sqrt{40} \approx 12.65$, so the outlier boundary above the mean is $30 + 12.65 = 42.65$ (M1).

Since $58 > 42.65$, the value $58$ is an outlier (A1).

Markers reward the mean, the variance formula, the standard deviation, the boundary, and the comparison.
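The arithmetic in this worked answer can be checked directly from the summary totals. This is only a verification sketch using the figures given in the question; the variable names are arbitrary.

```python
import math

# Summary statistics from the question.
n, sum_x, sum_x2 = 50, 1500, 47_000

mean = sum_x / n                        # sum of x divided by n
variance = sum_x2 / n - mean ** 2       # mean of the squares minus square of the mean
sd = math.sqrt(variance)

upper_boundary = mean + 2 * sd          # two standard deviations above the mean
lower_boundary = mean - 2 * sd

value = 58
print(mean, variance, round(sd, 2))
print(round(lower_boundary, 2), round(upper_boundary, 2))
print(value > upper_boundary or value < lower_boundary)
```

Working from the unrounded $\sqrt{40}$ rather than $6.32$ avoids a small rounding error in the boundary.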

OCR 2021, 5 marks. The lengths of $80$ leaves are recorded. The cumulative frequency reaches $20$ at $6$ cm, $40$ at $7.5$ cm and $60$ at $9$ cm. Estimate the median and the interquartile range, and comment on the skewness.


With $n = 80$, the median is at the $40$th value: from the data, the median $\approx 7.5$ cm (M1, A1).

The lower quartile is at the $20$th value, $\approx 6$ cm, and the upper quartile at the $60$th value, $\approx 9$ cm (M1).

Interquartile range $= 9 - 6 = 3$ cm (A1).

The median ($7.5$) is exactly midway between the quartiles ($6$ and $9$), so the distribution is roughly symmetrical (A1).

Markers reward reading the quartiles from the cumulative frequency, the interquartile range, and a justified comment on skewness.
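Reading quartiles from a cumulative frequency graph amounts to finding the length at cumulative frequencies $\tfrac{n}{4}$, $\tfrac{n}{2}$ and $\tfrac{3n}{4}$. The sketch below uses only the three points given in the question and joins them with straight lines, which is an assumption: a hand-drawn curve may give slightly different readings between plotted points. The function name `value_at` is hypothetical.

```python
def value_at(points, target):
    """Estimate the data value at a given cumulative frequency.

    points: (value, cumulative frequency) pairs in ascending order.
    Uses linear interpolation between neighbouring points (an assumption).
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the plotted points")

# Points from the question: (length in cm, cumulative frequency).
points = [(6, 20), (7.5, 40), (9, 60)]
n = 80

q1 = value_at(points, n / 4)
median = value_at(points, n / 2)
q3 = value_at(points, 3 * n / 4)

print(q1, median, q3, q3 - q1)
print(value_at(points, 30))   # a reading between two plotted points
```

The three targets fall exactly on given points, so the estimates match the worked answer; the last line shows what interpolation gives for a cumulative frequency of 30.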


Sources & how we know this

OCR A Level Mathematics A (H240) specification — OCR (2017)