
How do you measure how spread out a data set is?

Range, interquartile range, percentiles, the effect of outliers, and choosing a measure of spread.

A focused answer to AQA GCSE Statistics on measures of spread, covering the range, interquartile range, percentiles, how outliers affect spread, and how to choose a suitable measure of spread.

Generated by Claude Opus 4.88 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this dot point is asking
  2. The range
  3. The interquartile range
  4. Percentiles
  5. The effect of outliers and choosing a measure

What this dot point is asking

AQA wants you to measure the spread of data using the range, interquartile range and percentiles, understand how outliers affect each measure, and choose a suitable measure of spread for a situation. Spread questions almost always pair with an average: examiners reward candidates who quote one average and one spread, both in context.

The range

The range (the largest value minus the smallest value) is the crudest measure of spread because a single unusually large or small value sets it entirely. It is fine for small, clean data sets, or for stating the full extent of variation (for example the range of daily temperatures), but it tells you nothing about how the bulk of the data is distributed.
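As a small illustration of that last point, the Python sketch below uses two made-up lists (hypothetical data, not taken from an AQA paper) that share the same range even though their middle values are arranged very differently. The function name `data_range` is only an illustrative choice.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical daily temperatures (degrees C), invented for illustration.
clustered = [10, 19, 20, 20, 21, 30]  # bulk of the values sits close to 20
scattered = [10, 12, 15, 25, 28, 30]  # values spread across the whole interval

print(data_range(clustered))
print(data_range(scattered))
```

Both lists give a range of 20, so the range alone cannot tell the two apart.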

The interquartile range

Because it cuts off the bottom and top quarters, the interquartile range ($Q_3 - Q_1$) is resistant to outliers, which makes it the standard companion to the median for skewed data. A small interquartile range means the central half of the data is tightly grouped; a large one means even the middle values are widely spread. On AQA box plot and cumulative frequency questions, you read $Q_1$ and $Q_3$ directly from the diagram and subtract to get the interquartile range.
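For a raw list rather than a diagram, the calculation might be sketched as below. This assumes the median-of-halves convention (lower quartile = median of the lower half, upper quartile = median of the upper half, leaving out the middle value when the count is odd); other quartile conventions can give slightly different answers. The data are the clinic waiting times from the practice question further down this page, and the function names are illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention (an assumption)."""
    if len(values) < 2:
        raise ValueError("need at least two values to find quartiles")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the median position
    upper = ordered[-half:]  # values above the median position
    return median(lower), median(upper)

def interquartile_range(values):
    """Interquartile range = Q3 - Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1

waiting_times = [3, 5, 6, 8, 9, 11, 14, 38]
print(quartiles(waiting_times))
print(interquartile_range(waiting_times))
```

With this convention the quartiles come out as 5.5 and 12.5, matching the values quoted in the practice question, so the interquartile range is 7 minutes.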

Percentiles

To find the position of the $p$th percentile in a list of $n$ ordered values, AQA uses $\frac{p}{100} \times n$. For $n = 80$ the $90$th percentile is at position $0.9 \times 80 = 72$, so you read off the $72$nd value (or interpolate from a cumulative frequency graph for grouped data). Interpercentile ranges are popular in exams because they let you describe spread while deliberately discarding the volatile tails.
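The position rule can be sketched in Python as follows. The list `values` is hypothetical data (the whole numbers 1 to 80) chosen so that each value equals its own position, and rounding a fractional position up is an assumption made for this sketch, not a rule stated above.

```python
import math

def percentile_position(p, n):
    """Position of the p-th percentile among n ordered values: (p / 100) * n."""
    return p * n / 100

def value_at_percentile(ordered_values, p):
    """Read off the value at the percentile position (counting from 1).

    Assumption: a fractional position is rounded up to the next whole
    position. For grouped data you would interpolate from a cumulative
    frequency graph instead.
    """
    position = math.ceil(percentile_position(p, len(ordered_values)))
    position = max(position, 1)
    return ordered_values[position - 1]

print(percentile_position(90, 80))  # 72.0

values = list(range(1, 81))  # hypothetical ordered data: 1, 2, ..., 80
p10 = value_at_percentile(values, 10)
p90 = value_at_percentile(values, 90)
print(p90 - p10)  # 10th to 90th interpercentile range of this made-up list
```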

The effect of outliers and choosing a measure

The choice rule examiners reward: use the range for a quick measure of total spread on clean data, the interquartile range when there are outliers or the data is skewed (pair it with the median), and the standard deviation when the data is roughly symmetrical and you want a measure that uses every value (pair it with the mean).

The reason this matters is that a measure of spread is only useful if it reflects the typical variation rather than one freak value. The range uses only the two most extreme values, so it is the least robust. The interquartile range and interpercentile ranges deliberately discard the tails, so they are robust but ignore some information. The standard deviation uses every value, so it is the most informative, but it pays for that by being sensitive to outliers. Choosing the right one, and pairing it with the matching average, is exactly the judgement examiners test in "compare the data" questions, where you must quote one average and one spread and justify the choice in context.
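To see the sensitivity argument in numbers, the sketch below computes all three measures for the clinic waiting times from the practice questions, once with the value 38 and once without it. Two assumptions are made: quartiles use the median-of-halves convention, and the standard deviation is the population form (dividing by $n$), via Python's `statistics.pstdev`. The function name `spread_summary` is illustrative.

```python
from statistics import median, pstdev

def spread_summary(values):
    """Return the range, interquartile range and standard deviation of a list."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])   # median of the lower half (assumed convention)
    q3 = median(ordered[-half:])  # median of the upper half
    return {
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sd": round(pstdev(ordered), 2),  # population standard deviation
    }

with_outlier = [3, 5, 6, 8, 9, 11, 14, 38]
without_outlier = with_outlier[:-1]  # same data with the 38 removed

print(spread_summary(with_outlier))
print(spread_summary(without_outlier))
```

Removing the single value 38 cuts the range from 35 to 11 and the standard deviation from about 10.4 to about 3.5, while the interquartile range only moves from 7 to 6.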

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 2019 (4 marks): The waiting times (minutes) at a clinic were $3, 5, 6, 8, 9, 11, 14, 38$. (a) Calculate the range. (b) The lower quartile is $5.5$ and the upper quartile is $12.5$. Calculate the interquartile range and explain why it is a better measure of spread for this data.

(a) Range $= 38 - 3 = 35$ minutes.

(b) Interquartile range $= Q_3 - Q_1 = 12.5 - 5.5 = 7$ minutes.

The value $38$ is an outlier that inflates the range to $35$, far larger than the typical spread. The interquartile range uses only the middle half of the data, so it ignores the $38$ and better describes the spread of a usual visit.

Markers reward the two calculations and a clear, contextual reason that the interquartile range is resistant to the outlier.
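The answer calls 38 an outlier without showing a test. One common check, assumed here rather than taken from the question, is the $1.5 \times$ interquartile range rule: a value counts as an outlier if it lies more than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$. A minimal Python sketch, with `is_outlier` as an illustrative name:

```python
def is_outlier(value, q1, q3, k=1.5):
    """Flag a value beyond k * IQR from the quartiles (k = 1.5 is the assumed rule)."""
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return value < lower_fence or value > upper_fence

# Quartiles as given in the question.
q1, q3 = 5.5, 12.5
waiting_times = [3, 5, 6, 8, 9, 11, 14, 38]

outliers = [t for t in waiting_times if is_outlier(t, q1, q3)]
print(outliers)
```

Here the fences are $5.5 - 10.5 = -5$ and $12.5 + 10.5 = 23$, so 38 is the only value flagged.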

AQA 2022 (3 marks): A data set of $80$ values has its $10$th percentile at $22$ and its $90$th percentile at $58$. (a) State how many values lie below the $10$th percentile. (b) Calculate the $10$th to $90$th interpercentile range.

(a) The $10$th percentile has $10\%$ of the data below it: $0.10 \times 80 = 8$ values.

(b) Interpercentile range $= 58 - 22 = 36$.

Markers reward $8$ for part (a) (using $10\%$ of $n$) and the subtraction $58 - 22$ for part (b). This range ignores the most extreme tenth at each end, so it is robust like the interquartile range but uses a wider central band.
