How do you read medians and quartiles from grouped data?

Cumulative frequency tables and graphs, estimating the median and quartiles, and drawing and interpreting box plots.

A focused answer to AQA GCSE Statistics on cumulative frequency and box plots, covering cumulative frequency tables and graphs, estimating the median and quartiles, the interquartile range, drawing box plots and identifying outliers.

Generated by Claude Opus 4.8
10 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Cumulative frequency tables and graphs
  3. Estimating the median and quartiles
  4. Box plots and outliers

What this dot point is asking

AQA wants you to build a cumulative frequency table and graph, use it to estimate the median, quartiles and percentiles, calculate the interquartile range, and draw, read and compare box plots, including identifying outliers. Cumulative frequency and box plots are the standard tools for comparing two grouped data sets, so they appear on almost every paper.

Cumulative frequency tables and graphs

Each cumulative frequency is plotted against the upper class boundary because the running total counts everything up to and including that boundary, so the point belongs at the top of the class, not the middle. The resulting curve is S-shaped (an ogive) and rises to the total $n$ at the end. It lets you estimate, for any value, how many data items fall below it, which is the basis for reading off the median and quartiles.
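As a rough illustration of how the running totals pair up with upper boundaries, here is a minimal Python sketch. The class boundaries and frequencies are made-up example data, and `cumulative_points` is a hypothetical helper name, not part of any AQA material.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 15), (30, 40, 8), (40, 50, 4)]

def cumulative_points(classes):
    """Return (upper boundary, cumulative frequency) pairs ready to plot."""
    uppers = [upper for _, upper, _ in classes]
    running = list(accumulate(freq for _, _, freq in classes))
    return list(zip(uppers, running))

points = cumulative_points(classes)
for upper, cf in points:
    print(f"up to {upper}: cumulative frequency {cf}")
```

The last cumulative frequency equals the total $n$ (here 40), which is a quick check that the table has been added up correctly.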

Estimating the median and quartiles

Note that for a graph you use $\frac{n}{2}$, $\frac{n}{4}$ and $\frac{3n}{4}$, not the $\frac{n+1}{2}$ positions used for a short raw list. This is because a cumulative frequency curve treats the data as one continuous block of $n$ items rather than discrete points. To read a percentile, use the same idea: the $p$th percentile is read at cumulative frequency $\frac{p}{100} \times n$.
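The reading-off step can be imitated numerically. The sketch below assumes straight lines between plotted points (a cumulative frequency polygon), so it only approximates what you would read from a smooth hand-drawn curve. The data and the helper name `read_off` are illustrative assumptions.

```python
def read_off(points, target):
    """Estimate the data value at cumulative frequency `target`.

    `points` is a list of (value, cumulative frequency) pairs, starting at
    (lowest boundary, 0). Straight lines are assumed between points.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the range of the graph")

# Hypothetical plotted points: (upper boundary, cumulative frequency)
points = [(0, 0), (10, 4), (20, 13), (30, 28), (40, 36), (50, 40)]
n = points[-1][1]                   # total frequency, here 40

q1 = read_off(points, n / 4)        # read at n/4
median = read_off(points, n / 2)    # read at n/2
q3 = read_off(points, 3 * n / 4)    # read at 3n/4
p90 = read_off(points, 90 / 100 * n)  # 90th percentile, read at 0.9n

print(round(q1, 1), round(median, 1), round(q3, 1), round(p90, 1))
```

In an exam you would draw the horizontal and vertical lines on the graph instead; the code just mirrors the same "across from the cumulative frequency, down to the value" step.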

Box plots and outliers

Box plots are ideal for comparing two distributions on one scale: compare medians for the typical value and the interquartile range (the box length) or range (the whisker span) for the spread, always in context. The position of the median line inside the box also signals skew: a median nearer the lower quartile suggests positive skew.
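The skew check amounts to comparing the two halves of the box. This is a small sketch under that reading; `skew_hint` is a hypothetical helper and the quartile values are illustrative.

```python
def skew_hint(q1, median, q3):
    """Suggest the direction of skew from where the median sits in the box."""
    lower_gap = median - q1   # distance from lower quartile to median
    upper_gap = q3 - median   # distance from median to upper quartile
    if lower_gap < upper_gap:
        return "positive skew"      # median nearer the lower quartile
    if lower_gap > upper_gap:
        return "negative skew"      # median nearer the upper quartile
    return "roughly symmetric"

# Illustrative quartiles: the median is 7 above Q1 and 9 below Q3
print(skew_hint(24, 31, 40))
```

This is only a hint from the box itself; whisker lengths and the context of the data still matter when you describe a distribution.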

The two diagrams work together: the cumulative frequency curve is where you read the quartiles from grouped data, and the box plot is how you display and compare them. To draw a box plot you need the five-number summary (minimum, $Q_1$, median, $Q_3$, maximum), and the three quartiles come straight off the curve at $\frac{n}{4}$, $\frac{n}{2}$ and $\frac{3n}{4}$. A cumulative frequency curve can also answer "how many scored more than 40?" type questions: read up from 40 to the curve and across to find how many fall below, then subtract from $n$ for the number above. This makes the curve the workhorse for any "estimate the number/percentage above or below a value" question on grouped data.
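The "how many above a value" method can be sketched the same way. The code below again assumes straight lines between plotted points, and both the data and the helper name `count_below` are made up for illustration.

```python
def count_below(points, value):
    """Estimate how many items fall below `value` (read up, then across)."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return c0 + (value - x0) / (x1 - x0) * (c1 - c0)
    raise ValueError("value is outside the range of the graph")

# Hypothetical plotted points: (upper boundary, cumulative frequency)
points = [(0, 0), (10, 4), (20, 13), (30, 28), (40, 36), (50, 40)]
n = points[-1][1]                    # total frequency, here 40

below = count_below(points, 35)      # number estimated below 35
above = n - below                    # subtract from n for the number above
percent_above = 100 * above / n

print(below, above, percent_above)
```

The subtraction from $n$ is the step most often forgotten: the graph only ever gives "how many below".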

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 2019, 5 marks
A cumulative frequency graph is drawn for the times of 60 runners. Reading from the curve gives lower quartile 24 minutes, median 31 minutes and upper quartile 40 minutes, with a minimum of 18 and maximum of 52. (a) Calculate the interquartile range. (b) State the five values needed to draw a box plot and identify any outlier using the $1.5 \times \text{IQR}$ rule.

(a) $\text{IQR} = Q_3 - Q_1 = 40 - 24 = 16$ minutes.

(b) Box plot five-number summary: minimum 18, $Q_1 = 24$, median 31, $Q_3 = 40$, maximum 52. Upper boundary $= 40 + 1.5 \times 16 = 64$; lower boundary $= 24 - 1.5 \times 16 = 0$. The maximum 52 is below 64, so there is no upper outlier, and the minimum 18 is above 0, so there is no lower outlier either.

Markers reward the interquartile range, the correct five-number summary, the boundary calculation, and a clear statement that 52 is not an outlier.
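The boundary arithmetic in this answer can be checked with a few lines of Python. This is a minimal sketch using the values given in the question; `outlier_boundaries` is a hypothetical helper name.

```python
def outlier_boundaries(q1, q3, k=1.5):
    """Return (lower, upper) outlier boundaries using the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Values from the question: Q1 = 24, Q3 = 40, minimum 18, maximum 52
lower, upper = outlier_boundaries(24, 40)
minimum, maximum = 18, 52

has_low_outlier = minimum < lower    # is the minimum below the lower boundary?
has_high_outlier = maximum > upper   # is the maximum above the upper boundary?

print(lower, upper, has_low_outlier, has_high_outlier)
```

Both checks come out false here, matching the worked answer: neither extreme value lies outside the boundaries.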

AQA 2021, 4 marks
Two box plots show the marks of Class P (median 54, IQR 18) and Class Q (median 61, IQR 10). Compare the two classes.

Average in context: Class Q has the higher median (61 versus 54), so Class Q typically scored higher.

Spread in context: Class Q has the smaller interquartile range (10 versus 18), so Class Q's marks were more consistent.

Markers reward one average comparison and one spread comparison, both phrased in context (not bare numbers). Comparing two medians with no spread, or vice versa, loses marks.
