How do cumulative frequency graphs and box plots show the median, quartiles and spread?

Cumulative frequency diagrams (discrete and grouped); estimating the median, quartiles and percentiles; box plots; comparing distributions using box plots and the interquartile range.

A focused answer to Edexcel GCSE Statistics on cumulative frequency diagrams and box plots, covering plotting cumulative frequency, estimating the median, quartiles and percentiles, drawing box plots, and comparing distributions using the median and interquartile range.

Generated by Claude Opus 4.8. 10 min answer.

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Cumulative frequency diagrams
  3. Estimating median, quartiles and percentiles
  4. Box plots
  5. Comparing distributions
  6. Using a cumulative frequency graph to estimate proportions

What this dot point is asking

Edexcel codes 2a.03, 2c.01 and 2c.05 require you to draw and read cumulative frequency diagrams (discrete and grouped), estimate the median, quartiles and percentiles from them, draw and interpret box plots, and compare distributions using a median and a measure of spread. Cumulative frequency curves and box plots are among the most reliable sources of marks in the qualification, because the method is the same every time.

Cumulative frequency diagrams

The crucial plotting rule is to plot each cumulative frequency against the upper class boundary of its class, because by the top of a class all of its values have been counted. The curve never falls, and it levels off at the total $n$. Plotting against class midpoints is a common and costly error.
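The plotting rule can be sketched in a few lines of Python. The frequencies below are the differences between the cumulative totals in the apples practice question further down; the 80 g lower boundary of the first class and the name `cumulative_points` are assumptions made for illustration.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency) for each class of apple masses in grams.
# The 80 g lower boundary of the first class is an assumption.
classes = [(80, 100, 8), (100, 120, 18), (120, 140, 28), (140, 160, 18), (160, 180, 8)]


def cumulative_points(classes):
    """Return the (upper class boundary, cumulative frequency) pairs to plot."""
    running_totals = accumulate(freq for _, _, freq in classes)
    return [(upper, total) for (_, upper, _), total in zip(classes, running_totals)]


points = cumulative_points(classes)
print(points)  # [(100, 8), (120, 26), (140, 54), (160, 72), (180, 80)]
```

The last cumulative frequency equals the total $n = 80$, which is a quick check that no class has been missed.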

Estimating median, quartiles and percentiles

From a cumulative frequency curve you read across from the cumulative frequency axis to the curve, then down to the horizontal axis:

  • Median: go up to the $\frac{n}{2}$th value, across to the curve, down to the axis.
  • Lower quartile ($Q_1$): use the $\frac{n}{4}$th value.
  • Upper quartile ($Q_3$): use the $\frac{3n}{4}$th value.
  • A percentile: the $p$th percentile uses the $\frac{p}{100} \times n$th value (the $90$th percentile uses $0.9n$).

Edexcel uses $\frac{n}{2}$, $\frac{n}{4}$ and $\frac{3n}{4}$ on the cumulative frequency axis (not $\frac{n+1}{2}$, which is for a small ordered list). Because the curve gives an estimate, slightly different sensible readings are accepted.
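As a rough check on readings taken by eye, the same positions can be located by joining the plotted points with straight lines. This is a sketch under assumptions rather than the exam method: the helper name `read_value` is hypothetical, the starting point (80, 0) is assumed, and a smooth hand-drawn curve will give slightly different values.

```python
def read_value(points, target):
    """Estimate the data value at cumulative frequency `target`.

    Uses straight-line interpolation between neighbouring plotted points,
    so it approximates a reading from a smooth curve.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the plotted cumulative frequencies")


# (upper class boundary, cumulative frequency); the (80, 0) start is an assumption.
points = [(80, 0), (100, 8), (120, 26), (140, 54), (160, 72), (180, 80)]
n = points[-1][1]

median = read_value(points, n / 2)      # position n/2 = 40
q1 = read_value(points, n / 4)          # position n/4 = 20
q3 = read_value(points, 3 * n / 4)      # position 3n/4 = 60
p90 = read_value(points, 0.9 * n)       # 90th percentile, position 0.9n = 72

print(round(median, 1), round(q1, 1), round(q3, 1), round(p90, 1))
```

Straight lines between points are used here only because they are easy to compute; they are a reasonable stand-in for the curve when the plotted points are close together.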

Box plots

A box plot (box and whisker diagram) summarises a distribution with five numbers: minimum, lower quartile, median, upper quartile and maximum. The box spans $Q_1$ to $Q_3$ (the middle $50\%$ of the data), a line inside marks the median, and the whiskers reach out to the minimum and maximum. The width of the box is the interquartile range $IQR = Q_3 - Q_1$, a measure of spread that ignores extreme values.
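The five numbers and the two measures of spread can be held in one small structure. This is a minimal sketch: the class name `FiveNumberSummary` and its field names are illustrative choices, and the figures are Shop A's from the practice question further down.

```python
from typing import NamedTuple


class FiveNumberSummary(NamedTuple):
    """The five values a box plot displays."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self) -> float:
        # Width of the box: spread of the middle 50% of the data.
        return self.q3 - self.q1

    @property
    def data_range(self) -> float:
        # Whisker tip to whisker tip: affected by extreme values.
        return self.maximum - self.minimum


shop_a = FiveNumberSummary(minimum=20, q1=40, median=55, q3=70, maximum=100)
print(shop_a.iqr, shop_a.data_range)  # 30 80
```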

Comparing distributions

When two box plots (or two cumulative frequency curves) are compared, Edexcel expects two comparisons in context: one of an average (compare the medians) and one of spread (compare the IQRs or ranges). A higher median means higher values on average; a smaller IQR means more consistent values. Always finish with a sentence relating the comparison to the situation.
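The two-part comparison can be sketched as a small function that returns one sentence about the average and one about the spread. The function names and the wording of the sentences are assumptions for illustration, the figures are the two shops from the practice question further down, and the sketch assumes the medians and IQRs are not tied.

```python
def iqr(summary):
    """Interquartile range of a five-number summary held in a dict."""
    return summary["q3"] - summary["q1"]


def compare(name_a, a, name_b, b):
    """Return one sentence comparing medians and one comparing IQRs.

    Assumes the two medians differ and the two IQRs differ.
    """
    higher, lower = (name_a, name_b) if a["median"] > b["median"] else (name_b, name_a)
    tighter, wider = (name_a, name_b) if iqr(a) < iqr(b) else (name_b, name_a)
    return [
        f"{higher} has the higher median, so its values are higher on average than {lower}'s.",
        f"{tighter} has the smaller IQR, so its values are more consistent than {wider}'s.",
    ]


shop_a = {"min": 20, "q1": 40, "median": 55, "q3": 70, "max": 100}
shop_b = {"min": 30, "q1": 50, "median": 60, "q3": 66, "max": 90}

lines = compare("Shop A", shop_a, "Shop B", shop_b)
for line in lines:
    print(line)
```

The generated sentences still need the context added by hand (here, daily sales in GBP), since that final step is what the mark scheme asks for.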

A box plot also reveals skewness at a glance. If the median sits closer to the lower quartile and the upper whisker is longer, the data has positive skew (a longer tail of high values); if the median is closer to the upper quartile with a longer lower whisker, the data has negative skew. A median in the centre of a symmetric box suggests a roughly symmetric distribution. Edexcel may ask you to comment on the shape as well as the average and spread, so look at where the median falls within the box.
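The skewness check described above compares the two halves of the box and the two whiskers. The sketch below applies that rule literally; the function name `describe_skew`, the "no clear skew" fallback for mixed evidence, and the example figures are all hypothetical.

```python
def describe_skew(minimum, q1, median, q3, maximum):
    """Classify the shape of a box plot from its five-number summary."""
    lower_box = median - q1          # distance from lower quartile to median
    upper_box = q3 - median          # distance from median to upper quartile
    lower_whisker = q1 - minimum
    upper_whisker = maximum - q3

    if lower_box < upper_box and lower_whisker < upper_whisker:
        return "positive skew"       # median nearer Q1, longer upper whisker
    if lower_box > upper_box and lower_whisker > upper_whisker:
        return "negative skew"       # median nearer Q3, longer lower whisker
    if lower_box == upper_box and lower_whisker == upper_whisker:
        return "roughly symmetric"
    return "no clear skew"           # box and whiskers disagree


# Hypothetical summaries chosen to show each outcome.
print(describe_skew(10, 20, 25, 40, 70))   # positive skew
print(describe_skew(10, 40, 55, 60, 70))   # negative skew
print(describe_skew(10, 30, 40, 50, 70))   # roughly symmetric
```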

Using a cumulative frequency graph to estimate proportions

Beyond the median and quartiles, a cumulative frequency curve lets you estimate how many values lie above or below a given figure. To find the number of values below a value $v$, go up from $v$ on the horizontal axis to the curve and across to read the cumulative frequency; to find the number above $v$, subtract that reading from the total $n$. For example, with $n = 200$ apples, if $150$ have a mass of $140$ g or less, then $200 - 150 = 50$ apples are heavier than $140$ g. Converting such a count to a percentage or a probability is a frequent follow-up.
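The follow-up conversion is simple arithmetic, sketched here with the figures from the apples example. The function name `count_above` is an illustrative choice.

```python
def count_above(total, cumulative_at_v):
    """Number of values greater than v, given the cumulative frequency read at v."""
    return total - cumulative_at_v


n = 200                  # total number of apples
at_or_below_140 = 150    # cumulative frequency read from the curve at 140 g

above = count_above(n, at_or_below_140)
percentage_above = 100 * above / n
probability_above = above / n

print(above, percentage_above, probability_above)  # 50 25.0 0.25
```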

Exam-style practice questions

Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Edexcel 1ST0 2019 (4 marks)
The cumulative frequency table for the masses, in grams, of $80$ apples is: $\le 100$ g, $8$; $\le 120$ g, $26$; $\le 140$ g, $54$; $\le 160$ g, $72$; $\le 180$ g, $80$. Use it to estimate (a) the median and (b) the interquartile range.

Worked answer

With $n = 80$: the median is at the $\frac{80}{2} = 40$th value, the lower quartile at the $\frac{80}{4} = 20$th, and the upper quartile at the $\frac{3 \times 80}{4} = 60$th.

Reading from a cumulative frequency curve: the $40$th value falls in the $\le 140$ class (cumulative $54$ reached after $26$), giving a median of about $132$ g. The $20$th value gives $LQ \approx 116$ g and the $60$th gives $UQ \approx 145$ g.

IQR $= UQ - LQ \approx 145 - 116 = 29$ g.

Markers reward using $\frac{n}{2}$, $\frac{n}{4}$ and $\frac{3n}{4}$ to locate the values, sensible readings from the curve, and the IQR as $UQ - LQ$.

Edexcel 1ST0 2021 (4 marks)
Two box plots compare the daily sales, in GBP, of two shops. Shop A: minimum $20$, $LQ$ $40$, median $55$, $UQ$ $70$, maximum $100$. Shop B: minimum $30$, $LQ$ $50$, median $60$, $UQ$ $66$, maximum $90$. Compare the two distributions, referring to an average and a measure of spread.

Worked answer

Average: Shop B has a higher median ($60$ vs $55$), so on average Shop B takes more per day.

Spread: Shop A's IQR $= 70 - 40 = 30$; Shop B's IQR $= 66 - 50 = 16$. Shop B's smaller IQR (and smaller range, $60$ vs $80$) means its sales are more consistent.

Conclusion: Shop B has higher and more consistent daily sales than Shop A.

Markers reward one comparison of an average (median), one comparison of spread (IQR or range) with values, and a conclusion in context.
