How do cumulative frequency graphs and box plots show the median, quartiles and spread?
Cumulative frequency diagrams (discrete and grouped); estimating the median, quartiles and percentiles; box plots; comparing distributions using box plots and the interquartile range.
A focused answer to Edexcel GCSE Statistics on cumulative frequency diagrams and box plots, covering plotting cumulative frequency, estimating the median, quartiles and percentiles, drawing box plots, and comparing distributions using the median and interquartile range.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel codes 2a.03, 2c.01 and 2c.05 require you to draw and read cumulative frequency diagrams (discrete and grouped), estimate the median, quartiles and percentiles from them, draw and interpret box plots, and compare distributions using a median and a measure of spread. Cumulative frequency curves and box plots are among the most reliable sources of marks in the qualification, because the method is the same every time.
Cumulative frequency diagrams
The crucial plotting rule is to plot each cumulative frequency against the upper class boundary, because by the top of a class all of its values have been counted. The curve never falls and levels off at the total frequency $n$. Plotting against midpoints is a common and costly error.
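The plotting rule can be sketched in a few lines of Python. The class boundaries and frequencies below are made up for illustration (they are not taken from any exam paper), and the helper name `cumulative_points` is hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(100, 120, 6), (120, 140, 14), (140, 160, 22), (160, 180, 12), (180, 200, 6)]


def cumulative_points(classes):
    """Return the points to plot: (upper class boundary, cumulative frequency).

    The first point sits at the lowest class boundary with cumulative
    frequency 0, so the curve starts on the horizontal axis.
    """
    upper_boundaries = [upper for _, upper, _ in classes]
    running_totals = list(accumulate(freq for _, _, freq in classes))
    start = (classes[0][0], 0)
    return [start] + list(zip(upper_boundaries, running_totals))


points = cumulative_points(classes)
print(points)
# [(100, 0), (120, 6), (140, 20), (160, 42), (180, 54), (200, 60)]
```

Each running total is paired with the top of its class, never the midpoint, and the last cumulative frequency (60 here) equals the total frequency.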
Estimating median, quartiles and percentiles
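From a cumulative frequency curve you find the position on the cumulative frequency axis, read horizontally across to the curve, then down to the horizontal axis. With total frequency $n$:
- Median: go up to the $\frac{n}{2}$th value, across to the curve, down to the axis.
- Lower quartile ($Q_1$): use the $\frac{n}{4}$th value.
- Upper quartile ($Q_3$): use the $\frac{3n}{4}$th value.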
- A percentile: the $p$th percentile uses the $\frac{p}{100} \times n$th value (for example, the 90th percentile uses $0.9n$).
Edexcel uses $\frac{n}{4}$, $\frac{n}{2}$ and $\frac{3n}{4}$ on the cumulative frequency axis (not $\frac{n+1}{2}$, which is for a small ordered list). Because the curve gives an estimate, slightly different sensible readings are accepted.
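As a rough illustration of the "across, then down" reading, the sketch below joins the plotted points with straight lines and interpolates between them. This is an assumption made for the example: a smooth hand-drawn curve will give slightly different readings. The data points and the function name `estimate_value` are hypothetical.

```python
def estimate_value(points, target_cf):
    """Estimate the data value at a given cumulative frequency.

    `points` is a list of (upper boundary, cumulative frequency) pairs.
    Straight-line interpolation stands in for reading a drawn curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1 and cf1 > cf0:
            fraction = (target_cf - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("target cumulative frequency is outside the plotted range")


# Hypothetical cumulative frequency points for 60 items (masses in grams).
points = [(100, 0), (120, 6), (140, 20), (160, 42), (180, 54), (200, 60)]
n = points[-1][1]

median = estimate_value(points, n / 2)        # n/2 = 30th value
q1 = estimate_value(points, n / 4)            # n/4 = 15th value
q3 = estimate_value(points, 3 * n / 4)        # 3n/4 = 45th value
p90 = estimate_value(points, 90 * n / 100)    # 90th percentile = 54th value

print(round(median, 1), round(q1, 1), round(q3, 1), round(p90, 1))
```

With these made-up figures the median comes out at about 149 g and the upper quartile at 165 g. Readings taken by eye from a drawn curve would be expected to land close to, but not exactly on, these values.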
Box plots
A box plot (box and whisker diagram) summarises a distribution with five numbers: minimum, lower quartile, median, upper quartile and maximum. The box spans $Q_1$ to $Q_3$ (the middle 50% of the data), a line inside marks the median, and the whiskers reach out to the minimum and maximum. The width of the box is the interquartile range, $\text{IQR} = Q_3 - Q_1$, a measure of spread that ignores extreme values.
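One way to keep the five numbers together is a small record type. The sketch below is illustrative only: the class name `BoxPlot`, its field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxPlot:
    """The five numbers a box plot displays (hypothetical helper type)."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    def __post_init__(self):
        # The five values must be in non-decreasing order to be drawable.
        values = [self.minimum, self.q1, self.median, self.q3, self.maximum]
        if values != sorted(values):
            raise ValueError("five-number summary must be in ascending order")

    @property
    def iqr(self):
        """Width of the box: upper quartile minus lower quartile."""
        return self.q3 - self.q1

    @property
    def data_range(self):
        """Distance between the whisker ends: maximum minus minimum."""
        return self.maximum - self.minimum


# Made-up summary of apple masses in grams.
apples = BoxPlot(minimum=105, q1=133, median=149, q3=165, maximum=198)
print(apples.iqr, apples.data_range)  # 32 93
```

The IQR (32 here) depends only on the quartiles, so moving the maximum from 198 to 250 would change the range but leave the IQR untouched.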
Comparing distributions
When two box plots (or two cumulative frequency curves) are compared, Edexcel expects two comparisons in context: one of an average (compare the medians) and one of spread (compare the IQRs or ranges). A higher median means higher values on average; a smaller IQR means more consistent values. Always finish with a sentence relating the comparison to the situation.
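The two-comparison structure can be sketched as a function that produces one statement about the average and one about spread. The class names, marks and function name `compare` are invented for illustration, and the sketch assumes the medians and IQRs are not tied. The final sentence in context still has to be written by hand.

```python
def compare(name_a, a, name_b, b):
    """Return an average comparison and a spread comparison.

    `a` and `b` are dicts with keys "q1", "median" and "q3".
    Assumes neither the medians nor the IQRs are equal.
    """
    iqr_a = a["q3"] - a["q1"]
    iqr_b = b["q3"] - b["q1"]
    higher = name_a if a["median"] > b["median"] else name_b
    steadier = name_a if iqr_a < iqr_b else name_b
    return [
        f"{higher} has the higher median ({a['median']} vs {b['median']}), "
        "so its values are higher on average.",
        f"{steadier} has the smaller IQR ({iqr_a} vs {iqr_b}), "
        "so its values are more consistent.",
    ]


# Made-up test marks for two classes.
class_x = {"q1": 35, "median": 50, "q3": 65}
class_y = {"q1": 48, "median": 56, "q3": 62}

statements = compare("Class X", class_x, "Class Y", class_y)
for line in statements:
    print(line)
```

Quoting the values inside each statement mirrors what the comparison needs: the figures for both distributions, then what they mean.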
A box plot also reveals skewness at a glance. If the median sits closer to the lower quartile and the upper whisker is longer, the data has positive skew (a longer tail of high values); if the median is closer to the upper quartile with a longer lower whisker, the data has negative skew. A median in the centre of the box, with whiskers of similar length, suggests a roughly symmetric distribution. Edexcel may ask you to comment on the shape as well as the average and spread, so look at where the median falls within the box.
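The position of the median within the box can be checked numerically. The sketch below looks only at the quartiles and ignores whisker lengths, which is a simplification. The function name `describe_skew` and the sample values are hypothetical.

```python
def describe_skew(q1, median, q3):
    """Classify skew from where the median sits inside the box.

    Compares the gap below the median with the gap above it.
    Whisker lengths are deliberately ignored in this sketch.
    """
    lower_gap = median - q1
    upper_gap = q3 - median
    if upper_gap > lower_gap:
        return "positive skew"      # median nearer the lower quartile
    if upper_gap < lower_gap:
        return "negative skew"      # median nearer the upper quartile
    return "roughly symmetric"


print(describe_skew(30, 35, 50))  # positive skew
print(describe_skew(30, 45, 50))  # negative skew
print(describe_skew(30, 40, 50))  # roughly symmetric
```

On real data the two gaps are rarely exactly equal, so a small difference would normally still be described as roughly symmetric.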
Using a cumulative frequency graph to estimate proportions
Beyond the median and quartiles, a cumulative frequency curve lets you estimate how many values lie above or below a given figure. To find the number of values below a value $x$, go up from $x$ on the horizontal axis to the curve and across to read the cumulative frequency; to find the number above $x$, subtract that reading from the total $n$. For example, if $k$ of $n$ apples have a mass of $x$ g or less, then $n - k$ apples are heavier than $x$ g. Converting such a count to a percentage or a probability is a frequent follow-up.
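The "up, then across" reading and the subtraction can be sketched as follows. Straight-line interpolation between plotted points is assumed in place of a drawn curve, and the data and the function name `cumulative_frequency_at` are made up for illustration.

```python
def cumulative_frequency_at(points, x):
    """Estimate how many values are less than or equal to x.

    `points` is a list of (upper boundary, cumulative frequency) pairs,
    joined here by straight lines rather than a smooth curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return cf0 + (x - x0) / (x1 - x0) * (cf1 - cf0)
    raise ValueError("x is outside the plotted range")


# Hypothetical cumulative frequency points for 60 apples (masses in grams).
points = [(100, 0), (120, 6), (140, 20), (160, 42), (180, 54), (200, 60)]
n = points[-1][1]

below = cumulative_frequency_at(points, 170)   # apples of 170 g or less
above = n - below                              # apples heavier than 170 g
percent_above = 100 * above / n

print(below, above, percent_above)  # 48.0 12.0 20.0
```

In this made-up case 48 of the 60 apples are 170 g or less, so 12 are heavier, which is 20% of the total or a probability of 0.2.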
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 1ST0 2019 (4 marks)
The cumulative frequency table for the masses, in grams, of apples is: g, ; g, ; g, ; g, ; g, . Use it to estimate (a) the median and (b) the interquartile range.
With : the median is at the th value, the lower quartile at the th, and the upper quartile at the th.
Reading from a cumulative frequency curve: the th value falls in the class (cumulative reached after ), giving a median of about g. The th value gives g and the th gives g.
IQR g.
Markers reward using $\frac{n}{4}$, $\frac{n}{2}$ and $\frac{3n}{4}$ to locate the values, sensible readings from the curve, and the IQR as $Q_3 - Q_1$.
Edexcel 1ST0 2021 (4 marks)
Two box plots compare the daily sales, in GBP, of two shops. Shop A: minimum , , median , , maximum . Shop B: minimum , , median , , maximum . Compare the two distributions, referring to an average and a measure of spread.
Average: Shop B has a higher median ( vs ), so on average Shop B takes more per day.
Spread: Shop A's IQR ; Shop B's IQR . Shop B's smaller IQR (and smaller range, vs ) means its sales are more consistent.
Conclusion: Shop B has higher and more consistent daily sales than Shop A.
Markers reward one comparison of an average (median), one comparison of spread (IQR or range) with values, and a conclusion in context.
Related dot points
- Histograms for continuous data with equal and unequal class widths; frequency density; using area to represent frequency; estimating frequencies within a class; correct use of class boundaries.
A focused answer to Edexcel GCSE Statistics on histograms, covering continuous data and class boundaries, equal and unequal class widths, frequency density, why area represents frequency, and estimating frequencies within a class interval at Higher tier.
- Bar charts (including multiple and composite), line graphs, frequency polygons, population pyramids and choropleth maps; representing, interpreting and comparing data sets shown graphically.
A focused answer to Edexcel GCSE Statistics on charts and graphs, covering simple, multiple and composite bar charts, line graphs, frequency polygons, population pyramids and choropleth maps, and how to interpret and compare data sets displayed graphically.
- Mode, median and mean for discrete and grouped data; estimating the mean of grouped data with midpoints; linear interpolation for the median; weighted and geometric mean; effect of changes and transformations on averages.
A focused answer to Edexcel GCSE Statistics on averages, covering mode, median and mean for discrete and grouped data, estimating the mean with class midpoints, linear interpolation for the median, weighted and geometric mean at Higher tier, and the effect of changes and transformations.
- Range, quartiles, interquartile range, percentiles, interpercentile and interdecile range; choosing an appropriate measure of spread; pairing a measure of spread with a measure of central tendency.
A focused answer to Edexcel GCSE Statistics on measures of spread, covering range, quartiles, interquartile range, percentiles, interpercentile and interdecile range, choosing an appropriate measure, and pairing a measure of spread with the right average.
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification, Pearson Edexcel (2017)