How do you calculate and choose the right average for a data set?
Mode, median and mean for discrete and grouped data; estimating the mean of grouped data with midpoints; linear interpolation for the median; weighted and geometric mean; effect of changes and transformations on averages.
A focused answer to Edexcel GCSE Statistics on averages, covering mode, median and mean for discrete and grouped data, estimating the mean with class midpoints, linear interpolation for the median, weighted and geometric mean at Higher tier, and the effect of changes and transformations.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel codes 2b.01 to 2b.03 require you to calculate the mode, median and arithmetic mean for discrete and grouped data, estimate the mean of grouped data using class midpoints, find the median of grouped data by linear interpolation, and (Higher tier) calculate the weighted mean and geometric mean. You must also justify which average suits a context and understand the effect on each average of adding or removing a value, or of simple transformations (scaling and translating the data).
Mode, median and mean
For a small ordered list of $n$ values, the median is at position $\frac{n+1}{2}$. For frequency data the mean is
$$\bar{x} = \frac{\sum fx}{\sum f}$$
The most common error is dividing by the number of categories instead of $\sum f$, the total frequency.
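As a rough illustration of the frequency-table mean, the Python sketch below uses a made-up table (number of siblings against number of pupils; the figures are hypothetical and not taken from any exam paper). It also shows the value the common slip would give.

```python
from statistics import median

# Hypothetical frequency table (illustrative only): siblings -> number of pupils
freq = {0: 4, 1: 7, 2: 5, 3: 3, 4: 1}

total_f = sum(freq.values())                    # total frequency, sum of f
total_fx = sum(x * f for x, f in freq.items())  # sum of (value x frequency)

mean = total_fx / total_f
wrong_mean = total_fx / len(freq)  # the common slip: dividing by the number of categories

# Expand the table into an ordered list to read off the median and mode.
values = sorted(x for x, f in freq.items() for _ in range(f))
med = median(values)
mode = max(freq, key=freq.get)  # value with the highest frequency

print(mean, wrong_mean, med, mode)
```

With these figures the slip gives a "mean" larger than any value in the table, which is a quick sanity check worth making in an exam.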
Grouped data: estimating the mean
For grouped data you do not know the individual values, so you assume each lies at its class midpoint $x$. The mean is then estimated as $\frac{\sum fx}{\sum f}$ using those midpoints. Because of this assumption the answer is an estimate, and Edexcel regularly asks you to state why. Use equal class widths at Foundation; Higher tier may use unequal widths.
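The midpoint method can be sketched in a few lines of Python. The class boundaries and frequencies below are invented for illustration, and the `(lower, upper, frequency)` layout is just one convenient way to hold a grouped table.

```python
# Hypothetical grouped table (illustrative only): (lower, upper, frequency)
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 5), (30, 40, 2)]

total_f = 0
total_fx = 0.0
for lower, upper, f in classes:
    midpoint = (lower + upper) / 2   # assumed position of every value in the class
    total_f += f
    total_fx += f * midpoint

estimated_mean = total_fx / total_f
print(f"Estimated mean = {total_fx} / {total_f} = {estimated_mean}")
```

Because every value is placed at its class midpoint, the result is an estimate rather than the exact mean of the underlying data.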
Grouped data: median by interpolation
The median of grouped data is found by linear interpolation within the class that contains the middle value. Locate the $\frac{n}{2}$th value (where $n$ is the total frequency), find which class it falls in, then assume the values are evenly spread across that class:
$$\text{median} = L + \frac{\frac{n}{2} - F}{f} \times w$$
where $L$ is the lower boundary of the median class, $F$ the cumulative frequency before it, $f$ the median class frequency and $w$ its width.
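A minimal Python sketch of the interpolation follows. It assumes the usual form of the rule (lower boundary, plus the fraction of the way into the class, times the class width) and the $\frac{n}{2}$ convention for the position of the median; the function name and the table are hypothetical.

```python
def interpolated_median(classes):
    """Estimate the median of grouped data by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency), in order.
    Assumes the n/2 convention for the position of the median.
    """
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            # Values are assumed to be evenly spread across the median class.
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("classes must contain at least one value")

# Hypothetical grouped table (illustrative only)
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 5), (30, 40, 2)]
median_estimate = interpolated_median(classes)
print(round(median_estimate, 2))
```

Here the 10th value falls in the 10 to 20 class, 6 values into a class of 9, so the estimate sits two thirds of the way across that class.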
Weighted and geometric mean
The weighted mean gives different importance to different values:
$$\text{weighted mean} = \frac{\sum wx}{\sum w}$$
It is used when components count unequally (for example a grade made of exam and coursework). The geometric mean of $n$ values is the $n$th root of their product, $\sqrt[n]{x_1 \times x_2 \times \cdots \times x_n}$, and suits averaging growth rates or ratios, where the arithmetic mean would mislead.
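Both Higher-tier means can be sketched in Python as below. The helper names, marks, weights and growth multipliers are all hypothetical, chosen so that the difference between the geometric and arithmetic mean is easy to see.

```python
import math

def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical grade: exam mark 60 with weight 70, coursework mark 80 with weight 30
final_mark = weighted_mean([60, 80], [70, 30])

# Hypothetical growth multipliers: a 25% rise followed by a 20% fall
multipliers = [1.25, 0.80]
gm = geometric_mean(multipliers)            # close to 1: no overall change
am = sum(multipliers) / len(multipliers)    # above 1: wrongly suggests growth

print(final_mark, round(gm, 4), round(am, 4))
```

A 25% rise followed by a 20% fall returns to the starting value, which the geometric mean reflects and the arithmetic mean does not.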
Effect of changes and transformations
Edexcel expects you to predict how averages respond to changes. Adding a value above the mean raises the mean; removing the largest value lowers the mean and may change the median. Under a transformation (Higher tier), if every value is multiplied by $a$ and then increased by $b$, the mean, mode and median are all transformed the same way: new average $= a \times \text{old average} + b$.
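These effects can be checked numerically. The short Python sketch below uses an invented data set and an arbitrary choice of multiplier 2 and shift 5; it illustrates the rules on one example rather than proving them.

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7]   # hypothetical data set
a, b = 2, 5              # multiply every value by a, then add b

transformed = [a * x + b for x in data]

# Each average of the transformed data, next to a * (old average) + b.
for average in (mean, median, mode):
    print(average.__name__, average(transformed), a * average(data) + b)

# Adding a value above the mean raises the mean.
mean_after_adding = mean(data + [10])

# Removing the largest value lowers the mean.
without_largest = sorted(data)[:-1]
mean_after_removing = mean(without_largest)

print(mean(data), mean_after_adding, mean_after_removing)
```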
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 1ST0 2019, 4 marks
The table shows the number of goals scored in matches. goals: matches; : ; : ; : ; : . Calculate the mean number of goals per match.
Mean $= \frac{\sum fx}{\sum f}$, where $x$ is the number of goals and $f$ the frequency.
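Multiply each number of goals by its frequency and add the results to get $\sum fx$.
Add the frequencies to get $\sum f$, so mean $= \sum fx \div \sum f$ goals per match.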
Markers reward the $\sum fx$ total, dividing by the total frequency $\sum f$, and the answer. A common slip is dividing by the number of categories instead of the total frequency $\sum f$.
Edexcel 1ST0 2022, 4 marks
The table shows the heights, in cm, of plants. : ; : ; : ; : . Estimate the mean height, and explain why your answer is only an estimate.
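Use the class midpoints as the $x$ values.
Multiply each midpoint by its class frequency and add the results to get $\sum fx$.
Divide by the total frequency $\sum f$ to get the estimated mean in cm.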
It is only an estimate because the data is grouped: the exact heights are unknown, so each value is assumed to lie at its class midpoint.
Markers reward using midpoints, the $\sum fx$ total, the estimated mean in cm, and the reason (grouped data, midpoints assumed).
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification — Pearson Edexcel (2017)