How do you compare two data sets fairly?

Comparing distributions using an average and a measure of spread, skewness, and writing comparisons in context.

A focused answer to AQA GCSE Statistics on comparing distributions, covering how to compare two data sets using an average and a measure of spread, describe skewness from the mean, median and mode, and write comparisons in context.

Generated by Claude Opus 4.8
8 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

AQA wants you to compare two data sets correctly by using one average and one measure of spread, describe the skewness of a distribution, and always write your comparison in the context of the real situation. "Compare" questions are reliable mark-earners on both papers, but only if you follow the structure examiners expect.

Comparing using an average and a measure of spread

The single most important rule is to use one of each: one average and one measure of spread. Quoting two averages, or two spreads, scores at most half the marks because you have not described both the centre and the variability. A comparison of centre alone cannot tell you which group is more consistent, and a comparison of spread alone cannot tell you which group is typically higher, so both are needed for a complete picture. Match the measures sensibly: if either data set has outliers, compare medians and interquartile ranges (both resist outliers); if both are roughly symmetrical, comparing means and standard deviations is fine. For example: "Class A has a higher median mark, so its students typically scored better, but Class A also has a larger interquartile range, so its results were less consistent." That sentence earns the average mark, the spread mark, and the context marks in one go.
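As a rough illustration of the "one of each" structure, the sketch below summarises two data sets by median and interquartile range and then writes one sentence for each measure. The marks are invented for the example, the function names `summarise` and `compare` are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quartile convention used in a given mark scheme.

```python
import statistics

def summarise(data):
    """Return (median, interquartile range) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # lower and upper quartiles
    return statistics.median(data), q3 - q1

def compare(name_a, data_a, name_b, data_b):
    """Write one average sentence and one spread sentence, both in context.

    Ties are not handled: this sketch assumes the medians and the
    interquartile ranges of the two groups are different.
    """
    med_a, iqr_a = summarise(data_a)
    med_b, iqr_b = summarise(data_b)
    higher = name_a if med_a > med_b else name_b
    tighter = name_a if iqr_a < iqr_b else name_b
    return [
        f"{higher} has the higher median ({max(med_a, med_b):g} versus "
        f"{min(med_a, med_b):g}), so its marks were typically higher.",
        f"{tighter} has the smaller interquartile range ({min(iqr_a, iqr_b):g} "
        f"versus {max(iqr_a, iqr_b):g}), so its marks were more consistent.",
    ]

# Invented test marks, for illustration only.
class_a = [52, 55, 58, 60, 61, 63, 70]
class_b = [40, 48, 57, 59, 66, 72, 85]

for sentence in compare("Class A", class_a, "Class B", class_b):
    print(sentence)
```

Each sentence names the measure, quotes both values, and ends with what it means for the marks, which is the pattern the comparison is marked on.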

Describing skewness

A reliable memory aid: the mean is dragged toward the long tail. If the tail is on the high (right) side, the mean is pulled up above the median, giving positive skew. If the tail is on the low (left) side, the mean is pulled down below the median, giving negative skew. You can also read skew from a box plot: if the median line sits closer to the lower quartile (the right whisker is longer), the distribution is positively skewed; if the median sits closer to the upper quartile, it is negatively skewed.
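The box plot rule can be sketched as a comparison of the two gaps either side of the median. This is a minimal illustration: the function name `skew_from_quartiles` is hypothetical, the quartile values are invented, and it only looks at the box (not the whiskers).

```python
def skew_from_quartiles(q1, median, q3):
    """Describe skew from where the median sits inside the box."""
    lower_gap = median - q1   # distance from lower quartile to median
    upper_gap = q3 - median   # distance from median to upper quartile
    if upper_gap > lower_gap:
        return "positive skew"   # median sits closer to the lower quartile
    if upper_gap < lower_gap:
        return "negative skew"   # median sits closer to the upper quartile
    return "roughly symmetrical"

print(skew_from_quartiles(20, 24, 35))  # positive skew
print(skew_from_quartiles(20, 31, 35))  # negative skew
print(skew_from_quartiles(20, 25, 30))  # roughly symmetrical
```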

The order of the three averages is a quick test you can quote in an exam. For a positively skewed distribution the order is mode, then median, then mean (mode lowest, mean highest); for a negatively skewed distribution the order reverses to mean, then median, then mode; and for a symmetrical distribution with a single peak all three roughly coincide. Real data often shows skew because of natural limits: salaries, house prices and reaction times cannot fall below zero but have no ceiling, so they pile up at the low end with a long tail of high values, producing positive skew. Recognising this lets you predict the skew before any calculation and choose the median and interquartile range as the fairer summary.

Writing comparisons in context

Exam-style practice questions

Practice questions written in the style of AQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AQA 2019 (4 marks): Two classes sat the same test. Class A: median 58, interquartile range 12. Class B: median 64, interquartile range 20. Compare the performance of the two classes.

Compare the averages in context: Class B has the higher median (64 versus 58), so Class B typically scored higher on the test.

Compare the spread in context: Class A has the smaller interquartile range (12 versus 20), so Class A's results were more consistent.

Markers award one mark for an average comparison, one for a spread comparison, and further marks for stating both in context (not as bare numbers). A comparison using two averages or two spreads loses marks.

AQA 2021 (3 marks): A distribution of salaries has mean £34,000, median £29,000 and mode £26,000. (a) Describe the skew. (b) Explain what the skew tells you about the salaries.

(a) The mean exceeds the median, which exceeds the mode (£34,000 > £29,000 > £26,000), so the distribution is positively skewed.

(b) A positive skew means there is a long tail of high salaries: most people earn around the mode/median, but a few very high earners pull the mean upward.

Markers reward identifying positive skew from the order of the averages and a contextual interpretation of the high-value tail.
