
How do you describe the shape of a distribution and decide whether a value is an outlier?

Skewness by inspection and by calculation; interpreting positive and negative skew; identifying outliers by inspection and using the quartile and standard deviation rules; commenting on outliers in context.

A focused answer to Edexcel GCSE Statistics on skewness and outliers, covering determining skewness by inspection and the skewness formula, interpreting positive and negative skew, identifying outliers using the quartile and standard deviation rules, and commenting on outliers in context.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

Edexcel codes 2a.09, 2a.10, 2c.02 and 2c.03 require you to determine skewness by inspection and (Higher tier) by calculation, to interpret positive and negative skew, to identify outliers by inspection and using the quartile and standard deviation rules, and to comment on outliers with reference to the original data. Together these ideas link the shape of a distribution to the choice of average and measure of spread.

Skewness by inspection

By inspection you can detect skew from the order of the averages and from the quartiles:

  • Positive skew: $\text{mean} > \text{median} > \text{mode}$, and $\text{median} - Q_1 < Q_3 - \text{median}$ (more spread above the median).
  • Negative skew: $\text{mean} < \text{median} < \text{mode}$, and $\text{median} - Q_1 > Q_3 - \text{median}$ (more spread below the median).

On a box plot, a longer right whisker and a median nearer $Q_1$ signal positive skew.
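
The quartile comparison above can be sketched in a few lines of Python. This is only an illustration: the function name `skew_from_quartiles` and the quartile values are made up for the example, and exactly equal gaps are reported as roughly symmetric.

```python
def skew_from_quartiles(q1, median, q3):
    """Classify skew by comparing the spread on each side of the median."""
    lower_gap = median - q1  # spread below the median
    upper_gap = q3 - median  # spread above the median
    if lower_gap < upper_gap:
        return "positive skew"
    if lower_gap > upper_gap:
        return "negative skew"
    return "roughly symmetric"


# Hypothetical quartiles, not taken from any exam paper.
print(skew_from_quartiles(20, 26, 40))  # gaps of 6 and 14
print(skew_from_quartiles(20, 34, 40))  # gaps of 14 and 6
```

In the first call the median sits nearer $Q_1$, so the upper half of the box is longer and the result is positive skew; the second call reverses the gaps and gives negative skew.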

Skewness by calculation

At Higher tier you calculate skewness with the formula on the formulae sheet:

$$\text{skewness} = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$$

A positive result means positive skew, a negative result means negative skew, and a result near zero means roughly symmetric. The sign matters more than the exact size at GCSE; always state the type of skew and describe the longer tail.
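
As a rough check of the calculation, the sketch below applies the formula $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$ to a small made-up data set. The data values and the function name `skewness` are assumptions for illustration, and the standard deviation is assumed to be the divide-by-$n$ version, computed here with `statistics.pstdev`.

```python
import statistics


def skewness(mean, median, sd):
    """Skewness as 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


# Made-up data with one fairly large value in the upper tail.
data = [2, 3, 3, 4, 5, 6, 7, 9, 15]

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.pstdev(data)  # assumption: divide-by-n standard deviation

result = skewness(mean, median, sd)
kind = "positive" if result > 0 else "negative" if result < 0 else "no"
print(f"mean={mean}, median={median}, skewness={result:.2f} ({kind} skew)")
```

Here the mean (6) is above the median (5), so the result is positive and the longer tail is on the side of the higher values.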

Interpreting skew

Edexcel code 2a.10 wants interpretation, not just classification. For positive skew, values above the median are more spread out than those below, so a few high values pull the mean up; this is why the median is often the better average for positively skewed data such as incomes or house prices. Linking the skew to the choice of average is a frequent extended-question theme.
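
A small numerical illustration may help here. The incomes below are hypothetical (in thousands of pounds) and chosen so that one high value pulls the mean well above the median.

```python
import statistics

# Hypothetical incomes in thousands of pounds, with one very high earner.
incomes = [18, 21, 22, 24, 25, 27, 30, 95]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
below_mean = sum(1 for x in incomes if x < mean_income)

print(mean_income)    # 32.75
print(median_income)  # 24.5
print(below_mean)     # 7 of the 8 incomes are below the mean
```

Seven of the eight incomes fall below the mean, so the mean does not describe a typical person in this data set, while the median stays close to the middle of the bulk of the values.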

Identifying outliers

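The quartile rule treats a value as an outlier if it is more than $1.5 \times IQR$ beyond a quartile, that is, below $LQ - 1.5 \times IQR$ or above $UQ + 1.5 \times IQR$. The standard deviation rule treats a value as an outlier if it lies outside $\mu \pm 3\sigma$, that is, more than three standard deviations from the mean. You may need to calculate the boundaries first, then compare the value with them. The quartile rule pairs with the median and IQR; the $\mu \pm 3\sigma$ rule pairs with the mean and standard deviation.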
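
Both rules come down to "find the two boundaries, then compare". The sketch below assumes the quartile rule means more than $1.5 \times IQR$ beyond a quartile and the standard deviation rule means outside $\mu \pm 3\sigma$; the function names and the numbers are invented for illustration.

```python
def quartile_bounds(lq, uq):
    """Boundaries for the quartile rule: 1.5 x IQR beyond each quartile."""
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr


def three_sigma_bounds(mean, sd):
    """Boundaries for the standard deviation rule: mean +/- 3 sd."""
    return mean - 3 * sd, mean + 3 * sd


def is_outlier(value, bounds):
    """A value is an outlier if it lies strictly outside the boundaries."""
    low, high = bounds
    return value < low or value > high


# Hypothetical summary statistics.
q_bounds = quartile_bounds(12, 20)      # IQR = 8
s_bounds = three_sigma_bounds(40, 5)

print(q_bounds, is_outlier(35, q_bounds))  # (0.0, 32.0) True
print(s_bounds, is_outlier(52, s_bounds))  # (25, 55) False
```

Working out the boundaries once and then testing each suspicious value against them mirrors the order in which marks are usually earned: boundary first, comparison second, conclusion last.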

Commenting on outliers

Code 2c.03 stresses commenting on an outlier in context: it may be a genuine unusual value (a real, important extreme) or the result of a recording error (a typo, wrong units). You should not delete an outlier automatically; decide whether to keep, correct or exclude it based on the context, and explain your reasoning.

Exam-style practice questions

Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Edexcel 1ST0 2020, 4 marks

A data set has mean $52$, median $48$ and standard deviation $10$. (a) Calculate the skewness using the formula $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$. (b) State the type of skew and describe what it tells you about the distribution.

Worked answer

(a) Skewness $= \frac{3(52 - 48)}{10} = \frac{3 \times 4}{10} = \frac{12}{10} = 1.2$.

(b) The skewness is positive, so the distribution has positive skew. The mean is greater than the median, the tail stretches to the higher values, and values above the median are more spread out than those below.

Markers reward correct substitution into the formula, the value $1.2$, identifying positive skew, and a description of the longer upper tail.

Edexcel 1ST0 2022, 4 marks

A data set has lower quartile $30$, upper quartile $50$ and a value of $85$. (a) Using the rule that an outlier is more than $1.5 \times IQR$ beyond a quartile, determine whether $85$ is an outlier. (b) State two possible reasons an outlier might occur.

Worked answer

(a) $IQR = 50 - 30 = 20$, so $1.5 \times IQR = 30$. The upper boundary is $UQ + 1.5 \times IQR = 50 + 30 = 80$. Since $85 > 80$, the value $85$ is an outlier.

(b) Two reasons, for example: it is a genuine unusual value (a real extreme observation), or it is the result of an error in recording or entering the data.

Markers reward calculating the IQR and the upper boundary $80$, the conclusion that $85$ is an outlier, and two valid reasons.
