How do you describe the shape of a distribution and decide whether a value is an outlier?
Skewness by inspection and by calculation; interpreting positive and negative skew; identifying outliers by inspection and using the quartile and standard deviation rules; commenting on outliers in context.
A focused answer to Edexcel GCSE Statistics on skewness and outliers, covering determining skewness by inspection and the skewness formula, interpreting positive and negative skew, identifying outliers using the quartile and standard deviation rules, and commenting on outliers in context.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel codes 2a.09, 2a.10, 2c.02 and 2c.03 require you to determine skewness by inspection and (Higher tier) by calculation, to interpret positive and negative skew, to identify outliers by inspection and using the quartile and standard deviation rules, and to comment on outliers with reference to the original data. These ideas tie together the shape of a distribution and how its average and spread should be chosen.
Skewness by inspection
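By inspection you can detect skew from the order of the averages and from the quartiles, where $Q_1$, $Q_2$ and $Q_3$ are the lower quartile, median and upper quartile:
- Positive skew: $\text{mode} < \text{median} < \text{mean}$, and $Q_3 - Q_2 > Q_2 - Q_1$ (more spread above the median).
- Negative skew: $\text{mean} < \text{median} < \text{mode}$, and $Q_3 - Q_2 < Q_2 - Q_1$ (more spread below the median).
On a box plot, a longer right whisker and a median nearer the lower quartile signal positive skew; a longer left whisker and a median nearer the upper quartile signal negative skew.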
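The quartile comparison can be sketched in Python. This is a minimal illustration, not an exam method: the function name `skew_by_quartiles` and the quartile values are hypothetical, chosen only to show the gap above the median being compared with the gap below it.

```python
def skew_by_quartiles(q1, q2, q3):
    """Classify skew by comparing the gaps either side of the median.

    q1, q2 and q3 are the lower quartile, median and upper quartile.
    """
    upper_gap = q3 - q2  # spread of the middle half above the median
    lower_gap = q2 - q1  # spread of the middle half below the median
    if upper_gap > lower_gap:
        return "positive skew"
    if upper_gap < lower_gap:
        return "negative skew"
    return "roughly symmetric"


# Hypothetical quartiles, not taken from any exam paper.
print(skew_by_quartiles(10, 14, 24))  # upper gap 10, lower gap 4
print(skew_by_quartiles(10, 20, 24))  # upper gap 4, lower gap 10
```

In an exam the same comparison is done by hand: work out both gaps, then state which side of the median is more spread out.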
Skewness by calculation
At Higher tier you calculate skewness with the formula on the formulae sheet:
$$\text{skewness} = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$$
A positive result means positive skew, a negative result means negative skew, and a result near zero means roughly symmetric. The sign matters more than the exact size at GCSE; always state the type of skew and describe the longer tail.
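As a rough sketch of the calculation, assuming the skewness formula used at this level, 3 × (mean − median) ÷ standard deviation, the steps look like this in Python. The function names, the `tolerance` cut-off for "near zero" and the input values are all hypothetical; the specification does not fix a numerical threshold for "roughly symmetric".

```python
def skewness(mean, median, standard_deviation):
    """Return 3 * (mean - median) / standard deviation."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / standard_deviation


def describe_skew(value, tolerance=0.1):
    """Turn a skewness value into a description.

    tolerance is an assumed cut-off for 'near zero', for illustration only.
    """
    if value > tolerance:
        return "positive skew"
    if value < -tolerance:
        return "negative skew"
    return "roughly symmetric"


# Hypothetical summary statistics, not taken from any exam paper.
high_tail = skewness(mean=50, median=46, standard_deviation=8)
low_tail = skewness(mean=30, median=33, standard_deviation=6)
print(high_tail, describe_skew(high_tail))
print(low_tail, describe_skew(low_tail))
```

The sign follows directly from mean − median, because the standard deviation in the denominator is always positive.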
Interpreting skew
Edexcel code 2a.10 wants interpretation, not just classification. For positive skew, values above the median are more spread out than those below, so a few high values pull the mean up; this is why the median is often the better average for positively skewed data such as incomes or house prices. Linking the skew to the choice of average is a frequent extended-question theme.
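A small numerical illustration may help here. The figures below are hypothetical annual incomes in thousands of pounds, invented for this sketch; it uses Python's standard `statistics` module to show one high value pulling the mean well above the median.

```python
import statistics

# Hypothetical incomes (thousands of pounds); one value is much higher.
incomes = [18, 20, 21, 22, 24, 25, 95]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print("mean:", round(mean_income, 2))
print("median:", median_income)

# The single high value raises the mean but leaves the median unchanged,
# so the median better represents a typical income in this data set.
print("mean exceeds median:", mean_income > median_income)
```

Six of the seven values lie below the mean, which is why the mean would be a misleading "typical" value for data like this.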
Identifying outliers
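Besides inspection of the data or a diagram, two calculation rules are used:
- Quartile rule: a value is an outlier if it is below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$, where $Q_1$ and $Q_3$ are the lower and upper quartiles.
- Standard deviation rule: a value is an outlier if it lies more than $3$ standard deviations from the mean, that is, outside $\text{mean} \pm 3 \times \text{standard deviation}$.
You may need to calculate the boundaries first, then compare. The quartile rule pairs with median and IQR; the standard deviation rule pairs with mean and standard deviation.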
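The "calculate the boundaries, then compare" routine can be sketched as follows, assuming the usual boundaries of 1.5 × IQR beyond the quartiles and 3 standard deviations either side of the mean. The function names and all numbers are hypothetical and serve only to illustrate the two steps.

```python
def quartile_rule_bounds(q1, q3):
    """Boundaries 1.5 * IQR below the lower and above the upper quartile."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def sd_rule_bounds(mean, standard_deviation):
    """Boundaries 3 standard deviations either side of the mean."""
    return mean - 3 * standard_deviation, mean + 3 * standard_deviation


def is_outlier(value, bounds):
    """A value is an outlier if it lies outside the two boundaries."""
    lower, upper = bounds
    return value < lower or value > upper


# Hypothetical summary statistics, not taken from any exam paper.
q_bounds = quartile_rule_bounds(q1=20, q3=30)         # IQR = 10
s_bounds = sd_rule_bounds(mean=50, standard_deviation=4)

print(q_bounds, is_outlier(48, q_bounds))
print(s_bounds, is_outlier(60, s_bounds))
```

Keeping the boundary calculation separate from the comparison mirrors how working is usually set out: boundaries first, then a clear statement of whether the value falls outside them.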
Commenting on outliers
Code 2c.03 stresses commenting on an outlier in context: it may be a genuine unusual value (a real, important extreme) or the result of a recording error (a typo, wrong units). You should not delete an outlier automatically; decide whether to keep, correct or exclude it based on the context, and explain your reasoning.
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 1ST0 2020, 4 marks
A data set has mean , median and standard deviation . (a) Calculate the skewness using the formula $\dfrac{3(\text{mean} - \text{median})}{\text{standard deviation}}$. (b) State the type of skew and describe what it tells you about the distribution.
(a) Skewness .
(b) The skewness is positive, so the distribution has positive skew. The mean is greater than the median, the tail stretches to the higher values, and values above the median are more spread out than those below.
Markers reward correct substitution into the formula, the correct skewness value, identifying positive skew, and a description of the longer upper tail.
Edexcel 1ST0 2022, 4 marks
A data set has lower quartile , upper quartile and a value of . (a) Using the rule that an outlier is more than $1.5 \times \text{IQR}$ beyond a quartile, determine whether this value is an outlier. (b) State two possible reasons an outlier might occur.
(a) , so . The upper boundary is . Since , the value is an outlier.
(b) Two reasons, for example: it is a genuine unusual value (a real extreme observation), or it is the result of an error in recording or entering the data.
Markers reward calculating the IQR and the upper boundary, the conclusion that the value is an outlier, and two valid reasons.
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification — Pearson Edexcel (2017)