How do you choose, draw and read statistical diagrams, and sample data without bias?

Selecting and interpreting statistical diagrams, comparing data sets using measures of centre and spread, identifying outliers and misleading graphs, and choosing an appropriate sampling method.

A focused answer to the SQA Higher Applications of Mathematics content on statistical diagrams and sampling, covering box plots and histograms, comparing distributions, outliers, misleading graphs, data types, and sampling methods.

Generated by Claude Opus 4.8
9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

The SQA wants you to choose the right diagram for a data set, read and compare distributions using centre and spread, spot outliers and misleading graphs, recognise the type of data, and pick a sampling method that avoids bias. This is the descriptive-statistics foundation that the project and the inferential topics build on.

Choosing a statistical diagram

Different diagrams suit different data. A bar chart compares categories with gaps between bars. A histogram shows the distribution of grouped continuous data, with bars touching and area representing frequency. A box plot summarises a distribution with the five-number summary (minimum, lower quartile, median, upper quartile, maximum) and is ideal for comparing two or more groups side by side. A scatter plot shows the relationship between two variables.
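
As a rough illustration of the five-number summary behind a box plot, here is a minimal Python sketch. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data, leaving out the overall median when the count is odd; calculators and software may use other quartile conventions and give slightly different values. The function name and the sample marks are made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    excluding the overall median when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical marks, not taken from any SQA paper.
print(five_number_summary([12, 15, 17, 19, 22, 24, 30]))  # (12, 15, 19, 24, 30)
```

Two box plots drawn from summaries like this can be placed on the same scale to compare groups.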

Comparing data sets

When comparing two distributions, the SQA expects one statement about centre and one about spread, both in context.

  • Centre: the median (middle value, resistant to outliers) or the mean (average, affected by outliers).
  • Spread: the interquartile range $\text{IQR} = Q_3 - Q_1$ (the middle $50\%$, resistant to outliers) or the standard deviation (typical distance from the mean).
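
The resistance of the median compared with the mean can be checked on a small example. The sketch below uses an invented list of marks and adds one extreme value; the numbers are illustrative only.

```python
from statistics import mean, median

# Hypothetical marks; 60 is an invented extreme value.
marks = [10, 12, 13, 14, 16]
with_outlier = marks + [60]

# How far each measure of centre moves when the extreme value is added.
mean_shift = mean(with_outlier) - mean(marks)        # about 7.83
median_shift = median(with_outlier) - median(marks)  # 0.5

print(round(mean_shift, 2), median_shift)
```

The mean moves by almost 8 marks while the median moves by half a mark, which is what "resistant to outliers" means in practice.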

A smaller spread means more consistent data. A higher centre means larger values on average. Phrasing matters: say "class A scored higher on average" and "class A's marks were more consistent", not just the bare numbers.
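
One way to practise the "centre plus spread, in context" pattern is to generate the two sentences from the summary values. This is only a sketch: the function name and wording are assumptions, not SQA marking language, and it does not handle ties. The example values match the class A and class B practice question in this answer.

```python
def compare_in_context(a, b, quantity):
    """a and b are (name, median, iqr) tuples.

    Returns one statement about centre and one about spread.
    Assumes the medians differ and the IQRs differ (no tie handling).
    """
    higher, lower = (a, b) if a[1] >= b[1] else (b, a)
    tighter, wider = (a, b) if a[2] <= b[2] else (b, a)
    centre = (f"On average, {higher[0]} had higher {quantity} "
              f"(median {higher[1]} against {lower[1]}).")
    spread = (f"{tighter[0]}'s {quantity} were more consistent "
              f"(IQR {tighter[2]} against {wider[2]}).")
    return centre, spread

centre, spread = compare_in_context(("Class A", 62, 10), ("Class B", 58, 22), "marks")
print(centre)
print(spread)
```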

Outliers

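An outlier is a value far from the rest of the data. The standard rule uses the interquartile range: a value is an outlier if it lies below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$.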

Compute the two boundaries and check whether the value lies outside them. Outliers may be genuine extreme cases or data-entry errors, and they pull the mean and standard deviation but barely move the median and IQR, which is why those resistant measures are often preferred.
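
The boundary check can be sketched in a few lines of Python, assuming the usual $1.5 \times \text{IQR}$ rule with boundaries $Q_1 - 1.5 \times \text{IQR}$ and $Q_3 + 1.5 \times \text{IQR}$. The function names are illustrative; the example quartiles and value come from the first practice question in this answer.

```python
def outlier_boundaries(q1, q3, k=1.5):
    """Return (lower, upper) boundaries using the k x IQR rule (k = 1.5 by default)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, q1, q3, k=1.5):
    """True if the value lies strictly outside the two boundaries."""
    lower, upper = outlier_boundaries(q1, q3, k)
    return value < lower or value > upper

print(outlier_boundaries(14, 25))  # (-2.5, 41.5)
print(is_outlier(42, 14, 25))      # True
```

A value exactly on a boundary is treated here as not an outlier; that choice is an assumption of this sketch.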

Data types, misleading graphs and sampling

Data types
Data is quantitative (numerical, either discrete counts or continuous measurements) or qualitative (categories). The type guides which diagram and which average suit it.
Misleading graphs
Be alert to a vertical axis that does not start at zero (exaggerating differences), unequal class widths in a histogram, or a truncated or distorted scale. Recognising these is examinable.
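
To see how a vertical axis that does not start at zero exaggerates a difference, compare the ratio of the drawn bar heights with the ratio of the actual values. The sketch below is illustrative only; the function name and the figures (values of 50 and 55, axis starting at 45) are invented.

```python
def drawn_height_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the two bar heights as drawn when the axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical values: 55 is only 10% larger than 50.
honest = drawn_height_ratio(50, 55)           # axis from 0: bars in ratio 1.1
truncated = drawn_height_ratio(50, 55, 45)    # axis from 45: bars in ratio 2.0

print(honest, truncated)
```

With the axis starting at 45, the second bar is drawn twice as tall even though the value is only 10% larger.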
Sampling
When you cannot survey a whole population you take a sample, and the method must avoid bias so the sample represents the population.

Try this

Q1. A data set has $Q_1 = 22$ and $Q_3 = 34$. Find the IQR and the upper outlier boundary. [2 marks]

  • Cue. $\text{IQR} = 12$; upper boundary $= 34 + 1.5 \times 12 = 52$.

Q2. Two teams' scores have medians $18$ and $24$ and IQRs $9$ and $4$. Compare them in context. [2 marks]

  • Cue. The second team scores higher on average (median $24$) and is more consistent (IQR $4$).

Q3. A school wants a sample reflecting its year groups, which differ in size. Name a suitable sampling method and why. [2 marks]

  • Cue. Stratified sampling, because it samples each year group in proportion to its size, representing all groups fairly.
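
The proportional allocation behind stratified sampling can be sketched as follows. This is a minimal illustration, not a prescribed method: the year-group sizes are invented, the function names are assumptions, and simple rounding is used, so for awkward sizes the allocations may not add up exactly to the requested sample size.

```python
import random

def stratified_allocation(stratum_sizes, sample_size):
    """Number to sample from each stratum, in proportion to its size (rounded)."""
    total = sum(stratum_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Randomly sample from each stratum (a list of members) using the allocation."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = stratified_allocation(sizes, sample_size)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

# Hypothetical school: 500 pupils in three year groups, sample of 50.
print(stratified_allocation({"S1": 200, "S2": 150, "S3": 150}, 50))
# {'S1': 20, 'S2': 15, 'S3': 15}
```

Each year group contributes the same fraction of its pupils (here one in ten), so no group is over- or under-represented.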

Exam-style practice questions

Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

SQA Higher Apps style
5 marks
A data set has lower quartile $14$, median $19$ and upper quartile $25$. Find the interquartile range, and use the $1.5 \times \text{IQR}$ rule to decide whether a value of $42$ is an outlier.

The interquartile range is $\text{IQR} = Q_3 - Q_1 = 25 - 14 = 11$ (1 mark).

The upper outlier boundary is $Q_3 + 1.5 \times \text{IQR} = 25 + 1.5 \times 11 = 25 + 16.5 = 41.5$ (2 marks).

Since $42 > 41.5$, the value lies beyond the upper boundary, so it is an outlier (2 marks). Markers reward the correct IQR, the $1.5 \times \text{IQR}$ boundary, and a clear decision against that boundary.

SQA Higher Apps style
4 marks
Two classes sit the same test. Class A has median $62$ and IQR $10$; class B has median $58$ and IQR $22$. Compare the two distributions in context.

Class A has the higher median, $62$ against $58$, so on average class A scored slightly higher (2 marks).

Class A has the smaller IQR, $10$ against $22$, so class A's marks are more consistent and tightly clustered, while class B's marks are more spread out (2 marks). Markers reward one comparison of centre using the medians and one comparison of spread using the IQRs, both written in the context of test marks.