How do you choose, draw and read statistical diagrams, and sample data without bias?
Selecting and interpreting statistical diagrams, comparing data sets using measures of centre and spread, identifying outliers and misleading graphs, and choosing an appropriate sampling method.
A focused answer to the SQA Higher Applications of Mathematics content on statistical diagrams and sampling, covering box plots and histograms, comparing distributions, outliers, misleading graphs, data types, and sampling methods.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
The SQA wants you to choose the right diagram for a data set, read and compare distributions using centre and spread, spot outliers and misleading graphs, recognise the type of data, and pick a sampling method that avoids bias. This is the descriptive-statistics foundation that the project and the inferential topics build on.
Choosing a statistical diagram
Different diagrams suit different data. A bar chart compares categories with gaps between bars. A histogram shows the distribution of grouped continuous data, with bars touching and area representing frequency. A box plot summarises a distribution with the five-number summary (minimum, lower quartile, median, upper quartile, maximum) and is ideal for comparing two or more groups side by side. A scatter plot shows the relationship between two variables.
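As a minimal sketch, the five-number summary behind a box plot can be computed as below. It assumes the median-of-halves method for quartiles (the middle value is left out of both halves when the number of values is odd); calculators and spreadsheets may use other quartile conventions and give slightly different answers. The function name `five_number_summary` and the marks are hypothetical, made up for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumes the median-of-halves method and at least two values.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]            # values below the median position
    upper = data[n // 2 + n % 2 :]    # values above it (skips the middle value if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical test marks, not taken from any SQA paper.
marks = [12, 15, 17, 18, 20, 21, 23, 25, 30]
summary = five_number_summary(marks)
print(summary)  # (12, 16.0, 20, 24.0, 30)
```

These five values are exactly the points marked on a box plot: the ends of the whiskers, the edges of the box, and the line inside it.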
Comparing data sets
When comparing two distributions, the SQA expects one statement about centre and one about spread, both in context.
- Centre: the median (middle value, resistant to outliers) or the mean (average, affected by outliers).
- Spread: the interquartile range (the spread of the middle 50% of the data, resistant to outliers) or the standard deviation (typical distance from the mean).
A smaller spread means more consistent data. A higher centre means larger values on average. Phrasing matters: say "class A scored higher on average" and "class A's marks were more consistent", not just the bare numbers.
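The two-statement comparison can be sketched in code. This is an illustration only: the marks are made up, the helper names `iqr` and `compare` are hypothetical, quartiles are assumed to use the median-of-halves method, and ties between the two groups are not handled.

```python
from statistics import median


def iqr(values):
    """Interquartile range using the median-of-halves method (an assumption)."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[n // 2 + n % 2 :]
    return median(upper) - median(lower)


def compare(name_a, a, name_b, b):
    """Return one statement about centre and one about spread, in context."""
    higher = name_a if median(a) > median(b) else name_b
    tighter = name_a if iqr(a) < iqr(b) else name_b
    centre = f"{higher} scored higher on average (median {max(median(a), median(b))})."
    spread = f"{tighter}'s marks were more consistent (IQR {min(iqr(a), iqr(b))})."
    return centre, spread


# Hypothetical marks for two classes, made up for illustration.
class_a = [52, 55, 57, 58, 60, 61, 63]
class_b = [35, 44, 50, 56, 62, 70, 82]

for sentence in compare("Class A", class_a, "Class B", class_b):
    print(sentence)
```

Note that each sentence pairs the statistic with its meaning in context, which is what separates a full-mark comparison from a list of numbers.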
Outliers
An outlier is a value far from the rest of the data. The standard rule uses the interquartile range: a value is an outlier if it lies below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$, where $\text{IQR} = Q_3 - Q_1$.
Compute the two boundaries and check whether the value lies outside them. Outliers may be genuine extreme cases or data-entry errors. They pull the mean and standard deviation but barely move the median and IQR, which is why those resistant measures are often preferred.
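A short sketch of the boundary check, using the common convention of 1.5 times the IQR beyond each quartile. The data are made up, the function name `outlier_bounds` is hypothetical, and quartiles are assumed to use the median-of-halves method. The last lines illustrate how one extreme value shifts the mean far more than the median.

```python
from statistics import mean, median


def outlier_bounds(values):
    """Return (lower, upper) boundaries: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])
    q3 = median(data[n // 2 + n % 2 :])
    spread = q3 - q1
    return q1 - 1.5 * spread, q3 + 1.5 * spread


# Hypothetical data set with one suspiciously large value.
data = [4, 5, 6, 6, 7, 8, 9, 30]
low, high = outlier_bounds(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)  # 1.0 13.0 [30]

# Effect of removing the outlier on each measure of centre.
trimmed = [x for x in data if x not in outliers]
print(mean(data), mean(trimmed))      # the mean drops noticeably
print(median(data), median(trimmed))  # the median barely moves
```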
Data types, misleading graphs and sampling
- Data types: data is quantitative (numerical, either discrete counts or continuous measurements) or qualitative (categories). The type guides which diagram and which average suit it.
- Misleading graphs: be alert to a vertical axis that does not start at zero (exaggerating differences), a histogram with unequal class widths where bar height rather than area is used to show frequency, or a truncated or distorted scale. Recognising these is examinable.
- Sampling: when you cannot survey a whole population you take a sample, and the method must avoid bias so that the sample represents the population.
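One way to keep a sample representative when subgroups differ in size is stratified sampling, where each group contributes in proportion to its size. The sketch below is illustrative only: the year-group sizes and the function name `stratified_sample` are hypothetical, and rounding means the total can differ slightly from the requested sample size when the proportions are not whole numbers.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Pick members at random from each stratum in proportion to its size.

    strata: dict mapping a group name to a list of its members.
    """
    rng = random.Random(seed)  # a seed makes the example repeatable
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional share of the sample, rounded to a whole number.
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical school: three year groups of different sizes.
school = {
    "S4": [f"S4-{i}" for i in range(200)],
    "S5": [f"S5-{i}" for i in range(150)],
    "S6": [f"S6-{i}" for i in range(50)],
}
chosen = stratified_sample(school, sample_size=40, seed=1)
print({name: len(members) for name, members in chosen.items()})
# {'S4': 20, 'S5': 15, 'S6': 5}
```

Within each year group the choice is still random, so every pupil in a group has the same chance of selection.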
Try this
Q1. A data set has and . Find the IQR and the upper outlier boundary. [2 marks]
- Cue. $\text{IQR} = Q_3 - Q_1$; upper boundary $= Q_3 + 1.5 \times \text{IQR}$.
Q2. Two teams' scores have medians and and IQRs and . Compare them in context. [2 marks]
- Cue. The second team scores higher on average (median ) and is more consistent (IQR ).
Q3. A school wants a sample reflecting its year groups, which differ in size. Name a suitable sampling method and why. [2 marks]
- Cue. Stratified sampling, because it samples each year group in proportion to its size, representing all groups fairly.
Exam-style practice questions
Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
SQA Higher Apps style, 5 marks
A data set has lower quartile , median and upper quartile . Find the interquartile range, and use the $1.5 \times \text{IQR}$ rule to decide whether a value of is an outlier.
The interquartile range is $\text{IQR} = Q_3 - Q_1$ (1 mark).
The upper outlier boundary is $Q_3 + 1.5 \times \text{IQR}$ (2 marks).
Since the value is greater than this boundary, it lies beyond the upper boundary, so it is an outlier (2 marks). Markers reward the correct IQR, the boundary, and a clear decision against that boundary.
SQA Higher Apps style, 4 marks
Two classes sit the same test. Class A has median and IQR ; class B has median and IQR . Compare the two distributions in context.
Class A has the higher median, against , so on average class A scored slightly higher (2 marks).
Class A has the smaller IQR, against , so class A's marks are more consistent and tightly clustered, while class B's marks are more spread out (2 marks). Markers reward one comparison of centre using the medians and one comparison of spread using the IQRs, both written in the context of test marks.