What does the Statistics content cover in OCR GCSE Maths?

It covers four areas: types of data and sampling (random and stratified sampling, and bias); the mean, median, mode and range, including from frequency tables and grouped data; statistical charts such as bar charts, pie charts, frequency polygons, box plots and histograms; and scatter graphs with correlation and the line of best fit. It rewards both calculation and clear interpretation.

Which statistics topics are Higher tier only in OCR maths?

Higher tier adds histograms with unequal class widths and frequency density, stratified sampling calculations, the interquartile range as a measure of spread, and more demanding interpretation including correlation versus causation. Foundation candidates still meet types of data, the three averages and the range, common charts, scatter graphs and the line of best fit, but not formal histograms with frequency density.

How do you find the mean from a frequency table?

Multiply each value by its frequency, add these products to get the total, and divide by the total frequency, which is the number of items. For grouped data, use the midpoint of each class as the representative value, which gives an estimate of the mean because the exact values within each class are not known. Dividing by the number of rows instead of the total frequency is a common error.
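The method above can be sketched in Python as a rough check on hand calculations. The data values and the function names `mean_from_frequency_table` and `estimated_mean_grouped` are invented for illustration; they are not taken from OCR material.

```python
def mean_from_frequency_table(table):
    """table: list of (value, frequency) pairs."""
    total = sum(value * freq for value, freq in table)  # sum of fx
    count = sum(freq for _, freq in table)              # sum of f (total frequency)
    return total / count                                # NOT divided by the number of rows


def estimated_mean_grouped(classes):
    """classes: list of ((lower, upper), frequency) pairs."""
    # Use each class midpoint as the representative value.
    midpoints = [((lower + upper) / 2, freq) for (lower, upper), freq in classes]
    return mean_from_frequency_table(midpoints)


# Illustrative data (assumed, not from a past paper)
goals = [(0, 3), (1, 5), (2, 2)]
print(mean_from_frequency_table(goals))   # 9 / 10

heights = [((0, 10), 4), ((10, 20), 6)]
print(estimated_mean_grouped(heights))    # (5*4 + 15*6) / 10
```

Dividing by `len(table)` instead of the total frequency would reproduce the common error described above.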

How does a histogram differ from a bar chart?

A bar chart is used for categorical data: its bars have equal width with gaps between them, and the height of each bar shows the frequency. A histogram is used for continuous grouped data: the bars touch, and the vertical axis shows frequency density, which is frequency divided by class width. In a histogram the area of each bar, not its height, represents the frequency, which matters when the class widths are unequal.
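To see why area rather than height carries the frequency, here is a small Python sketch. The three classes are made-up data and `frequency_density` is a hypothetical helper name.

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


# Illustrative classes with unequal widths but the same frequency (assumed data)
classes = [((0, 10), 20), ((10, 15), 20), ((15, 35), 20)]

for (lower, upper), freq in classes:
    width = upper - lower
    density = frequency_density(freq, lower, upper)
    area = density * width  # recovers the frequency
    print(f"{lower}-{upper}: width {width}, density {density}, area {area}")
```

All three bars have the same area because they have the same frequency, yet their heights differ because the class widths differ.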

What is the difference between correlation and causation?

Correlation means two variables tend to change together, shown by the trend of points on a scatter graph as positive, negative or none. Causation means one variable actually causes the change in the other. A correlation does not prove causation, because a third factor may influence both variables, or the real cause may act in the opposite direction. Exam questions often ask you to challenge a claimed causal link.

OCR GCSE Mathematics Statistics: a complete overview of sampling, averages, charts and correlation

A deep-dive OCR GCSE Mathematics guide to the Statistics content. Covers sampling and types of data, averages and spread, statistical charts and graphs, and scatter graphs and correlation, with the methods and exam patterns OCR repeats across Foundation and Higher tier.

Generated by Claude Opus 4.814 min readJ560 SUpdated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed

What the Statistics content demands

Statistics applies number and reasoning to data, and it rewards both accurate calculation and clear interpretation. The content runs from sampling and types of data through averages and spread to charts and scatter graphs. Because so many questions ask you to "describe", "compare" or "explain", this area carries a large share of the AO2 communication marks, and the data-handling contexts often test AO3 problem solving.

This guide walks through the four areas of the Statistics content and ties together the matching dot-point pages, each of which has its own practice questions.

Sampling and types of data

Data is qualitative (categories) or quantitative (numbers), and quantitative data is discrete (counted) or continuous (measured). A population is everyone of interest; a sample is a manageable subset. A random sample gives every member of the population an equal chance of being selected; a stratified sample takes the same fraction from each group, so groups are represented in proportion. A sample is biased if it is not representative of the population, and explaining precisely why earns the marks.
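As a minimal sketch of the "same fraction from each group" idea, the Python below allocates a stratified sample. The year-group sizes are invented and `stratified_allocation` is a hypothetical function name.

```python
from fractions import Fraction


def stratified_allocation(group_sizes, sample_size):
    """Apply the overall sampling fraction to every group."""
    population = sum(group_sizes.values())
    sampling_fraction = Fraction(sample_size, population)
    return {name: size * sampling_fraction for name, size in group_sizes.items()}


# Illustrative school (assumed data): 600 students, sample of 50
year_groups = {"Year 7": 120, "Year 8": 180, "Year 9": 300}

for name, number in stratified_allocation(year_groups, 50).items():
    print(name, number)
```

Using `Fraction` keeps the arithmetic exact. With less convenient numbers the results would not be whole, and they would then need rounding so that the group counts still add up to the sample size.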

Averages and spread

The mean is the total divided by the count, the median is the middle of the ordered data, the mode is the most common value, and the range (largest minus smallest) measures spread. From a frequency table, the mean is $\dfrac{\sum fx}{\sum f}$, where $f$ is the frequency of each value $x$; for grouped data, use class midpoints as $x$ to estimate it. Compare two data sets using both an average (location) and the range or, at Higher tier, the interquartile range (spread).
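For a raw list of values, these measures can be checked with Python's standard `statistics` module. This is only a sketch on made-up data, and `summarise` is a hypothetical helper name.

```python
import statistics


def summarise(data):
    """Return the three averages and the range for a list of numbers."""
    return {
        "mean": statistics.mean(data),      # total divided by count
        "median": statistics.median(data),  # middle of the ordered data
        "mode": statistics.mode(data),      # most common value
        "range": max(data) - min(data),     # largest minus smallest
    }


# Illustrative data (assumed): total 42 over 7 values; ordered 2, 3, 4, 7, 7, 9, 10
scores = [3, 7, 7, 2, 9, 4, 10]
print(summarise(scores))
```

The interquartile range is left out on purpose, because software packages use several different quartile conventions that may not match the method taught for the exam.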

Statistical charts and graphs

Bar charts compare categories; pie charts show proportions as angles out of $360^\circ$. Frequency polygons join class midpoints, stem-and-leaf diagrams keep the raw values, and box plots show the five-number summary. At Higher tier, histograms use frequency density ($\dfrac{\text{frequency}}{\text{class width}}$), so the area of each bar, not its height, represents the frequency, which matters for unequal class widths.
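The "angles out of $360^\circ$" rule for pie charts can be sketched as follows. The survey data is invented and `pie_angles` is a hypothetical function name.

```python
def pie_angles(frequencies):
    """Sector angle = frequency / total * 360 degrees."""
    total = sum(frequencies.values())
    return {category: 360 * freq / total for category, freq in frequencies.items()}


# Illustrative survey of 36 students (assumed data)
travel = {"walk": 12, "bus": 18, "car": 6}

angles = pie_angles(travel)
for category, angle in angles.items():
    print(category, angle)

print(sum(angles.values()))  # the sectors should add up to a full turn
```

Reading a pie chart reverses the calculation: divide the sector angle by 360 and multiply by the total.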

Scatter graphs and correlation

A scatter graph plots two variables together. Correlation is positive (both rise), negative (one rises as the other falls) or none. A line of best fit follows the trend and is used to estimate one variable from the other. Estimating within the data range (interpolation) is generally reliable; estimating beyond it (extrapolation) is not, because the trend may not continue. Correlation does not prove causation, because a third factor may drive both variables.
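At GCSE the line of best fit is drawn by eye. The sketch below swaps in a least-squares line purely to illustrate interpolation versus extrapolation, and that technique is an assumption of this example rather than the method the exam expects. The car data and the names `least_squares_line` and `estimate` are invented.

```python
def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = gradient*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def estimate(x, xs, gradient, intercept):
    """Estimate y at x and label whether x lies inside the observed range."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return gradient * x + intercept, kind


# Illustrative data (assumed): car age in years, value in thousands of pounds
ages = [1, 2, 3, 4, 5]
values = [11, 7, 5, 5, 2]

gradient, intercept = least_squares_line(ages, values)
print(gradient, intercept)                       # negative gradient: negative correlation
print(estimate(2.5, ages, gradient, intercept))  # inside the data range
print(estimate(8, ages, gradient, intercept))    # outside the data range
```

For an age of 8 years the line predicts a negative value, which shows concretely why extrapolation is unreliable.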

Check your knowledge

A mix of sampling, average, chart and correlation questions. Attempt them under timed conditions, then check against the solutions.

1. State whether shoe size is discrete or continuous data. (1 mark)
2. Find the median of $6, 2, 9, 4, 7, 2, 8$. (2 marks)
3. Find the mean of $5, 8, 8, 11$. (2 marks)
4. A school of $800$ takes a stratified sample of $40$. How many from a year group of $200$? (2 marks)
5. In a pie chart of $90$ people, a sector is $40^\circ$. How many people does it represent? (2 marks)
6. A histogram class $0 \le x < 5$ has frequency density $6$. Find the frequency. (2 marks)
7. Describe the correlation expected between a car's age and its value. (1 mark)
8. State whether estimating beyond the data range is interpolation or extrapolation. (1 mark)

Solutions

Step 1: Q1 - discrete versus continuous data

Shoe sizes come in fixed steps such as 5, 5.5, 6, 6.5. A shoe size cannot take a value between these steps, so it is discrete data, even though foot length itself is continuous.

Answer: discrete (it takes set values in steps).

Step 2: Q2 - finding the median

Order the values from smallest to largest. For 7 values, the middle is the 4th value:

Ordered: $2, 2, 4, 6, 7, 8, 9$; the middle (fourth) value is $6$.

Step 3: Q3 - calculating the mean

Add all the values and divide by how many there are:

$\dfrac{5 + 8 + 8 + 11}{4} = \dfrac{32}{4} = 8$

Step 4: Q4 - stratified sampling

The sampling fraction is the overall sample size divided by the total population. Apply that same fraction to the stratum to find how many to select from it:

Sampling fraction $= \dfrac{40}{800} = \dfrac{1}{20}$; number from the stratum $= \dfrac{1}{20} \times 200 = 10$.

Step 5: Q5 - reading a pie chart

Divide the total number of people by $360$ to find how many people each degree represents, then multiply by the sector angle:

Each degree $= \dfrac{90}{360} = 0.25$ people; sector of $40^\circ$ represents $40 \times 0.25 = 10$ people.

Step 6: Q6 - frequency from a histogram

On a histogram the area of a bar represents frequency. Multiply frequency density by the class width:

$\text{frequency} = \text{frequency density} \times \text{class width} = 6 \times 5 = 30$

Step 7: Q7 - describing correlation

As age increases, the car's value tends to decrease. The two variables move in opposite directions, giving negative correlation.

Answer: negative correlation (older cars are worth less).

Step 8: Q8 - prediction outside the data range

Using a line of best fit beyond the range of the collected data is extrapolation. It is unreliable because the trend may not continue outside the observed range.

Answer: extrapolation.

Sources & how we know this

OCR GCSE (9-1) Mathematics (J560) specification — OCR (2015)