
How are statistical techniques used to analyse geographical data and test relationships?

The statistical techniques used in geography (measures of central tendency and dispersion, percentage change, Spearman's rank correlation and significance testing) and how to calculate, apply and critically interpret them in geographical contexts.

An OCR A-Level Geography answer to the statistical skills embedded across all components, covering measures of central tendency (mean, median, mode) and dispersion (range, interquartile range, standard deviation), percentage change, Spearman's rank correlation and significance testing, with worked calculations in KaTeX and guidance on critical interpretation for AO3.

Generated by Claude Opus 4.8 | 12 min answer | Updated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

OCR embeds statistical skills across all components and the Independent Investigation. You need to calculate, apply and critically interpret measures of central tendency and dispersion, percentage change, Spearman's rank correlation and significance testing, in geographical contexts. These are tested as AO3 in Papers 01 and 02 and are central to analysing fieldwork data.

The answer

Central tendency and dispersion

The choice of measure matters geographically. The mean uses all values but is distorted by outliers (one huge value pulls it up), so for skewed data such as incomes or settlement sizes the median is often more representative. The interquartile range, $IQR = Q_3 - Q_1$, measures the spread of the middle $50\%$ and, like the median, resists outliers. The standard deviation, $\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$, measures the typical distance of values from the mean: a small $\sigma$ means values cluster tightly (a well-sorted beach, a uniform region), while a large $\sigma$ means they are widely spread. Reporting both a central value and a measure of spread gives a fuller picture than either alone.
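The effect of a single outlier on each measure can be sketched in a few lines of Python. This is an illustrative sketch only: the `summarise` helper and the data values are hypothetical, the quartiles use the standard library's "exclusive" method (which interpolates at ranks based on $n + 1$), and `pstdev` divides by $n$ as in the formula above.

```python
import statistics


def summarise(values):
    """Return mean, median, IQR and population standard deviation."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "sd": statistics.pstdev(values),  # divides by n, not n - 1
    }


# Hypothetical data set, then the same data with one extreme value added.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = sample + [50]

print(summarise(sample))
print(summarise(with_outlier))
```

Adding the single value of 50 doubles the mean (from 5 to 10) but moves the median only from 4.5 to 5, which is the sense in which the median resists outliers.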

Percentage change

Percentage change standardises change so that places or times of different sizes can be compared. It is calculated as

$$\text{percentage change} = \frac{\text{new value} - \text{original value}}{\text{original value}} \times 100$$

A positive result is an increase, a negative result a decrease. The strength of the measure is comparability: a rise of $50\,000$ people means very different things in a town of $100\,000$ ( $+50\%$ ) and a city of $5$ million ( $+1\%$ ). The caution is that percentage change can mislead from a small base (a doubling from $2$ to $4$ is $+100\%$ but trivial in absolute terms), so it should be read alongside absolute figures.
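As a minimal sketch, the calculation and the three cases just described can be written in Python. The function name `percentage_change` is an illustrative choice, not part of any specification.

```python
def percentage_change(original, new):
    """Percentage change from original to new (positive means an increase)."""
    if original == 0:
        raise ValueError("percentage change is undefined for an original value of zero")
    return (new - original) * 100 / original


# The same absolute rise of 50 000 people in two very different places.
print(percentage_change(100_000, 150_000))      # town: 50.0
print(percentage_change(5_000_000, 5_050_000))  # city: 1.0

# A small base exaggerates the change: 2 to 4 is +100%.
print(percentage_change(2, 4))                  # 100.0
```

Printing the absolute difference next to each percentage is a simple way to apply the caution about small bases.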

Spearman's rank correlation

Spearman's rank correlation coefficient ( $r_s$ ) tests the strength and direction of the relationship between two variables by ranking them. It is calculated as

$$r_s = 1 - \frac{6 \sum d^2}{n^3 - n}$$

where $d$ is the difference between the ranks of each paired observation and $n$ is the number of pairs. The result ranges from $+1$ (perfect positive correlation, both rise together) through $0$ (no correlation) to $-1$ (perfect negative correlation, one rises as the other falls). It is widely used in geography (for example testing whether pebble size decreases downstream, or whether deprivation correlates with distance from a city centre) because it works on ranked data and does not assume a normal distribution.
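The ranking and the formula can be sketched in Python as follows. This is a minimal sketch that assumes no tied values (ties need averaged ranks, which are omitted here); the helper names and the fieldwork figures are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need averaged ranks, which this sketch omits")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's rank coefficient: r_s = 1 - 6 * sum(d^2) / (n^3 - n)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    n = len(x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * sum_d_squared) / (n ** 3 - n)


# Hypothetical river data: distance downstream (km) and mean pebble size (mm).
distance_km = [1, 3, 5, 8, 12, 15]
pebble_mm = [92, 85, 88, 60, 41, 45]

print(round(spearman_rs(distance_km, pebble_mm), 3))  # -0.886
```

Here $\sum d^2 = 66$ and $n^3 - n = 210$, so $r_s = 1 - \frac{396}{210} \approx -0.89$, a strong negative relationship in this invented data set (pebble size falling downstream).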

Significance testing

A calculated $r_s$ must be checked for statistical significance to judge whether the relationship is real or could be due to chance. The calculated value is compared with a critical value from a significance table for the relevant sample size ($n$) and chosen significance level, commonly $0.05$ (the $95\%$ confidence level, meaning that a coefficient at least this strong would arise by chance no more than $5\%$ of the time if there were no real relationship). If the magnitude of the calculated $r_s$ (ignoring its sign) exceeds the critical value, the result is significant and the null hypothesis (that there is no relationship) is rejected; if it does not, the relationship cannot be distinguished from chance. Significance depends strongly on sample size: the same coefficient can be significant in a large sample but not in a small one, which is why fieldwork needs an adequate, justified sample.

Worked example: calculating Spearman's rank correlation

Five sites give paired ranks for distance downstream and pebble roundness, producing rank differences $d$ of $1, 0, -1, 1, -1$ . Calculate $r_s$ .

Square the rank differences

$d^2 = 1, 0, 1, 1, 1$ , so $\sum d^2 = 4$ . With $n = 5$ , $n^3 - n = 125 - 5 = 120$ .

Apply the formula

$r_s = 1 - \frac{6 \times 4}{120} = 1 - \frac{24}{120} = 1 - 0.2 = +0.8$ .

Interpret and qualify

$r_s = +0.8$ is a strong positive correlation, supporting the hypothesis that roundness increases downstream, but with only $n = 5$ it must be checked against the critical value (high for small samples), so it may not be significant. The causation caveat applies: the result is consistent with attrition rounding pebbles downstream but does not on its own prove that mechanism.
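The "may not be significant" caveat can be explored by brute force rather than with a printed table. The sketch below is an alternative illustration, not the table-lookup method described above: it lists every possible ordering of the ranks and counts how many give a $\sum d^2$ at least as small as the observed one (a one-tailed test for a predicted positive relationship). The function name is hypothetical, and the approach is practical only for small $n$ because the number of orderings is $n!$.

```python
from itertools import permutations


def exact_one_tailed_p(n, observed_sum_d_squared):
    """Share of all n! rank orderings with sum(d^2) <= the observed value.

    A small sum(d^2) means a strong positive r_s, so this is the chance of a
    positive correlation at least as strong as the one observed if the ranks
    were paired at random.
    """
    base = range(1, n + 1)
    total = 0
    at_least_as_strong = 0
    for perm in permutations(base):
        total += 1
        sum_d_squared = sum((a - b) ** 2 for a, b in zip(base, perm))
        if sum_d_squared <= observed_sum_d_squared:
            at_least_as_strong += 1
    return at_least_as_strong / total


# Worked example above: n = 5 and sum(d^2) = 4, giving r_s = +0.8.
p = exact_one_tailed_p(5, 4)
print(round(p, 3))  # 0.067, i.e. 8 of the 120 possible orderings
```

Because $0.067$ is greater than $0.05$, random pairing of five sites would produce a coefficient of $+0.8$ or stronger too often for the result to count as significant at the $0.05$ level, which is why a larger sample is needed.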

Examples in context

Example 1. Testing the Bradshaw model with Spearman's rank. A river study might rank measured values of, say, velocity against distance downstream at several sites and compute $r_s$ to test the predicted positive relationship. A strong positive coefficient, checked against the critical value for the sample size, supports the model, while a non-significant result prompts reflection on sample size or confounding factors. This shows statistical testing applied to a physical-geography hypothesis and is a classic Independent Investigation technique.

Example 2. Comparing deprivation with standard deviation. Comparing two cities' neighbourhood deprivation scores, the mean might be similar, but a larger standard deviation in one city reveals greater inequality (deprivation more polarised between rich and poor areas), a difference the mean alone hides. This demonstrates why dispersion matters as much as central tendency in human geography, and links to the Changing Spaces; Making Places analysis of uneven places.

Try this

Q1. State the formula for percentage change. [2 marks]

Cue. $\frac{\text{new value} - \text{original value}}{\text{original value}} \times 100$ .

Q2. A Spearman's rank coefficient of $-0.85$ is calculated, and its magnitude exceeds the critical value at the $0.05$ level. Interpret this result. [3 marks]

Cue. A strong negative correlation (as one variable rises, the other falls) that is statistically significant at the $95\%$ confidence level, so the null hypothesis of no relationship is rejected; note that correlation does not prove causation.

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR H481/01 (style), 4 marks

A data set of ten beach pebble sizes (mm) is provided. Calculate the median and the interquartile range, showing your working.

Worked answer

A low-tariff AO3 calculation. Reward the correct method. To find the median, rank the values and take the middle: for $n = 10$ the median is the mean of the 5th and 6th ranked values. The interquartile range (IQR) is the upper quartile minus the lower quartile, $IQR = Q_3 - Q_1$ , where $Q_1$ is the value at rank $\frac{n+1}{4}$ and $Q_3$ at rank $\frac{3(n+1)}{4}$ (interpolating between ranks as needed).
The strongest answers show clear working and then interpret: the median resists distortion by extreme pebbles (unlike the mean), and the IQR measures the spread of the middle $50\%$ , so a small IQR indicates a well-sorted beach. Reward correct ranking, the right quartile positions, and a sentence of geographical interpretation rather than a bare number.
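For $n = 10$ the rank positions work out as follows. This is a sketch using the $(n+1)$ convention stated above; the notation $x_{(k)}$ for the $k$th value in ranked order is introduced here for illustration, and other quartile conventions give slightly different positions.

$$
\begin{aligned}
\text{median} &= \tfrac{1}{2}\left(x_{(5)} + x_{(6)}\right) \\
Q_1 &\text{ at rank } \tfrac{10 + 1}{4} = 2.75, \text{ so } Q_1 = x_{(2)} + 0.75\left(x_{(3)} - x_{(2)}\right) \\
Q_3 &\text{ at rank } \tfrac{3(10 + 1)}{4} = 8.25, \text{ so } Q_3 = x_{(8)} + 0.25\left(x_{(9)} - x_{(8)}\right) \\
IQR &= Q_3 - Q_1
\end{aligned}
$$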

OCR H481/02 (style), 6 marks

A Spearman's rank correlation coefficient of +0.78 is calculated between two variables. Explain what this shows and how its significance would be tested.

Worked answer

A medium-tariff AO3 question on correlation and significance. Reward candidates who interpret the coefficient: Spearman's rank $r_s$ ranges from $+1$ (perfect positive) through $0$ (none) to $-1$ (perfect negative), so $+0.78$ indicates a strong positive relationship (as one variable rises, so does the other). For significance, the calculated $r_s$ is compared with a critical value from a table at a chosen significance level (commonly $0.05$ , the $95\%$ confidence level) for the sample size ( $n$ ); if $r_s$ exceeds the critical value, the relationship is statistically significant and unlikely to be due to chance, so the null hypothesis (no relationship) is rejected.
The strongest answers stress that correlation does not prove causation, that a third factor may drive both variables, and that significance depends on sample size. Reward correct interpretation, the comparison-with-critical-value method, and the causation caveat.

Sources & how we know this

OCR A-Level Geography (H481) specification — OCR (2016)