How do you test whether categorical data fit a model or whether two categorical variables are associated?

Carry out the chi-squared goodness-of-fit test and the chi-squared test for association in a contingency table, computing expected frequencies, the chi-squared statistic and degrees of freedom, and interpreting the result against the assumptions.

A focused answer to the SQA Advanced Higher Statistics chi-squared content: the goodness-of-fit test and the test for association in a contingency table, computing expected frequencies, the chi-squared statistic and degrees of freedom, the minimum expected frequency rule, and interpreting the outcome.

Generated by Claude Opus 4.813 min answer. Updated 2026-06-16

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section

What this dot point is asking
The chi-squared statistic
The goodness-of-fit test
The contingency table test for association
The expected-frequency assumption
Try this

What this dot point is asking

When the data are counts in categories rather than measurements, the chi-squared family is the tool. The SQA wants you to run two tests: a goodness-of-fit test (do observed category counts match a proposed model?) and a test for association in a contingency table (are two categorical variables independent?). For each you compute expected frequencies, the chi-squared statistic and the degrees of freedom, then judge against the assumptions.

The chi-squared statistic

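Both tests share the same statistic, which measures the total relative discrepancy between observed and expected counts: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $O$ is an observed count, $E$ is the corresponding expected count, and the sum runs over every category or cell.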

Because every term is squared and divided by the expected count, a cell where the observed count is far from expected contributes a lot, while a close match contributes almost nothing.
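As a minimal sketch of how the individual terms behave, the Python below computes each cell's contribution $\dfrac{(O - E)^2}{E}$ and then the total. The function name `chi_squared_contributions` and the counts are hypothetical, chosen only for illustration; they do not come from an SQA paper.

```python
def chi_squared_contributions(observed, expected):
    """Return each cell's (O - E)^2 / E term."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical counts, chosen only for illustration.
observed = [35, 21, 44]
expected = [20, 30, 50]

contributions = chi_squared_contributions(observed, expected)
chi_squared = sum(contributions)

for o, e, term in zip(observed, expected, contributions):
    print(f"O={o:3d}  E={e:3d}  contribution={term:.3f}")
print(f"chi-squared = {chi_squared:.3f}")
```

The first cell is furthest from its expected count and has the smallest expected count, so it supplies most of the total; the third cell is close to its expected count and adds very little.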

The goodness-of-fit test

This tests whether observed counts across categories are consistent with a proposed distribution.

Run a goodness-of-fit test

Find the expected frequencies

A bag is claimed to hold colours in ratio $2:1:1$ . For $n = 80$ draws, the probabilities are $\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}$ , giving expected counts $40, 20, 20$ .

Compute the chi-squared statistic

With observed $45, 18, 17$ : $\chi^2 = \dfrac{(45 - 40)^2}{40} + \dfrac{(18 - 20)^2}{20} + \dfrac{(17 - 20)^2}{20} = \dfrac{25}{40} + \dfrac{4}{20} + \dfrac{9}{20} = 0.625 + 0.2 + 0.45 = 1.275$ .

Compare on the right degrees of freedom

Three categories, no parameters estimated, so $3 - 1 = 2$ degrees of freedom. As $1.275$ is small (well below $\chi^2_{0.05,2} = 5.99$ ), do not reject the claimed ratio.
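The three steps of this worked example can be checked with a short Python sketch using only the standard library. It relies on a shortcut that applies only when there are exactly $2$ degrees of freedom: the upper-tail probability is then $e^{-x/2}$, so the $5\%$ critical value is $-2\ln(0.05)$. In an exam you would read the critical value from the tables instead; the variable names here are illustrative.

```python
import math

ratio = [2, 1, 1]          # claimed colour ratio from the example
observed = [45, 18, 17]    # observed counts from the example
n = sum(observed)          # 80 draws

# Expected count = n * (this colour's share of the ratio)
expected = [n * r / sum(ratio) for r in ratio]

chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1     # no parameters estimated from the data

# Shortcut valid ONLY for 2 degrees of freedom:
# P(X > x) = exp(-x / 2), so the 5% critical value is -2 * ln(0.05).
critical_value = -2 * math.log(0.05)
p_value = math.exp(-chi_squared / 2)

reject = chi_squared > critical_value
print("expected:", expected)
print("chi-squared:", round(chi_squared, 3), "df:", df)
print("critical value:", round(critical_value, 2), "p-value:", round(p_value, 3))
print("reject H0:", reject)
```

The computed critical value agrees with the tabulated $5.99$, and the statistic falls well short of it, matching the conclusion above.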

The contingency table test for association

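This tests whether two categorical variables are independent. Under $H_0$ (independence), the expected frequency of each cell is $E = \dfrac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$, and an $r \times c$ table has $(r - 1)(c - 1)$ degrees of freedom.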
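A minimal Python sketch of the calculation follows. It builds each expected frequency as row total times column total divided by grand total, sums the $\dfrac{(O - E)^2}{E}$ terms over every cell, and counts $(r - 1)(c - 1)$ degrees of freedom. The $2 \times 3$ table and the function names are hypothetical, made up for illustration.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_association(table):
    """Return (statistic, degrees of freedom, expected table)."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical 2 x 3 table of observed counts (rows = groups, columns = responses).
table = [
    [20, 30, 10],
    [10, 30, 20],
]

statistic, df, expected = chi_squared_association(table)
print("expected:", expected)
print("chi-squared:", round(statistic, 3), "df:", df)
```

For this made-up table the statistic is about $6.67$ on $2$ degrees of freedom, which exceeds $\chi^2_{0.05,2} = 5.99$, so the test would give evidence of association at the $5\%$ level.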

The expected-frequency assumption

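The chi-squared distribution is only an approximation to the true distribution of the statistic, and it relies on the expected counts not being too small. The usual rule of thumb is that every expected frequency should be at least $5$; where one is smaller, combine it with a neighbouring category before computing the statistic, and count the degrees of freedom from the number of categories that remain. Note that the rule applies to the expected counts, not the observed ones.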
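One way to handle small expected counts is to pool neighbouring categories before computing the statistic. The Python below is a sketch under two stated assumptions: the threshold is the common rule of thumb of $5$ (check the wording your course uses), and the categories are ordered so that merging neighbours makes sense. The function `pool_small_categories` and the counts are hypothetical.

```python
def pool_small_categories(observed, expected, minimum=5):
    """Merge adjacent categories, working from the right-hand end, until
    every expected count is at least `minimum` (assumed rule of thumb).
    Returns new (observed, expected) lists; the inputs are not modified."""
    obs, exp = list(observed), list(expected)
    i = len(exp) - 1
    while i > 0:
        if exp[i] < minimum:
            # Fold category i into its left-hand neighbour.
            obs[i - 1] += obs[i]
            exp[i - 1] += exp[i]
            del obs[i], exp[i]
        i -= 1
    # The first category may still be too small: fold it into the next one.
    if len(exp) > 1 and exp[0] < minimum:
        obs[1] += obs[0]
        exp[1] += exp[0]
        del obs[0], exp[0]
    return obs, exp


# Hypothetical counts for six ordered categories.
observed = [12, 20, 15, 6, 2, 1]
expected = [10.5, 19.0, 16.5, 7.0, 2.0, 1.0]

pooled_obs, pooled_exp = pool_small_categories(observed, expected)
df = len(pooled_exp) - 1   # counted AFTER pooling, no parameters estimated

print("pooled observed:", pooled_obs)
print("pooled expected:", pooled_exp)
print("degrees of freedom:", df)
```

Here the last three categories collapse into one, so the test runs on four categories with $3$ degrees of freedom rather than six categories with $5$.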

Try this

Q1. A goodness-of-fit test compares observed counts across $5$ categories against a fully specified model. State the degrees of freedom. [1 mark]

Cue. No parameters are estimated, so degrees of freedom $= 5 - 1 = 4$ .

Q2. In a contingency table, a cell has row total $60$ , column total $30$ and grand total $180$ . Find its expected frequency. [1 mark]

Cue. $E = \dfrac{60 \times 30}{180} = \dfrac{1800}{180} = 10$ .

Exam-style practice questions

Practice questions written in the style of SQA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AH style: goodness of fit (4 marks)

A die is rolled $60$ times with frequencies $8, 9, 11, 10, 13, 9$ for faces $1$ to $6$. Test at the $5\%$ level whether the die is fair. (Use $\chi^2_{0.05, 5} = 11.07$.)

Hypotheses: $H_0$ that the die is fair (each face equally likely) against $H_1$ that it is not (1 mark).

Each expected frequency is $\dfrac{60}{6} = 10$ . Compute $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ : $\dfrac{4}{10} + \dfrac{1}{10} + \dfrac{1}{10} + \dfrac{0}{10} + \dfrac{9}{10} + \dfrac{1}{10} = \dfrac{16}{10} = 1.6$ (2 marks).

Degrees of freedom $= 6 - 1 = 5$ ; since $1.6 < 11.07$ , do not reject $H_0$ : there is insufficient evidence at the $5\%$ level that the die is unfair (1 mark). Markers reward the hypotheses, the expected frequencies, the chi-squared statistic and the conclusion with degrees of freedom.

AH style: contingency (3 marks)

A $3 \times 2$ contingency table is tested for association. State how to find an expected frequency and the degrees of freedom for the test.

The expected frequency for a cell is $E = \dfrac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$ , computed under $H_0$ that the two variables are independent (1 mark).

Degrees of freedom for an $r \times c$ table are $(r - 1)(c - 1)$ ; here $(3 - 1)(2 - 1) = 2$ (1 mark).

The statistic $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ is then compared with the critical value on $2$ degrees of freedom; a large value gives evidence of association (1 mark). Markers reward the expected-frequency formula, the degrees of freedom and the test structure.

Sources & how we know this

SQA Advanced Higher Statistics Course Specification (C803 77) — SQA (2023)
SQA Advanced Higher Statistics Data Booklet — SQA (2019)