How do you test whether data fit a distribution or whether two factors are independent, and what non-parametric tests are available?

The chi-squared goodness-of-fit test and contingency table test for independence, degrees of freedom, and non-parametric tests including the sign test and Wilcoxon signed-rank test.

A focused answer to the OCR A-Level Further Mathematics A Statistics option content on chi-squared and non-parametric tests, covering the chi-squared goodness-of-fit test and the contingency table test for independence with their degrees of freedom and expected frequencies, and non-parametric tests including the sign test and the Wilcoxon signed-rank test.

Generated by Claude Opus 4.812 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. The chi-squared test statistic
  3. Goodness-of-fit test
  4. Contingency tables and independence
  5. Non-parametric tests
  6. Try this

What this dot point is asking

OCR's Statistics option expects you to do three things. First, carry out a chi-squared goodness-of-fit test, comparing observed frequencies with the frequencies expected under a proposed distribution. Second, carry out a chi-squared contingency table test for independence. In both cases you must compute the expected frequencies and use the correct degrees of freedom. Third, apply the non-parametric tests (the sign test and the Wilcoxon signed-rank test) when a particular distribution cannot be assumed.

The chi-squared test statistic

Both chi-squared tests use the same statistic, which measures how far the observed frequencies stray from what the null hypothesis predicts. The larger the value, the poorer the fit; the null hypothesis is rejected when the statistic exceeds the critical value for the relevant degrees of freedom.
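
Written out, the statistic is the one quoted in the Try this cues on this page. Here $O_i$ and $E_i$ are the observed and expected frequencies in cell $i$; the index $i$ is notation added for clarity and does not appear in the page's own formula.

$$\chi^2 = \sum_{i} \dfrac{(O_i - E_i)^2}{E_i}$$

Each term is non-negative and is zero only when the observed and expected frequencies for that cell agree exactly, so $\chi^2 = 0$ corresponds to a perfect fit.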

Goodness-of-fit test

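A goodness-of-fit test checks whether data are consistent with a proposed distribution (uniform, binomial, Poisson and so on). The expected frequencies come from the proposed model, scaled to the same total as the observed data. The degrees of freedom are the number of cells (after combining any cells whose expected frequency is below 5, the usual convention), minus one for the constraint that the totals agree, minus one more for each parameter estimated from the data.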
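
The calculation can be sketched in a few lines of Python. This is a minimal illustration, not an exam method: the function names `chi_squared_statistic` and `gof_degrees_of_freedom` are made up for this example, and the data are the die frequencies from the first exam-style practice question on this page.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def gof_degrees_of_freedom(n_cells, n_estimated_params=0):
    """Cells, minus 1 for the total, minus 1 per parameter estimated from the data."""
    return n_cells - 1 - n_estimated_params


# Die rolled 120 times; a fair die predicts 120 / 6 = 20 for each face.
observed = [18, 22, 17, 25, 20, 18]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_squared_statistic(observed, expected)
dof = gof_degrees_of_freedom(len(observed))  # no parameters estimated

print(round(statistic, 4), dof)
```

If the model had a parameter estimated from the data (for example a Poisson mean taken from the sample), the same six cells would give `gof_degrees_of_freedom(6, 1)`, that is 4 degrees of freedom.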

Contingency tables and independence

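A contingency table test checks whether two classifications (say treatment and outcome) are independent. Under the null hypothesis of independence, the expected frequency for each cell is computed from the marginal totals as (row total × column total) ÷ grand total. A table with $r$ rows and $c$ columns has $(r - 1)(c - 1)$ degrees of freedom.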
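
A short Python sketch of the expected-frequency calculation follows. The function names and the 2 × 3 table are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def expected_frequencies(table):
    """Expected count per cell under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def contingency_chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for an r x c table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical observed counts: 2 rows, 3 columns.
table = [[20, 30, 50],
         [30, 20, 50]]

print(expected_frequencies(table))
print(contingency_chi_squared(table))
```

In this made-up table every row total is 100 and the column totals are 50, 50 and 100, so the expected counts are 25, 25 and 50 in each row. Only the four cells in the first two columns contribute to the statistic.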

Non-parametric tests

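When the data cannot be assumed to follow a particular distribution, non-parametric tests work from ranks or signs instead. The sign test tests a hypothesised median by counting how many observations fall above and below it; under the null hypothesis each observation is equally likely to fall on either side, so the count above the median is binomial with $p = \frac{1}{2}$. The Wilcoxon signed-rank test is more powerful because it uses the sizes of the differences as well as their signs, although it additionally assumes the distribution is symmetric. It ranks the absolute differences from the hypothesised value, sums the ranks of the positive differences and of the negative differences, and compares the relevant rank-sum with a critical value.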
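
Both procedures can be sketched in Python using only the standard library. The sample, the hypothesised median of 15 and the function names are invented for illustration. The sketch assumes no two absolute differences are tied (tied ranks are not averaged here) and simply drops any observation equal to the hypothesised median.

```python
from math import comb


def sign_test_p_value(data, median0):
    """One-tailed p-value: P(X <= smaller sign count) with X ~ B(n, 1/2)."""
    diffs = [x - median0 for x in data if x != median0]  # drop zero differences
    n = len(diffs)
    plus = sum(1 for d in diffs if d > 0)
    k = min(plus, n - plus)
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


def wilcoxon_rank_sums(data, median0):
    """Return (W+, W-): rank sums of positive and negative differences.

    Assumes no ties among the absolute differences.
    """
    diffs = [x - median0 for x in data if x != median0]
    ordered = sorted(diffs, key=abs)  # rank 1 = smallest absolute difference
    w_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    w_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    return w_plus, w_minus


# Hypothetical sample and hypothesised median.
sample = [21, 14, 28, 19, 32, 17, 27, 25]

print(sign_test_p_value(sample, 15))
print(wilcoxon_rank_sums(sample, 15))
```

Here seven observations lie above 15 and one below, so the sign test probability is $P(X \le 1)$ for $X \sim B(8, \frac{1}{2})$, which is $\frac{9}{256}$. The single negative difference is also the smallest in size, so it takes rank 1 and the rank sums are 35 and 1, adding to $\frac{8 \times 9}{2} = 36$. In an exam the smaller rank sum would be compared with the tabulated critical value rather than computed this way.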

The chi-squared and non-parametric tests complete the Statistics option, drawing on the distributions (for expected frequencies) and on summation.

Try this

Q1. State the chi-squared test statistic formula. [1 mark]

  • Cue. $\chi^2 = \displaystyle\sum \dfrac{(O - E)^2}{E}$.

Q2. A contingency table has 4 rows and 3 columns. State the degrees of freedom. [1 mark]

  • Cue. $(4 - 1)(3 - 1) = 3 \times 2 = 6$.

Exam-style practice questions

Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

OCR 2018, 6 marks
A die is rolled 120 times, giving frequencies 18, 22, 17, 25, 20, 18 for the faces 1 to 6. Carry out the calculation of the chi-squared test statistic for the hypothesis that the die is fair, and state the degrees of freedom.

Under a fair die the expected frequency for each face is $\dfrac{120}{6} = 20$ (M1).

The test statistic is $\chi^2 = \displaystyle\sum \dfrac{(O - E)^2}{E}$ (M1): $= \dfrac{(18-20)^2}{20} + \dfrac{(22-20)^2}{20} + \dfrac{(17-20)^2}{20} + \dfrac{(25-20)^2}{20} + \dfrac{(20-20)^2}{20} + \dfrac{(18-20)^2}{20}$ (A1).

Compute (A1): $= \dfrac{4 + 4 + 9 + 25 + 0 + 4}{20} = \dfrac{46}{20} = 2.3$ (A1).

Degrees of freedom $= 6 - 1 = 5$ (A1).

Markers reward the expected frequencies, the test-statistic formula, the calculation, the value 2.3, and the degrees of freedom.

OCR 2022, 6 marks
In a contingency table with 3 rows and 4 columns, state the degrees of freedom. A cell has observed frequency 30; its row total is 90, its column total is 80, and the grand total is 240. Find the expected frequency for that cell and its contribution to the chi-squared statistic.

Degrees of freedom for an $r \times c$ table is $(r - 1)(c - 1)$ (M1): $(3 - 1)(4 - 1) = 2 \times 3 = 6$ (A1).

Expected frequency $= \dfrac{\text{row total} \times \text{column total}}{\text{grand total}}$ (M1): $= \dfrac{90 \times 80}{240} = \dfrac{7200}{240} = 30$ (A1).

Contribution $\dfrac{(O - E)^2}{E} = \dfrac{(30 - 30)^2}{30} = 0$ (M1, A1).

Markers reward the degrees-of-freedom formula, the expected-frequency formula, the value 30, and the zero contribution (observed equals expected here).