How do you test whether data fit a distribution or whether two factors are independent, and what non-parametric tests are available?
The chi-squared goodness-of-fit test and contingency table test for independence, degrees of freedom, and non-parametric tests including the sign test and Wilcoxon signed-rank test.
A focused answer to the OCR A-Level Further Mathematics A Statistics option content on chi-squared and non-parametric tests, covering the chi-squared goodness-of-fit test and the contingency table test for independence with their degrees of freedom and expected frequencies, and non-parametric tests including the sign test and the Wilcoxon signed-rank test.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
OCR's Statistics option wants you to carry out a chi-squared goodness-of-fit test (comparing observed and expected frequencies for a proposed distribution) and a chi-squared contingency table test for independence, to compute expected frequencies and the correct degrees of freedom, and to apply non-parametric tests, the sign test and the Wilcoxon signed-rank test, where distributional assumptions cannot be made.
The chi-squared test statistic
Both chi-squared tests use the same statistic, which measures how far the observed frequencies stray from what a hypothesis predicts. A large value means a poor fit and leads to rejecting the null hypothesis.
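For reference, the statistic in its usual form is given below. The symbols are a notational assumption rather than something taken from the text above: $O_i$ is the observed frequency and $E_i$ the expected frequency in cell $i$, with the sum taken over all cells.

```latex
X^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

Each term is non-negative and is zero only when the observed and expected frequencies agree exactly, which is why only large values of $X^2$ count as evidence against the null hypothesis.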
Goodness-of-fit test
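A goodness-of-fit test checks whether data are consistent with a proposed distribution (uniform, binomial, Poisson and so on). The expected frequencies come from the proposed model: each is the total frequency multiplied by the probability the model gives to that cell. The degrees of freedom are the number of cells, minus one for the constraint that the expected frequencies have the same total as the observed ones, minus one more for each parameter estimated from the data.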
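As a minimal sketch of the goodness-of-fit calculation, the Python below computes the statistic and the degrees of freedom for a uniform model. The function names and the weekday call counts are hypothetical, invented purely for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def gof_degrees_of_freedom(n_cells, n_estimated_params=0):
    """Cells, minus 1 for the total, minus 1 per parameter estimated from the data."""
    return n_cells - 1 - n_estimated_params


# Hypothetical data (invented for illustration): calls received on five weekdays.
observed = [18, 25, 20, 22, 15]

# Null hypothesis: calls are uniform across the days, so each expected
# frequency is total / number of cells.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

statistic = chi_squared_statistic(observed, expected)
df = gof_degrees_of_freedom(len(observed))  # no parameters estimated

print(f"X^2 = {statistic:.3f} on {df} degrees of freedom")
```

For a model with a fitted parameter, such as a Poisson distribution whose mean is estimated from the sample, `gof_degrees_of_freedom(6, 1)` gives 4 for six cells.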
Contingency tables and independence
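A contingency table test checks whether two classifications (say treatment and outcome) are independent. The expected frequency for each cell is calculated on the assumption of independence from the marginal totals: row total times column total, divided by the grand total. A table with r rows and c columns has (r - 1)(c - 1) degrees of freedom.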
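The expected frequencies for a whole table can be sketched in a few lines of Python. The function names and the 2 by 3 table of counts are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def independence_degrees_of_freedom(table):
    """(rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2 x 3 table (invented for illustration):
# rows are two treatments, columns are three outcomes.
table = [
    [20, 30, 50],
    [30, 30, 40],
]

expected = expected_frequencies(table)

# Sum the contribution (O - E)^2 / E from every cell.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = independence_degrees_of_freedom(table)

print(expected)
print(f"X^2 = {statistic:.3f} on {df} degrees of freedom")
```

Both rows here have the same total, so they share the same expected frequencies (25, 30 and 45). The middle column contributes nothing to the statistic because its observed and expected values coincide.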
Non-parametric tests
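When the data cannot be assumed to follow a particular distribution, non-parametric tests work from signs or ranks instead. The sign test tests a hypothesised median by counting how many observations fall above and below it; under the null hypothesis each observation is equally likely to fall on either side, so the count follows a binomial distribution with probability 0.5. The Wilcoxon signed-rank test is more powerful because it also uses the sizes of the differences, though it requires the additional assumption that the distribution is symmetric. It ranks the absolute differences from the hypothesised value, sums the ranks of the positive and of the negative differences, and compares the relevant rank sum with a critical value from tables.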
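Both procedures can be sketched in plain Python. The function names and the sample of eight values are hypothetical, and the sketch makes two common assumptions that should be checked against the course conventions: observations equal to the hypothesised median are discarded, and tied absolute differences share the average of the ranks they occupy.

```python
from math import comb


def sign_test(data, median0):
    """Return (count above, count below, binomial tail probability).

    The tail is P(X <= smaller count) for X ~ B(n, 0.5), a one-tailed value.
    """
    diffs = [x - median0 for x in data if x != median0]
    n = len(diffs)
    plus = sum(d > 0 for d in diffs)
    minus = n - plus
    smaller = min(plus, minus)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return plus, minus, tail


def wilcoxon_signed_rank(data, median0):
    """Return (W_plus, W_minus), the rank sums of positive and negative differences."""
    diffs = sorted((x - median0 for x in data if x != median0), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(diffs) and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical sample (invented for illustration), testing a median of 50.
sample = [52, 47, 58, 61, 49, 55, 63, 57]

plus, minus, tail = sign_test(sample, 50)
w_plus, w_minus = wilcoxon_signed_rank(sample, 50)

print(f"sign test: {plus} above, {minus} below, tail probability {tail:.4f}")
print(f"Wilcoxon: W+ = {w_plus}, W- = {w_minus}")
```

A useful check is that the two rank sums add up to n(n + 1)/2, where n is the number of non-zero differences. The sign test sees only that two values lie below 50, whereas the signed-rank test also registers that those two differences are among the smallest in size.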
The chi-squared and non-parametric tests complete the Statistics option, drawing on the distributions (for expected frequencies) and on summation.
Try this
Q1. State the chi-squared test statistic formula. [1 mark]
- Cue. $X^2 = \sum \frac{(O - E)^2}{E}$.
Q2. A contingency table has $r$ rows and $c$ columns. State the degrees of freedom. [1 mark]
- Cue. $(r - 1)(c - 1)$.
Exam-style practice questions
Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
OCR 2018, 6 marks. A die is rolled times, giving frequencies for the faces to . Carry out the calculation of the chi-squared test statistic for the hypothesis that the die is fair, and state the degrees of freedom.
Under a fair die the expected frequency for each face is the total number of rolls divided by 6 (M1).
The test statistic is $X^2 = \sum \frac{(O - E)^2}{E}$ (M1): (A1).
Compute (A1): (A1).
Degrees of freedom: $6 - 1 = 5$ (A1).
Markers reward the expected frequencies, the test-statistic formula, the calculation, the value , and the degrees of freedom.
OCR 2022, 6 marks. In a contingency table with rows and columns, state the degrees of freedom. A cell has observed frequency ; its row total is , its column total is , and the grand total is . Find the expected frequency for that cell and its contribution to the chi-squared statistic.
Degrees of freedom for an $r \times c$ table is $(r - 1)(c - 1)$ (M1): (A1).
Expected frequency $= \frac{\text{row total} \times \text{column total}}{\text{grand total}}$ (M1): (A1).
Contribution $\frac{(O - E)^2}{E} = 0$ (M1, A1).
Markers reward the degrees-of-freedom formula, the expected-frequency formula, the value , and the zero contribution (observed equals expected here).
Related dot points
- Discrete random variables, the probability distribution, expectation and variance, and the effect of a linear transformation aX + b on the mean and variance.
A focused answer to the OCR A-Level Further Mathematics A Statistics option content on discrete random variables, covering the probability distribution and the condition that probabilities sum to one, the expectation E(X) and variance Var(X), the computational formula for variance, and the effect of a linear transformation aX plus b on the mean and variance.
- Continuous random variables, the probability density function and cumulative distribution function, finding probabilities by integration, and the expectation and variance of a continuous variable.
A focused answer to the OCR A-Level Further Mathematics A Statistics option content on continuous random variables, covering the probability density function and the condition that it integrates to one, finding probabilities by integration, the cumulative distribution function and its relationship to the pdf, and the expectation and variance of a continuous variable.
- The Poisson distribution and its conditions, mean and variance, the sum of independent Poisson variables, the geometric distribution and its mean, and the Poisson approximation to the binomial.
A focused answer to the OCR A-Level Further Mathematics A Statistics option content on the Poisson and geometric distributions, covering the Poisson model and its conditions, its mean and variance both equal to lambda, the sum of independent Poisson variables, the geometric distribution for the number of trials to the first success and its mean, and the Poisson approximation to the binomial.
- The standard results for the sum of r, r squared and r cubed, using them to sum polynomial expressions in r, splitting sums by linearity, and adjusting limits.
A focused answer to the OCR A-Level Further Mathematics A content on the summation of series, covering the standard formulae for the sum of r, r squared and r cubed, using linearity to split a sum of a polynomial in r, evaluating the resulting expression, and adjusting the limits when a sum does not start at one.
- Proof by mathematical induction for summation formulae, divisibility results, recurrence relations and powers of matrices, with a correctly stated base case, inductive hypothesis, inductive step and conclusion.
A focused answer to the OCR A-Level Further Mathematics A content on proof by mathematical induction, covering the structure (base case, inductive hypothesis, inductive step and conclusion) and its use for summation formulae, divisibility results, recurrence relations and powers of a matrix, with the rigorous wording examiners require.