How do you control variables and remove bias so an experiment is a fair test?
Identifying and controlling extraneous variables, control groups and matched pairs, sources of bias, sensitivity of content, and the random response technique for sensitive questions.
A focused answer to Edexcel GCSE Statistics on controlling variables and bias, covering explanatory and extraneous variables, control groups and matched pairs, sources of bias, sensitivity of content, and the random response technique for sensitive questions at Higher tier.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel codes 1d.03 and 1d.07 require you to identify factors that lead to bias (including the sensitivity of the subject matter), to know how to minimise distortion, and to understand the importance of identifying and controlling extraneous variables. At Higher tier this extends to using control groups, the advantage of matched pairs, and the random response technique for sensitive questions. The big idea is a fair test: control everything except the variable being investigated.
Explanatory, response and extraneous variables
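The explanatory variable is the one you deliberately change; the response variable is the one you measure to see the effect. In an experiment on fertiliser and plant growth, the fertiliser is the explanatory variable and growth is the response variable. Extraneous variables are any other factors that could affect the response (light, water, soil, temperature), so they must be kept the same for every plant. If they vary, you cannot tell whether a difference in growth was caused by the fertiliser or by, say, more sunlight. Controlling extraneous variables is what makes the experiment a fair test.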
Control groups and matched pairs
A control group receives no treatment (or the standard treatment) and provides a baseline for comparison. Comparing the treated group with the control group isolates the effect of the explanatory variable. Matched pairs (Higher tier) improve fairness further: each individual in the treatment group is paired with a similar individual (for example the same age, sex and starting weight) in the control group, so differences between the people themselves are reduced and the comparison is cleaner.
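One way to picture a matched-pairs design is the short Python sketch below. It is only an illustration, not an Edexcel method: the function name matched_pairs, the participant list and the choice to match on starting weight are all assumptions made for the example. It sorts participants by the matching characteristic, pairs up neighbours, and then uses chance to decide which member of each pair is treated.

```python
import random

def matched_pairs(participants, key, seed=None):
    """Pair similar participants, then randomly split each pair.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Sorting puts participants with similar values next to each other.
    ordered = sorted(participants, key=key)
    treatment, control = [], []
    # Take neighbours two at a time; with an odd number the last person is left out.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # chance decides who gets the treatment
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

# Made-up data: (name, starting weight in kg)
people = [("A", 62), ("B", 81), ("C", 64), ("D", 79), ("E", 70), ("F", 71)]
treatment, control = matched_pairs(people, key=lambda p: p[1], seed=1)

for t, c in zip(treatment, control):
    print(t, "is paired with", c)
```

Each printed pair contains two people of similar starting weight, so a difference in outcome within a pair is less likely to be caused by starting weight.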
Sources of bias
Common sources Edexcel expects you to identify and reduce:
- Sampling bias. An incomplete sampling frame or a non-random method (such as opportunity sampling) misses part of the population.
- Non-response bias. People who do not respond may differ systematically from those who do.
- Self-selection bias. Volunteers (for example a phone-in poll) are not representative.
- Question bias. Leading or emotive wording steers answers.
- Response bias from sensitivity. On embarrassing or risky topics, people may not answer honestly.
To minimise distortion you use random sampling from a good frame, neutral questions, follow-ups to reduce non-response, and special techniques for sensitive questions.
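As a rough illustration of random sampling from a frame, the Python sketch below picks a simple random sample so that every member of the frame has the same chance of selection. The frame of 200 numbered students, the sample size of 20 and the function name simple_random_sample are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n different members of the frame, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # samples without replacement

# Made-up sampling frame: a numbered list of every student in the population
frame = [f"Student {i}" for i in range(1, 201)]
sample = simple_random_sample(frame, 20, seed=42)
print(sample)
```

The sample can only be as good as the frame: anyone missing from the list has no chance of being picked, which is why an incomplete frame causes sampling bias.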
Sensitive questions and the random response technique
When a question is sensitive (illegal, embarrassing or private), people may refuse or lie, biasing the data. The random response technique protects privacy: the respondent secretly uses a random device (a coin, a card) to decide whether to answer the real sensitive question or a harmless one. Because the interviewer does not know which question each person answered, no individual is exposed, yet because the probability of getting each question is known, the overall proportion giving the sensitive answer can still be estimated.
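The estimate works by subtracting the "yes" answers you expect from the harmless question. The Python sketch below shows one possible set-up, with every number assumed for illustration: a fair coin sends half the respondents to the sensitive question, and the harmless question has a known probability of 0.5 of a "yes". Under those assumptions, P(yes overall) = P(sensitive question) x true proportion + P(harmless question) x P(yes to harmless), which can be rearranged to find the true proportion.

```python
def estimate_sensitive_proportion(yes_count, total,
                                  p_sensitive=0.5, p_harmless_yes=0.5):
    """Estimate the proportion who would truthfully say "yes" to the sensitive question.

    Hypothetical helper; the default probabilities are assumptions for this example.
    """
    observed_yes = yes_count / total
    # Remove the share of "yes" answers expected from the harmless question,
    # then scale up by the share of people who got the sensitive question.
    expected_harmless_yes = (1 - p_sensitive) * p_harmless_yes
    return (observed_yes - expected_harmless_yes) / p_sensitive

# Made-up survey result: 70 "yes" answers from 200 respondents
estimate = estimate_sensitive_proportion(yes_count=70, total=200)
print(round(estimate, 2))
```

With these made-up figures, about 100 people answer the harmless question and about 50 of them say "yes". That leaves about 20 "yes" answers from the 100 or so who answered the sensitive question, so the estimate is roughly 20%.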
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 1ST0 2020, 4 marks
A gardener wants to test whether a new feed makes tomato plants grow taller. (a) Name the explanatory and the response variables. (b) Describe how the gardener could use a control group, and explain why this makes the experiment a fairer test.
(a) Explanatory (independent) variable: the feed (whether or not a plant is given the new feed). Response (dependent) variable: the height the plant grows.
(b) Split the plants into two similar groups (ideally allocated at random): one group gets the new feed, the control group gets none (or the usual feed), with everything else (water, light, soil, pot size) kept the same. Comparing the two groups isolates the effect of the feed, because any difference in growth is then likely to be due to the feed rather than other (extraneous) variables.
Markers reward the correct variables, a clear control-group design, and the idea that controlling other variables isolates the effect being tested.
Edexcel 1ST0 2022, 3 marks
A survey asks people whether they have ever cheated in an exam. (a) Explain why people may not answer this question truthfully. (b) Name a technique that could be used to encourage honest answers to a sensitive question.
(a) The question is sensitive: admitting to cheating is embarrassing or could have consequences, so respondents may not answer honestly (they may deny it), which biases the results. This is response bias.
(b) The random response technique: the respondent secretly uses a random device (such as a coin) to decide whether to answer the sensitive question or a harmless one, so no individual answer reveals their behaviour, but the overall proportion can still be estimated.
Markers reward identifying sensitivity and resulting bias, and naming the random response technique.
Related dot points
- The statistical enquiry cycle: planning a hypothesis, recognising constraints, collecting, processing, interpreting and evaluating, with proactive strategies to manage problems.
A focused answer to Edexcel GCSE Statistics on the statistical enquiry cycle, covering the five stages, writing a testable hypothesis, recognising constraints such as time, cost and ethics, and planning proactive strategies to handle problems like non-response.
- Types of data: raw, quantitative, qualitative, categorical, ordinal, discrete, continuous, ungrouped, grouped, bivariate and multivariate; primary versus secondary; explanatory and response variables; grouping into class intervals.
A focused answer to Edexcel GCSE Statistics on types of data, covering quantitative versus qualitative, categorical and ordinal, discrete and continuous, grouped and bivariate data, primary versus secondary sources, explanatory and response variables, and the effect of grouping into class intervals.
- Population, sampling frame and sample; simple random, systematic, stratified, quota, cluster, judgement and opportunity sampling; selecting random members; calculating strata sizes.
A focused answer to Edexcel GCSE Statistics on sampling, covering population, sampling frame and sample, simple random, systematic, stratified, quota, cluster, judgement and opportunity sampling, selecting random members electronically, and calculating stratified sample sizes.
- Sources of data, reliability and validity, designing questionnaires and data collection sheets, open and closed questions, leading questions, pilots, and cleaning data before processing.
A focused answer to Edexcel GCSE Statistics on collecting data and designing questionnaires, covering data sources, reliability and validity, open and closed questions, designing non-overlapping response boxes, spotting leading questions, pilots, and cleaning data before processing.
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification — Pearson Edexcel (2017)