How do you identify types of data, design a sample, and recognise bias in data collection?

Identify types of data (qualitative and quantitative, discrete and continuous); understand populations and samples; use random and stratified sampling; and recognise sources of bias.

A focused answer to the Eduqas GCSE Mathematics statistics content on sampling and data, covering qualitative and quantitative data, discrete and continuous data, populations and samples, random and stratified sampling, and sources of bias.

Generated by Claude Opus 4.8
10 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

The Eduqas statistics content begins with data and sampling: identifying the type of data (qualitative or quantitative, discrete or continuous), understanding the difference between a population and a sample, using random and stratified sampling, and recognising sources of bias. This is the foundation of the statistical handling cycle, and Eduqas tests it through both calculation (stratified sample sizes) and reasoning (why a sample is biased, and how to improve it). It appears at both tiers, and the stratified-sample calculation and the bias explanation are the most dependable question types.

Types of data

Data is classified first as qualitative or quantitative, and quantitative data is then discrete or continuous.

So shoe size is discrete, height is continuous, and team supported is qualitative. The distinction matters because continuous data is grouped into class intervals and shown with histograms, while discrete data can be shown with bar charts, so identifying the type points to the right chart.
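The classification above can be written out as a small lookup. This is only an illustrative sketch: the names `DATA_TYPES` and `suggested_chart` are made up for this example, and the chart choices are limited to the two the text mentions.

```python
# Hypothetical lookup pairing the three examples in the text with their data type.
DATA_TYPES = {
    "shoe size": "discrete",        # quantitative, takes separate values
    "height": "continuous",         # quantitative, measured on a scale
    "team supported": "qualitative",  # described in words, not numbers
}

def suggested_chart(data_type):
    """Return the chart the text associates with a quantitative data type."""
    charts = {"continuous": "histogram", "discrete": "bar chart"}
    # The text does not name a chart for qualitative data, so None is returned for it.
    return charts.get(data_type)

for variable, data_type in DATA_TYPES.items():
    print(variable, "->", data_type, "->", suggested_chart(data_type))
```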

Populations and samples

It is rarely practical to collect data from a whole population, so a sample stands in for it.

Sampling saves time and cost, but only works if the sample reflects the population. A census surveys the whole population, which is accurate but expensive, so sampling is the usual compromise.
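To see a sample standing in for a population, the sketch below compares a census mean with the mean of a sample of 50. The population of 1,000 heights is invented data, generated only for illustration, and the sample size of 50 is an arbitrary choice.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: heights in cm for 1,000 students (invented data).
population = [round(random.gauss(160, 10), 1) for _ in range(1000)]

# A census uses every member of the population.
census_mean = statistics.mean(population)

# A sample uses only some members; here 50 are chosen at random without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"census mean: {census_mean:.1f} cm")
print(f"sample mean: {sample_mean:.1f} cm")
```

The two means should be close but not identical, which is the trade-off the text describes: the sample costs far less to collect, at the price of a small amount of accuracy.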

Random and stratified sampling

Two named sampling methods appear in the specification.

Stratified sampling is the one most often calculated: find the sampling fraction $\dfrac{\text{sample size}}{\text{population size}}$, then multiply each group's size by it.
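As a sketch of that calculation, the Python function below applies the sampling fraction to each group. The function name `stratified_sizes` is made up for this example. The school total of 600, the sample of 50 and the Year 7 size of 120 match the first practice question on this page; the other year-group sizes are invented purely for illustration.

```python
from fractions import Fraction

def stratified_sizes(group_sizes, sample_size):
    """Return the number to sample from each group, in proportion to its size."""
    population = sum(group_sizes.values())
    fraction = Fraction(sample_size, population)  # the sampling fraction
    # Multiply each group's size by the sampling fraction, then round to a whole person.
    return {name: round(size * fraction) for name, size in group_sizes.items()}

# Hypothetical year-group sizes: only the total of 600 and Year 7 = 120 come from the text.
school = {"Year 7": 120, "Year 8": 132, "Year 9": 108, "Year 10": 144, "Year 11": 96}
sizes = stratified_sizes(school, 50)
print(sizes)
# {'Year 7': 10, 'Year 8': 11, 'Year 9': 9, 'Year 10': 12, 'Year 11': 8}
```

These group sizes were chosen so that every answer is a whole number. When they are not, each answer has to be rounded, and it is worth checking that the rounded values still add up to the intended sample size.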

Recognising bias

Bias occurs when a sampling method systematically over- or under-represents part of the population.

A survey about reading habits conducted in a library is likely to be biased because library users tend to read more than the general population, so the result overestimates reading and is unrepresentative. Common causes are sampling from the wrong place (for example, asking gym-goers about exercise), self-selection (only motivated people respond), and leading questions. To improve a biased sample, choose a method that gives a fair cross-section of the whole population. Because Eduqas weights reasoning heavily, a strong answer explains precisely why the group is unrepresentative and what the effect on the results would be.
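The effect of sampling from the wrong place can be illustrated with a small simulation. Everything numeric here is an assumption made up for the sketch: a population of 10,000 people, 20% of whom are gym members averaging about 6 hours of exercise a week, with everyone else averaging about 2.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population: (is_gym_member, weekly hours of exercise). Invented numbers.
population = []
for _ in range(10_000):
    is_gym_member = random.random() < 0.2
    mean_hours = 6 if is_gym_member else 2
    hours = max(0.0, random.gauss(mean_hours, 1))  # hours cannot be negative
    population.append((is_gym_member, hours))

true_mean = statistics.mean(h for _, h in population)

# Biased sample: 100 people picked only from gym members.
gym_members = [h for member, h in population if member]
biased_mean = statistics.mean(random.sample(gym_members, 100))

# Fairer sample: 100 people picked at random from the whole population.
random_mean = statistics.mean(h for _, h in random.sample(population, 100))

print(f"whole population: {true_mean:.1f} hours")
print(f"gym-only sample:  {biased_mean:.1f} hours")
print(f"random sample:    {random_mean:.1f} hours")
```

Under these assumptions the gym-only sample gives a much higher mean than the whole population, while the random sample lands close to it. That is the "effect on the results" an exam answer should state.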

Exam-style practice questions

Practice questions written in the style of WJEC Eduqas exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Eduqas 2018, 3 marks

A school has 600 students. A sample of 50 is to be taken, stratified by year group. There are 120 students in Year 7. How many Year 7 students should be in the sample? (Foundation, Component 2, calculator.)

Stratified sampling takes the same fraction from each group.

The sampling fraction is $\dfrac{50}{600} = \dfrac{1}{12}$.

Year 7 contribution: $\dfrac{1}{12} \times 120 = 10$ students.

Markers award a mark for the sampling fraction, a mark for the method, and a mark for the answer 10. The most common mistake is taking an equal number from each year, which ignores the different group sizes.

Eduqas 2022, 3 marks

A researcher surveys people leaving a gym about how much they exercise. Explain why this sample is likely to be biased, and suggest one improvement. (Higher, Component 2, calculator.)

People leaving a gym already exercise more than average, so the sample is not representative of the whole population. This is sampling bias.

The result would overestimate how much the population exercises.

An improvement would be to sample from the general public, for example a random selection from the local population, rather than only gym-goers.

Markers give marks for identifying that gym-goers are unrepresentative, for explaining the resulting bias, and for a sensible improvement. A vague answer that does not explain why the group is unrepresentative loses marks.
