How do you choose a representative sample, classify types of data, and design fair data collection?
Populations and samples, representative and biased sampling, random sampling, types of data (qualitative and quantitative, discrete and continuous), and designing questionnaires and data collection.
A focused answer to the Edexcel GCSE Mathematics statistics content on sampling and data, covering populations and samples, representative and biased sampling, random sampling, types of data, and designing fair data collection.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel expects you to understand populations and samples, to judge whether a sample is representative or biased, to describe random sampling, to classify types of data, and to criticise and improve data-collection methods such as questionnaires. This is the reasoning side of statistics, and questions reward precise, specific explanations rather than vague comments.
Populations and samples
A population is the whole group you want to find out about, and a sample is the smaller part of it that you actually collect data from. Studying a whole population is often impractical, so we use a sample to make inferences about it.
Surveying everyone is usually too slow or expensive. The trade-off is that conclusions are only as reliable as the sample, so the way the sample is chosen matters a great deal.
Bias and representative samples
A representative sample reflects the make-up of the whole population. A biased sample systematically favours some part of the population, distorting the results.
Exam questions often give a flawed method and ask for specific reasons it is biased, as in the school-lunches question. A strong answer names the precise problem (for example, "asking only people leaving a gym will over-represent those who exercise").
Random sampling
Random sampling is the standard way to reduce bias, giving everyone an equal chance.
In a simple random sample, every member of the population has an equal chance of being chosen, for example by numbering everyone and using random numbers to select. This avoids the human tendency to pick conveniently, and it is the method to recommend when a question asks how to make a sample fairer.
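The "number everyone and use random numbers" method can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the school size of 800 and the sample size of 40 are all made-up assumptions, and in an exam you would describe the method in words rather than code.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct ID numbers from 1..population_size.

    Every member of the population has the same chance of being chosen.
    """
    rng = random.Random(seed)            # the seed only makes a run repeatable
    ids = range(1, population_size + 1)  # number everyone 1, 2, ..., N
    # sample() draws without replacement, so nobody is picked twice
    return sorted(rng.sample(ids, sample_size))

# Hypothetical example: a school of 800 students, sample of 40.
chosen = simple_random_sample(800, 40, seed=1)
print(chosen)
```

The key design point is that the computer, not the person running the survey, decides who is chosen, so convenience and personal preference cannot influence the selection.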
Types of data
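Classifying data correctly decides which charts and averages are appropriate. Qualitative data is described in words or categories (for example, eye colour), while quantitative data is numerical. Quantitative data is discrete if it can take only certain separate values, usually counts (for example, number of siblings), and continuous if it can take any value in a range, usually measurements (for example, height).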
The discrete-continuous distinction matters because continuous data is grouped into class intervals and shown with histograms or frequency polygons, while discrete data can use bar charts.
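Grouping continuous data into class intervals can be illustrated with a short Python sketch. The heights and class boundaries below are invented for the example, and the helper name `group_into_classes` is hypothetical. Each class uses the form lower <= h < upper, so every value belongs to exactly one class.

```python
def group_into_classes(values, boundaries):
    """Count how many values fall in each class lower <= x < upper."""
    counts = {}
    # Pair each boundary with the next one: (150, 160), (160, 170), ...
    for lower, upper in zip(boundaries, boundaries[1:]):
        label = f"{lower} <= h < {upper}"
        counts[label] = sum(1 for v in values if lower <= v < upper)
    return counts

# Made-up heights in cm (continuous data).
heights_cm = [151.2, 158.9, 160.0, 163.4, 167.5, 169.9, 170.0, 174.3]
table = group_into_classes(heights_cm, [150, 160, 170, 180])
for label, freq in table.items():
    print(label, freq)
# 150 <= h < 160 2
# 160 <= h < 170 4
# 170 <= h < 180 2
```

Note that a height of exactly 160.0 goes into the second class, not the first, because the upper boundary of each class is excluded.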
Designing data collection
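A good questionnaire collects honest, usable data, so the wording and response options must be fair. Questions should be neutral rather than leading, and the response options should not overlap and should cover every possible answer.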
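One common check on response options is that numerical answer boxes neither overlap nor leave gaps. The sketch below shows one way to test this, assuming whole-number answers and boxes written as inclusive (low, high) pairs. The function name `check_options` and both sets of boxes are hypothetical.

```python
def check_options(options):
    """Report overlaps and gaps between numeric response boxes.

    options is a list of (low, high) pairs with inclusive whole-number ends.
    Only neighbouring boxes (after sorting) are compared.
    """
    problems = []
    ordered = sorted(options)
    for (low1, high1), (low2, high2) in zip(ordered, ordered[1:]):
        if low2 <= high1:
            # Some answer fits in both boxes.
            problems.append(f"overlap: {low1}-{high1} and {low2}-{high2}")
        elif low2 > high1 + 1:
            # Some whole number fits in neither box.
            problems.append(f"gap between {high1} and {low2}")
    return problems

# Hypothetical boxes for "How many hours of homework do you do each week?"
flawed = [(0, 5), (5, 10), (10, 15)]
fair = [(0, 4), (5, 9), (10, 14)]
print(check_options(flawed))  # ['overlap: 0-5 and 5-10', 'overlap: 5-10 and 10-15']
print(check_options(fair))    # []
```

In the flawed set, someone who does exactly 5 hours could tick two boxes. The check does not look beyond the last box, so a real questionnaire would still need a final option such as "15 or more" to cover every possible answer.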
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 2018, 2 marks
A head teacher wants to find out what students think of school lunches. She asks the first students who arrive at school one morning. Give two reasons why this sample may not be representative. (Paper 2, calculator.)
Identify two distinct sources of bias in the method.
- The first to arrive may not represent all students; for example they may live nearby or always be early, so they are a particular type of student.
- The sample is small compared with the whole school, so it may miss the range of opinions.
Markers award a mark for each valid, distinct reason. Vague answers such as "it is unfair" without explanation do not earn marks; the reasons must be specific.
Edexcel 2021, 2 marks
A survey question reads: 'Do you agree that our excellent library should stay open longer?' Write down one criticism of this question and write an improved version. (Paper 2, calculator.)
The question is leading (biased), because the word "excellent" encourages a "yes".
Criticism: it is a leading question that pushes the respondent towards agreeing.
Improved version: "Should the library opening hours be changed?" with response boxes such as "Yes / No / Not sure", which is neutral.
Markers award a mark for identifying the leading nature and a mark for a fair, neutral rewrite with non-overlapping response options.
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Mathematics (1MA1) specification — Pearson Edexcel (2015)