How do you identify types of data, design a sample, and recognise bias in data collection?
Identify types of data (qualitative and quantitative, discrete and continuous); understand populations and samples; use random and stratified sampling; and recognise sources of bias.
A focused answer to the OCR GCSE Mathematics statistics content on sampling and data, covering types of data, populations and samples, random and stratified sampling, and recognising bias in data collection.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
OCR reference S1 covers types of data, populations and samples, sampling methods (including random and stratified) and recognising bias. Good data collection underpins every statistical conclusion, so this content tests the reasoning behind a study as much as any calculation. It appears on both tiers. Stratified sampling and bias come up regularly as Higher and AO2 questions, where a clear, justified explanation earns the marks.
Types of data
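Data comes in distinct types that determine how it is handled. Qualitative data describes a category or quality, such as eye colour; quantitative data is numerical. Quantitative data is discrete if it can take only separate values (usually counted) and continuous if it is measured and can take any value in a range.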
So shoe size is discrete (it jumps in steps), while foot length is continuous (it can be any value). Knowing the type matters because continuous data is grouped into class intervals and shown on a histogram, whereas discrete data may be shown on a bar chart. The distinction guides the choice of chart and average.
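As a rough illustration of grouping continuous data, the sketch below counts some made-up foot lengths (in cm) into class intervals. The data values, the class boundaries and the function name `group_into_classes` are assumptions chosen for illustration; they are not taken from the OCR specification.

```python
def group_into_classes(values, boundaries):
    """Count how many values fall in each class interval lower <= x < upper."""
    counts = {}
    for lower, upper in zip(boundaries, boundaries[1:]):
        label = f"{lower} <= x < {upper}"
        counts[label] = sum(1 for v in values if lower <= v < upper)
    return counts


# Hypothetical foot lengths in cm (continuous: any value in a range is possible).
foot_lengths = [22.4, 23.1, 23.9, 24.0, 24.6, 25.2, 25.8, 26.3]
# Hypothetical class boundaries in cm.
boundaries = [22, 24, 26, 28]

frequency_table = group_into_classes(foot_lengths, boundaries)
for interval, frequency in frequency_table.items():
    print(interval, frequency)
```

Each value lands in exactly one interval because the lower boundary is included and the upper boundary is not, which is the usual way class intervals are written for continuous data.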
Populations and samples
A sample stands in for a population that is too large to survey fully. The population is the whole group you want to find out about; a sample is the smaller part of it that you actually collect data from.
So to study the heights of all the students in one year group across a country (the population), you might measure a sample of a few hundred. The sample must reflect the population's variety, or any conclusion drawn from it will be misleading. Larger samples generally give more reliable results.
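One way to picture drawing a sample from a population is the short sketch below, which picks students at random from a numbered list. The population of 1000 ID numbers, the sample size of 50 and the helper name `simple_random_sample` are all assumptions made for illustration.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the example repeatable
    return rng.sample(population, sample_size)


# Hypothetical population: ID numbers for 1000 students.
population = list(range(1, 1001))

sample = simple_random_sample(population, 50, seed=1)
print(len(sample))       # 50 students chosen
print(len(set(sample)))  # 50, because nobody is picked twice
```

Numbering every member of the population and then choosing numbers at random is the same idea as drawing names from a hat: no student has a better chance of being picked than any other.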
Random and stratified sampling
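The sampling method affects how representative the sample is. In a simple random sample, every member of the population has an equal chance of being chosen. In a stratified sample, the population is split into groups (strata), such as year groups, and each group is sampled in proportion to its size.
So for a stratified sample, the sampling fraction is the sample size divided by the population size, and each stratum contributes its own size multiplied by that fraction. Stratified sampling ensures small groups are not missed and large groups are not over-represented, which a simple random sample might do by chance.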
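The stratified calculation can be sketched in a few lines. The year-group sizes, the sample size of 100 and the function name `stratified_allocation` below are invented for illustration and are not the figures from any exam question.

```python
import random


def stratified_allocation(strata_sizes, sample_size):
    """Return how many to sample from each stratum, using one sampling fraction."""
    population_size = sum(strata_sizes.values())
    allocation = {}
    for name, size in strata_sizes.items():
        # stratum size x (sample size / population size), rounded to a whole person
        allocation[name] = round(size * sample_size / population_size)
    return allocation


# Hypothetical school: these year-group sizes are invented for illustration.
strata_sizes = {"Year 7": 200, "Year 8": 180, "Year 9": 220, "Year 10": 150, "Year 11": 250}

allocation = stratified_allocation(strata_sizes, sample_size=100)
print(allocation)

# Within each stratum the students are still chosen at random.
rng = random.Random(0)
year_9_ids = list(range(1, strata_sizes["Year 9"] + 1))
year_9_sample = rng.sample(year_9_ids, allocation["Year 9"])
```

With these invented figures the population is 1000, so the sampling fraction is 100/1000 = 1/10 and Year 9 contributes 22 of its 220 students. When the numbers do not divide exactly, rounding each stratum can make the total differ slightly from the intended sample size, so it is worth checking the sum.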
Recognising bias
A biased sample gives misleading conclusions.
A sample is biased if some members of the population are more likely to be chosen than others, so it is not representative. Surveying only gym-goers about exercise over-represents active people; asking only at one time or place can miss whole groups. To reduce bias, sample randomly from the whole target population and make the sample large enough. Explaining precisely why a sample is unrepresentative, not just calling it "unfair", is the AO2 skill OCR rewards.
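To see why the gym example is biased, the sketch below compares a sample taken only from gym members with a random sample of everyone. The population is entirely made up (200 gym members exercising 6 hours a week and 800 non-members exercising 2 hours), so treat it as a simplified model rather than real data.

```python
import random

# Hypothetical population: weekly exercise hours, invented for illustration.
gym_members = [6] * 200   # 200 people who exercise 6 hours a week
non_members = [2] * 800   # 800 people who exercise 2 hours a week
population = gym_members + non_members


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


population_mean = mean(population)  # (200 * 6 + 800 * 2) / 1000 = 2.8

rng = random.Random(0)

# Biased sample: only people found at the gym can be chosen.
biased_sample = rng.sample(gym_members, 50)
biased_mean = mean(biased_sample)   # 6.0, because everyone sampled is a gym member

# Random sample: every member of the whole population is equally likely.
random_sample = rng.sample(population, 50)
random_mean = mean(random_sample)

print(population_mean, biased_mean, random_mean)
```

The gym-only sample gives a mean of 6 hours against a true population mean of 2.8 hours, because non-members had no chance of being chosen. The random sample's mean varies from run to run but will usually land much closer to 2.8.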
Exam-style practice questions
Practice questions written in the style of OCR exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
OCR 2019 (3 marks). A school has students. A head teacher wants to survey a stratified sample of students by year group. Year has students. How many Year students should be in the sample? (Higher, Paper 4, calculator.)
A stratified sample takes the same fraction from each group.
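The sampling fraction is the sample size divided by the total number of students in the school.
Apply it to the year group: multiply the number of students in that year by the sampling fraction to get the number of students to include.
Markers award a mark for the sampling fraction, a mark for applying it to the year group, and a mark for the final answer. Dividing the sample size by the number of year groups, instead of using the proportion in each group, is the standard error.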
OCR 2021 (3 marks). A researcher surveys people leaving a gym about how much exercise they do per week. Give one reason why this sample is likely to be biased, and suggest a better method. (Foundation, Paper 1, calculator.)
People leaving a gym are more likely to exercise a lot, so the sample is not representative of the whole population: it over-represents active people.
A better method would be to take a random sample of the whole target population, for example randomly selecting from a full list of residents rather than only gym-goers.
Markers give a mark for identifying the bias (gym-goers exercise more), a mark for explaining why it is unrepresentative, and a mark for a sensible improvement. A vague answer such as "it is unfair" without explaining the over-representation does not score fully.
Related dot points
- Calculate the mean, median, mode and range; find the mean from a frequency table and an estimated mean from grouped data; and compare distributions using an average and the range (and quartiles at Higher tier).
A focused answer to the OCR GCSE Mathematics statistics content on averages and spread, covering the mean, median, mode and range, the mean from frequency tables, the estimated mean from grouped data, and comparing distributions.
- Draw and interpret statistical charts including bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots and histograms with unequal class widths (histograms at Higher tier).
A focused answer to the OCR GCSE Mathematics statistics content on statistical charts and graphs, covering bar charts, pie charts, frequency polygons, stem-and-leaf diagrams, box plots, and histograms with frequency density at Higher tier.
- Plot and interpret scatter graphs; describe correlation; draw a line of best fit; use it to estimate values; and understand interpolation, extrapolation and the difference between correlation and causation.
A focused answer to the OCR GCSE Mathematics statistics content on scatter graphs and correlation, covering positive, negative and no correlation, drawing a line of best fit, making predictions, and the difference between correlation and causation.
- Use Venn diagrams and set notation (union, intersection and complement) to represent and count outcomes and to calculate probabilities, including conditional probability (Higher tier).
A focused answer to the OCR GCSE Mathematics probability content on Venn diagrams and set notation, covering union, intersection and complement, representing data, and calculating probabilities including conditional probability at Higher tier.
- Use relative frequency (experimental probability) to estimate probabilities from data, understand how more trials improve the estimate, and calculate expected numbers of outcomes.
A focused answer to the OCR GCSE Mathematics probability content on relative frequency and expected outcomes, covering experimental probability, the effect of more trials, fairness, and calculating expected numbers of outcomes.
Sources & how we know this
- OCR GCSE (9-1) Mathematics (J560) specification — OCR (2015)