How do you classify data, and why does the type decide what you can do with it?
Types of data: raw, quantitative, qualitative, categorical, ordinal, discrete, continuous, ungrouped, grouped, bivariate and multivariate; primary versus secondary; explanatory and response variables; grouping into class intervals.
A focused answer to Edexcel GCSE Statistics on types of data, covering quantitative versus qualitative, categorical and ordinal, discrete and continuous, grouped and bivariate data, primary versus secondary sources, explanatory and response variables, and the effect of grouping into class intervals.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Edexcel codes 1b.01 to 1b.04 require you to classify data correctly and to know why the classification matters. You must apply the full vocabulary (raw, quantitative, qualitative, categorical, ordinal, discrete, continuous, ungrouped, grouped, bivariate and, at Higher tier, multivariate), distinguish primary from secondary data, use the terms explanatory and response variable, and understand the effect of grouping numerical data into class intervals. The data type decides which diagrams and averages are valid later, so this is a recurring first step.
Quantitative and qualitative
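Quantitative data is numerical: it is counted or measured. Qualitative data is non-numerical: it describes a quality, such as a colour or a make of car. Edexcel notes that more than one term may apply to the same data, so be ready to give several labels. For example, the finishing position in a race is quantitative, discrete and ordinal.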
Discrete and continuous
Quantitative data splits into two types:
- Discrete data is counted and takes only particular values, usually whole numbers: the number of pets, goals scored, people in a car.
- Continuous data is measured and can take any value within a range: height, mass, time, temperature. Continuous values are limited only by the precision of the measuring instrument.
The classic exam trap is calling a count continuous. If you can have "half" of it meaningfully (half a second), it is continuous; if not (half a goal makes no sense), it is discrete.
Categorical, ordinal and other terms
- Categorical data sorts items into named groups. In the simplest case the groups have no inherent order: eye colour, nationality, type of pet.
- Ordinal data is categorical data whose groups have a natural order: dress sizes (small, medium, large), exam grades, finishing positions.
- Raw data is data as first collected, before any processing.
- Ungrouped data lists individual values; grouped data is collected into class intervals.
- Bivariate data records two variables for each item (height and mass), which suits a scatter diagram. Multivariate data (Higher tier) records more than two.
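To see why the label matters for later work, here is a minimal Python sketch. It assumes the usual convention that the mode can be found for any data, the median needs an order, and the mean needs numbers. The survey values and variable names are made up for illustration and do not come from the specification.

```python
from statistics import mean, median, mode

# Hypothetical survey answers (made-up illustration data).
eye_colour = ["brown", "blue", "brown", "green", "brown"]  # categorical
shirt_size = ["small", "large", "medium", "medium",
              "small", "medium", "large"]                  # ordinal
pets = [0, 2, 1, 1, 3, 1]                                  # quantitative, discrete

# Categorical: only the mode makes sense, because there is no order.
print("Modal eye colour:", mode(eye_colour))

# Ordinal: the natural order lets us find a median as well.
size_order = ["small", "medium", "large"]
ranks = sorted(size_order.index(s) for s in shirt_size)
median_size = size_order[ranks[len(ranks) // 2]]  # middle value (odd count)
print("Median shirt size:", median_size)

# Quantitative: mode, median and mean are all available.
print("Median pets:", median(pets))
print("Mean pets:", round(mean(pets), 2))
```

Trying to take the mean of `eye_colour` would raise an error, which mirrors the point that the data type limits which averages are valid.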
Primary and secondary data
Primary data is data you collect yourself for your own investigation; secondary data has already been collected by someone else. Primary data fits your exact question and is up to date, but it is slow and costly to gather. Secondary data is quick and cheap but may not match your question, may be out of date, and may contain errors. Edexcel stresses that sources of secondary data must always be acknowledged, and that you should consider the reliability and accuracy of the data (including rounding) and any constraints on accessing it.
Explanatory and response variables
In an investigation with two variables, the explanatory (independent) variable is the one thought to cause or explain change, and the response (dependent) variable is the one that responds. On a scatter diagram the explanatory variable goes on the horizontal (x) axis and the response variable on the vertical (y) axis. For example, in "does fertiliser increase plant growth?", fertiliser is explanatory and growth is the response.
Grouping into class intervals
Grouping numerical data into class intervals (code 1b.02) makes a large data set manageable and easier to display, and is essential for histograms and grouped tables. The cost is a loss of accuracy: once values are grouped you no longer know the individual figures, so the mean and other statistics calculated from grouped data are estimates. You should know the term class width and be able to explain this trade-off.
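The trade-off can be seen in a short Python sketch. The heights, the class width of 5 metres and the function names are all assumptions chosen for illustration, not values from the specification or from any exam question. The estimate uses the standard midpoint method: each class midpoint is multiplied by its frequency.

```python
# Hypothetical tree heights in metres (made-up illustration data).
heights = [2.3, 4.1, 4.8, 5.5, 6.2, 7.9, 8.4, 9.0, 11.6, 13.2]
class_width = 5  # assumed class width in metres


def group_into_classes(values, width):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = {}
    for v in values:
        lower = int(v // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def estimated_mean(counts, width):
    """Estimate the mean from class midpoints and frequencies."""
    total = sum((lower + width / 2) * f for lower, f in counts.items())
    return total / sum(counts.values())


grouped = group_into_classes(heights, class_width)
true_mean = sum(heights) / len(heights)
grouped_mean = estimated_mean(grouped, class_width)

for lower, f in grouped.items():
    print(f"{lower} <= h < {lower + class_width}: frequency {f}")
print(f"Mean from raw data:     {true_mean:.2f}")   # 7.30
print(f"Estimate from grouping: {grouped_mean:.2f}")  # 7.00
```

The ten values collapse into three classes, which is easier to display, but the grouped mean (7.00) differs from the raw mean (7.30) because the individual heights are no longer used.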
Exam-style practice questions
Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Edexcel 1ST0 2019, 4 marks
For each of the following, state whether the data is qualitative or quantitative, and if quantitative whether it is discrete or continuous. (a) The colour of cars in a car park. (b) The number of people in each car. (c) The mass of luggage in each car boot. (d) The make of each car.
(a) Qualitative (it describes a quality, not a number).
(b) Quantitative and discrete (counted in whole numbers).
(c) Quantitative and continuous (measured, can take any value in a range).
(d) Qualitative (a category).
Markers reward each correct classification. A common slip is calling the number of people continuous; counts are always discrete.
Edexcel 1ST0 2022, 3 marks
A researcher records the heights of trees and groups them into class intervals of width metres. (a) State one advantage and one disadvantage of grouping the data. (b) State whether tree height is discrete or continuous, giving a reason.
(a) Advantage: grouping makes a large data set easier to summarise and to display (for example in a histogram). Disadvantage: grouping loses accuracy because individual values are no longer known, so calculations such as the mean become estimates.
(b) Continuous, because height is measured and can take any value within a range (for example m), not just whole numbers.
Markers reward one valid advantage, one valid disadvantage, and the correct classification with a measurement-based reason.
Related dot points
- The statistical enquiry cycle: planning a hypothesis, recognising constraints, collecting, processing, interpreting and evaluating, with proactive strategies to manage problems.
A focused answer to Edexcel GCSE Statistics on the statistical enquiry cycle, covering the five stages, writing a testable hypothesis, recognising constraints such as time, cost and ethics, and planning proactive strategies to handle problems like non-response.
- Population, sampling frame and sample; simple random, systematic, stratified, quota, cluster, judgement and opportunity sampling; selecting random members; calculating strata sizes.
A focused answer to Edexcel GCSE Statistics on sampling, covering population, sampling frame and sample, simple random, systematic, stratified, quota, cluster, judgement and opportunity sampling, selecting random members electronically, and calculating stratified sample sizes.
- Sources of data, reliability and validity, designing questionnaires and data collection sheets, open and closed questions, leading questions, pilots, and cleaning data before processing.
A focused answer to Edexcel GCSE Statistics on collecting data and designing questionnaires, covering data sources, reliability and validity, open and closed questions, designing non-overlapping response boxes, spotting leading questions, pilots, and cleaning data before processing.
- Identifying and controlling extraneous variables, control groups and matched pairs, sources of bias, sensitivity of content, and the random response technique for sensitive questions.
A focused answer to Edexcel GCSE Statistics on controlling variables and bias, covering explanatory and extraneous variables, control groups and matched pairs, sources of bias, sensitivity of content, and the random response technique for sensitive questions at Higher tier.
Sources & how we know this
- Pearson Edexcel GCSE (9-1) Statistics (1ST0) specification — Pearson Edexcel (2017)