How do you use a sample to estimate features of the whole population?

Using summary statistics to estimate population characteristics; estimating the population mean from a sample; predicting population proportions; the effect of sample size on reliability and replication.

A focused answer to Edexcel GCSE Statistics on statistical inference, covering using summary statistics to estimate population characteristics, estimating the population mean from a sample, predicting population proportions, and how sample size affects reliability and replication.

Generated by Claude Opus 4.8. 9 min answer.

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Using a sample to estimate the population
  3. Estimating the population mean
  4. Predicting population proportions
  5. Sample size, reliability and replication
  6. Estimating other characteristics

What this dot point is asking

Edexcel codes 2h.01 and 2h.03 require you to use summary statistics from a sample to estimate population characteristics, in particular to use a sample mean to estimate the population mean and to predict population proportions, and to know that sample size affects the reliability and replication of results. This is the heart of statistical inference: drawing conclusions about a whole population from a sample.

Using a sample to estimate the population

A well-chosen sample stands in for the population, so its summary statistics become estimates of the population's. Edexcel expects you to recognise, for example, that approximately half the population is expected to lie above the sample median, because the median splits the data in two. Any such estimate is only as reliable as the sample is representative: a sample that does not reflect the population gives a misleading estimate however carefully the statistic is calculated.

Estimating the population mean

The sample mean is the natural estimate of the population mean. If a sample of 50 components has a mean mass of 12 g, you estimate the mean mass of all components as 12 g, and you can scale up to a total (for 20000 components, an estimated total of $12 \times 20000 = 240{,}000$ g). This works because, for a random sample, the sample mean is centred on the population mean; it will not be exactly right, but it is the best single estimate.
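The scaling-up step can be checked with a few lines of Python. This is only a sketch of the arithmetic in the components example; the function name `estimate_population_total` is made up for illustration and is not part of the specification.

```python
def estimate_population_total(sample_mean, population_size):
    """Scale a sample mean up to an estimated total for the whole population."""
    return sample_mean * population_size


sample_mean_g = 12        # mean mass of the 50 sampled components, in grams
population_size = 20000   # number of components in the whole population

estimated_total_g = estimate_population_total(sample_mean_g, population_size)
print(estimated_total_g)  # 240000
```

The result is an estimate of the total mass, not an exact figure, because the sample mean itself is only an estimate.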

Predicting population proportions

A sample proportion estimates the population proportion. If 130 out of 200 sampled voters say Yes, the estimated proportion is $\frac{130}{200} = 0.65$, and you predict that about 65% of all voters would say Yes. To estimate a count in the population, multiply the proportion by the population size: $0.65 \times 50000 = 32{,}500$. This scaling-up is one of the most common inference tasks in the exam.
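The same two steps (find the sample proportion, then apply it to the population) can be sketched in Python using the voter figures. The variable names are illustrative only; `Fraction` is used here simply to keep the arithmetic exact.

```python
from fractions import Fraction

yes_in_sample = 130   # sampled voters who said Yes
sample_size = 200     # voters in the sample
electorate = 50000    # size of the whole population

# Step 1: the sample proportion estimates the population proportion.
proportion = Fraction(yes_in_sample, sample_size)

# Step 2: apply the proportion to the population size to estimate a count.
estimated_yes = int(proportion * electorate)

print(float(proportion))  # 0.65
print(estimated_yes)      # 32500
```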

Sample size, reliability and replication

Code 2h.03 stresses that conclusions based on larger samples are generally more reliable. Random sampling variation has less effect on a big sample, so a repeat study is more likely to give a similar result. However, sample size does not cure bias: a large but unrepresentative sample (from a poor frame or a non-random method) still gives a biased estimate. So both a good method and an adequate size are needed.

This links directly to the quality-assurance idea that sample means vary less than individual values: averaging over more data smooths out the extremes, so a mean based on a larger sample is a tighter, more trustworthy estimate of the population mean. It is also why a single small sample should be treated with caution, and why repeating a study (replication) and getting a consistent answer strengthens confidence in the conclusion.
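A small simulation can make this concrete. The sketch below is not from the specification: it invents a hypothetical population of 20000 component masses, takes many random samples of size 10 and of size 100, and compares how much the sample means vary from one sample to the next. The population values, the seed and the function name are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 20000 component masses in grams (invented data).
population = [random.gauss(12, 2) for _ in range(20000)]


def spread_of_sample_means(sample_size, repeats=500):
    """Standard deviation of the sample mean across many repeated random samples."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


spread_small = spread_of_sample_means(10)    # many small samples
spread_large = spread_of_sample_means(100)   # many larger samples

# The means of the larger samples should cluster more tightly.
print(spread_small, spread_large)
```

Under these assumptions the spread for samples of 100 should come out noticeably smaller than for samples of 10, which is the sense in which a larger sample is more reliable and more likely to replicate. The simulation uses random sampling throughout, so it says nothing about bias from a poor sampling method.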

Estimating other characteristics

The same logic extends beyond the mean and a single proportion. A sample can be used to estimate the population median (about half the population lies above the sample median), the population range or spread, and the frequency of any category. In each case you treat the sample statistic as the best estimate of the matching population value, and where appropriate scale it up using the population size. Edexcel may give you a sample summary (a mean, a median, a set of class frequencies) and ask you to make a statement about the whole population, so practise turning a sample figure into a population estimate and stating clearly that it is an estimate, not an exact value.
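As a sketch of scaling up category frequencies, the Python below takes a made-up set of class frequencies from a sample of 50 and estimates the matching counts in a population of 20000. The class labels and frequencies are hypothetical, chosen only to show the method.

```python
# Hypothetical class frequencies from a sample of 50 components (invented data).
sample_frequencies = {"under 10 g": 8, "10 to 14 g": 30, "over 14 g": 12}
population_size = 20000

sample_size = sum(sample_frequencies.values())  # 50

# Estimated population count = (sample frequency / sample size) * population size.
estimated_counts = {
    category: round(count * population_size / sample_size)
    for category, count in sample_frequencies.items()
}

print(estimated_counts)
# {'under 10 g': 3200, '10 to 14 g': 12000, 'over 14 g': 4800}
```

Each figure should be reported as an estimate, for example "about 3200 components are expected to be under 10 g".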

Exam-style practice questions

Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Edexcel 1ST0 2020, 4 marks. A sample of 50 light bulbs has a mean lifetime of 1200 hours. The factory produces 20000 bulbs. (a) Estimate the total number of bulb-hours the factory's output will last. (b) State one way the estimate could be made more reliable.

(a) Use the sample mean as an estimate of the population mean (1200 hours per bulb). Total bulb-hours $\approx 1200 \times 20000 = 24{,}000{,}000$ hours.

(b) Take a larger sample (or several samples), because a larger sample gives a more reliable estimate of the population mean and reduces the effect of random variation.

Markers reward using the sample mean for the population, the total of 24 million hours, and a valid way to improve reliability (larger sample).

Edexcel 1ST0 2022, 3 marks. In a random sample of 200 voters, 130 said they would vote Yes in a referendum. (a) Estimate the proportion of all voters who would vote Yes. (b) If the electorate is 50000, estimate the number who would vote Yes.

(a) Estimated proportion $= \frac{130}{200} = 0.65$, so about 65% would vote Yes.

(b) Apply the proportion to the population: $0.65 \times 50000 = 32{,}500$ voters.

Markers reward the sample proportion 0.65 and scaling it up to estimate 32,500 for the population.
