How do psychologists decide whether a result is significant, and how do they analyse qualitative data?

Inferential statistics and qualitative analysis: probability and significance, the five Edexcel inferential tests, Type 1 and Type 2 errors, the normal and skewed distributions, and the analysis of qualitative data through thematic analysis and grounded theory.

An Edexcel A-Level Psychology answer to Paper 3 data analysis, covering probability and significance at p < 0.05, the five inferential tests (sign, Wilcoxon, Mann-Whitney, Spearman, chi-squared), Type 1 and Type 2 errors, the normal and skewed distributions, and qualitative analysis through thematic analysis and grounded theory.

Generated by Claude Opus 4.8 | 15 min answer | Updated 2026-06-14

Reviewed by: AI editorial process; not yet individually human-reviewed

Quick answer

Psychology judges a result significant at $p < 0.05$ (less than a 5 per cent probability that the result arose by chance if the null hypothesis is true) by comparing a calculated value with a critical value. Edexcel's five tests are chosen by level of measurement, difference-versus-correlation and related-versus-unrelated design: sign test (related, nominal), Wilcoxon (related, ordinal), Mann-Whitney (unrelated, ordinal), Spearman's rho (correlation, ordinal) and chi-squared (unrelated, nominal). The sign, Wilcoxon and Mann-Whitney tests need the calculated value equal to or below the critical value; chi-squared and Spearman need it equal to or above. A Type 1 error is a false positive and a Type 2 error is a false negative; the significance level trades one risk against the other. The normal distribution is symmetrical (mean = median = mode); skewed data are better summarised by the median. Qualitative data are analysed by thematic analysis (coding into themes) and grounded theory (building theory inductively to saturation).

What this dot point is asking

Paper 3 (Psychological Skills) requires you to handle data: to explain probability and significance, choose and interpret the correct inferential test, recognise Type 1 and Type 2 errors, read the normal and skewed distributions, and analyse qualitative data through thematic analysis and grounded theory. These are synoptic skills tested across novel scenarios drawn from any topic.

The answer

Probability and significance

A null hypothesis predicts no difference or no relationship; the alternative (experimental) hypothesis predicts one. An inferential test produces a calculated value, which is compared with a critical value read from a table using $N$ (or degrees of freedom for chi-squared), whether the hypothesis is one- or two-tailed, and the significance level. If the calculated value meets the test's decision rule against the critical value, the result is significant and the null hypothesis is rejected; otherwise it is retained. A one-tailed hypothesis predicts the direction of the effect; a two-tailed hypothesis does not, and uses a more conservative critical value.

Choosing the inferential test

Test	Looking for	Design	Data
Sign test	Difference	Related	Nominal
Wilcoxon signed-rank	Difference	Related	Ordinal
Mann-Whitney U	Difference	Unrelated	Ordinal
Spearman's rho	Correlation	(pairs of scores)	Ordinal
Chi-squared ($\chi^2$)	Difference / association	Unrelated	Nominal

For the sign test, Wilcoxon and Mann-Whitney, the result is significant when the calculated value is equal to or less than the critical value. For chi-squared and Spearman, the result is significant when the calculated value is equal to or greater than the critical value. Mixing up these two rules is the most common error.
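The table and the two decision rules can be sketched as a lookup in Python. This is a revision aid, not an official Edexcel procedure: the key strings, the function names `choose_test` and `is_significant`, and the choice to compare the size of Spearman's rho while ignoring its sign are all assumptions made for illustration.

```python
# Lookup of the test table: (aim, design, level of measurement) -> test.
# The key strings are illustrative labels chosen for this sketch.
TESTS = {
    ("difference", "related", "nominal"): "sign test",
    ("difference", "related", "ordinal"): "Wilcoxon signed-rank",
    ("difference", "unrelated", "ordinal"): "Mann-Whitney U",
    ("correlation", "pairs", "ordinal"): "Spearman's rho",
    ("difference", "unrelated", "nominal"): "chi-squared",
}

# Tests where a calculated value at or BELOW the critical value is significant.
LOWER_IS_SIGNIFICANT = {"sign test", "Wilcoxon signed-rank", "Mann-Whitney U"}


def choose_test(aim, design, level):
    """Return the test the table gives for this combination."""
    return TESTS[(aim, design, level)]


def is_significant(test, calculated, critical):
    """Apply the decision rule that belongs to the named test."""
    if test in LOWER_IS_SIGNIFICANT:
        return calculated <= critical
    # Chi-squared and Spearman's rho: at or above the critical value.
    # Assumption: for Spearman, the sign of rho is ignored in the comparison.
    return abs(calculated) >= critical


print(choose_test("difference", "related", "ordinal"))
print(is_significant("sign test", calculated=2, critical=2))
```

Keeping the "lower is significant" tests in one named set makes the rule that is most often mixed up explicit in a single place.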

Type 1 and Type 2 errors

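A Type 1 error is a false positive: the null hypothesis is rejected when it is actually true, so an effect is claimed that does not exist. A Type 2 error is a false negative: the null hypothesis is retained when it is actually false, so a real effect is missed.

The significance level controls the trade-off between them. A lenient level (for example $p < 0.10$) makes a Type 1 error more likely; a strict level (for example $p < 0.01$) reduces Type 1 risk but raises Type 2 risk. The conventional $p < 0.05$ balances the two.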
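The two errors are the two ways a significance decision can disagree with reality. A minimal Python sketch of that logic follows; the function name `classify_decision` and its arguments are hypothetical, and in real research the truth of the null hypothesis is never known directly.

```python
def classify_decision(null_is_true, null_rejected):
    """Label the outcome of a significance decision.

    null_is_true:  whether the null hypothesis is actually true
                   (supplied here only for illustration).
    null_rejected: whether the test led the researcher to reject it.
    """
    if null_is_true and null_rejected:
        return "Type 1 error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type 2 error (false negative)"
    return "correct decision"


# All four combinations of truth and decision.
for truth in (True, False):
    for rejected in (True, False):
        print(truth, rejected, classify_decision(truth, rejected))
```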

The normal and skewed distributions

The normal distribution is a symmetrical, bell-shaped curve where the mean, median and mode coincide at the centre, with most scores clustered around the mean and the tails thinning symmetrically. Many psychological variables (IQ, for example) approximate it.

A distribution is skewed when scores bunch at one end:

Positive (right) skew. A long tail to the right (high scores). The mode is lowest, the mean is pulled highest by the tail, so $\text{mode} < \text{median} < \text{mean}$.
Negative (left) skew. A long tail to the left (low scores), so $\text{mean} < \text{median} < \text{mode}$.

Skew matters because the mean is distorted by extreme scores, so the median is a better measure of central tendency for skewed data.
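The effect of a long right tail on the three averages can be checked with Python's standard `statistics` module. The scores below are invented purely to show the pattern, and `central_tendency` is a hypothetical helper name.

```python
from statistics import mean, median, mode


def central_tendency(scores):
    """Return the mode, median and mean of a list of scores."""
    return {"mode": mode(scores), "median": median(scores), "mean": mean(scores)}


# Hypothetical scores with a few high values forming a long right tail.
scores = [2, 2, 2, 3, 3, 4, 5, 9, 15]

result = central_tendency(scores)
print(result)

# Positive skew: the tail drags the mean above the median and the mode.
print(result["mode"] < result["median"] < result["mean"])
```

Here the two highest scores pull the mean up to 5 while the median stays at 3, which is why the median describes the typical score better.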

Analysis of qualitative data

Qualitative data (interviews, open questions) are analysed for meaning rather than counted:

Thematic analysis identifies patterns of meaning. The researcher reads the data, assigns codes to segments, then groups codes into broader themes that capture something important about the data, illustrated with quotations. It is flexible and rich but can be subjective, so reliability is checked through a second coder.
Grounded theory builds theory inductively from the data rather than testing a pre-set hypothesis. Through constant comparison (comparing each new piece of data with earlier codes) and successive coding, categories emerge until theoretical saturation (no new categories appear), at which point a theory grounded in the data is proposed.

Evaluation (GRAVE)

Generalisability. Inferential statistics let researchers generalise from a sample to a population, but only if sampling was sound.
Reliability. Standardised tests and decision rules make quantitative analysis highly reliable and replicable; qualitative analysis needs inter-rater checks to be reliable.
Application. Choosing the right test underpins data questions on any paper, not only Paper 3, as well as real published research.
Validity. Thematic analysis and grounded theory preserve the richness and meaning of data, raising validity where reducing experience to numbers would lose it.
Ethics. Researchers must not "p-hack" (run many tests to find a significant one), which inflates Type 1 errors and is a form of misreporting.

Selecting a test for a sign-test scenario

A therapist rates 12 clients' anxiety as "better" or "worse" after a course of CBT (the same clients before and after).

Step 1. Identify the design and data

The same clients are measured twice (related design), and the outcome is a category, better or worse (nominal data), testing for a difference.

Step 2. Select the test

Related design, nominal data, test of difference: the Edexcel test is the sign test.

Step 3. Calculate S

Count the less frequent sign. Suppose 10 improved (plus) and 2 worsened (minus); ignore any "no change". The calculated value $S$ is the number of less frequent signs, so $S = 2$, with $N = 12$.

Step 4. Decide significance

Compare $S$ with the critical value for $N = 12$ at $p < 0.05$. The sign test is significant when the calculated value is equal to or less than the critical value (for $N = 12$, two-tailed, the critical value is $2$). Since $S = 2 \leq 2$, the result is significant: the null hypothesis is rejected, and because 10 of the 12 changes were improvements, clients' anxiety was significantly better after CBT.
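The counting in this worked example can be sketched in Python. The exam method is the critical-value table; the exact binomial probability computed here is a different technique, included only as a cross-check on the table's verdict. The function name `sign_test` and the `'+'`, `'-'`, `'0'` labels are assumptions for illustration.

```python
from math import comb


def sign_test(signs):
    """Return (S, N, two-tailed exact p) for a list of '+', '-' and '0' signs."""
    plus = signs.count("+")
    minus = signs.count("-")
    n = plus + minus          # "no change" ('0') results are dropped
    s = min(plus, minus)      # S = frequency of the less common sign
    # Probability of S or fewer of one sign if '+' and '-' are equally likely.
    one_tail = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    p_two_tailed = min(1.0, 2 * one_tail)
    return s, n, p_two_tailed


# The worked example: 10 clients improved, 2 worsened.
s, n, p = sign_test(["+"] * 10 + ["-"] * 2)
print(s, n, round(p, 4))  # prints: 2 12 0.0386
```

The exact two-tailed probability of about 0.039 is below 0.05, which agrees with the table-based decision that $S = 2$ is significant for $N = 12$.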

Examples in context

Example 1. Reading reaction-time data that are skewed. Reaction times usually show positive skew, because a few slow trials create a long right tail. Reporting the mean would overstate the typical response time, so a psychologist reports the median instead, and may treat the data as ordinal, pushing the analysis towards a non-parametric test such as Mann-Whitney. This shows how the shape of the distribution drives both the descriptive measure and the inferential test.

Example 2. Thematic analysis of interviews about exam stress. A researcher interviews 15 students, codes phrases such as "I couldn't sleep" and "my mind went blank", then groups them into themes like "physical symptoms" and "cognitive disruption", illustrating each with a quotation. A second researcher codes a sample independently to check agreement. The output is a structured account of how students experience stress that a questionnaire's numbers could not capture, demonstrating the value (and the subjectivity) of qualitative analysis.
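The bookkeeping behind Example 2 can be sketched in Python, although the interpretive work of deciding on codes and themes remains a human judgement. Apart from the two quotations and two theme names taken from the example, everything here is hypothetical: the extra quotations, the code labels, and the helper names `group_by_theme` and `percent_agreement`. Simple percentage agreement is assumed as the check between coders.

```python
from collections import defaultdict

# Hypothetical coded segments: (quotation, code assigned by the researcher).
coded_segments = [
    ("I couldn't sleep", "sleep problems"),
    ("my mind went blank", "blanking"),
    ("I felt sick before the paper", "nausea"),
    ("I kept rereading the same line", "poor concentration"),
]

# Hypothetical grouping of codes into the two themes named in the example.
code_to_theme = {
    "sleep problems": "physical symptoms",
    "nausea": "physical symptoms",
    "blanking": "cognitive disruption",
    "poor concentration": "cognitive disruption",
}


def group_by_theme(segments, mapping):
    """Collect the quotations that illustrate each theme."""
    themes = defaultdict(list)
    for quotation, code in segments:
        themes[mapping[code]].append(quotation)
    return dict(themes)


def percent_agreement(coder_a, coder_b):
    """Percentage of segments given the same code by two independent coders."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100 * matches / len(coder_a)


print(group_by_theme(coded_segments, code_to_theme))

# Second coder disagrees on one of the four segments.
first = [code for _, code in coded_segments]
second = ["sleep problems", "blanking", "nausea", "blanking"]
print(percent_agreement(first, second))
```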

Try this

Q1. State the significance level used in psychology and explain what it means. [2 marks]

Cue. $p < 0.05$: there is less than a 5 per cent probability the result is due to chance if the null hypothesis is true.

Q2. A study correlates two sets of ranked scores. Name and justify the inferential test. [3 marks]

Cue. Spearman's rho, because the data are ordinal and the aim is to measure a relationship (correlation) rather than a difference.

Q3. Outline the difference between thematic analysis and grounded theory. [4 marks]

Cue. Both code qualitative data, but thematic analysis identifies patterns (themes) within data, while grounded theory builds a new theory inductively from the data through constant comparison until saturation.

Exam-style practice questions

Practice questions written in the style of Pearson Edexcel exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Edexcel style, 6 marks. A researcher uses a repeated measures design and ordinal data to test for a difference between two conditions. Identify the appropriate inferential test, justify your choice, and explain how to decide whether the result is significant. [6 marks]

Work from level of measurement and design to the test, then state the decision rule.

Test: Wilcoxon signed-rank test. Justification: the design is related (repeated measures, the same participants in both conditions), the data are ordinal, and the aim is to test for a difference. Wilcoxon is the Edexcel test of difference for a related design with ordinal data (Mann-Whitney would be wrong because it is for unrelated designs).

Deciding significance: compare the calculated value (T) with the critical value from the table for the relevant N and significance level. For Wilcoxon, like Mann-Whitney, the result is significant if the calculated value is equal to or less than the critical value at p < 0.05. If it is, reject the null hypothesis and accept a significant difference; if not, retain the null hypothesis.

Markers reward the correct named test, a justification using level of measurement and related-or-unrelated design, and the correct decision rule (calculated equal to or less than critical for Wilcoxon).
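The answer above refers to a calculated value T without showing where it comes from. A minimal Python sketch follows, assuming the usual procedure: drop pairs with no change, rank the sizes of the differences (tied sizes share the mean of their ranks), and take T as the smaller of the two rank sums. The scores and the function name `wilcoxon_t` are hypothetical.

```python
def wilcoxon_t(condition_a, condition_b):
    """Return (T, N) for paired scores, using average ranks for ties."""
    # Differences between paired scores, dropping pairs with no change.
    diffs = [a - b for a, b in zip(condition_a, condition_b) if a != b]
    ordered = sorted(diffs, key=abs)

    # Rank the absolute differences; tied values share the mean of their ranks.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        shared = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = shared
        i = j + 1

    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)


# Hypothetical ratings from the same eight participants in two conditions.
condition_a = [7, 5, 8, 6, 9, 4, 7, 6]
condition_b = [5, 6, 5, 6, 4, 3, 4, 8]
t, n = wilcoxon_t(condition_a, condition_b)
print(t, n)  # prints: 5.0 7
```

T would then be compared with the tabled critical value for this N, and the result is significant only if T is equal to or less than it.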

Edexcel style, 4 marks. Explain what is meant by a Type 1 error and a Type 2 error in inferential testing. [4 marks]

Define each error precisely and link it to the significance level.

A Type 1 error is a false positive: rejecting the null hypothesis when it is actually true, so the researcher claims a significant effect that does not really exist. Using a lenient significance level such as p < 0.10 raises the risk of a Type 1 error.

A Type 2 error is a false negative: retaining the null hypothesis when it is actually false, so the researcher misses a real effect. Using a very strict level such as p < 0.01 lowers the Type 1 risk but raises the Type 2 risk.

The conventional p < 0.05 is a compromise that balances the two. Markers reward a correct definition of each error and the point that the choice of significance level trades one risk against the other.

Sources & how we know this

Pearson Edexcel A-Level Psychology (9PS0) specification — Pearson Edexcel (2015)