What makes a study reliable and valid, and how can each be assessed and improved?
Reliability (internal and external; test-retest, inter-observer; how to assess and improve it) and validity (internal and external; face, concurrent, ecological, temporal and population validity; demand characteristics and investigator effects; how to assess and improve it).
An Eduqas A-Level Psychology answer to reliability and validity in Component 2. Covers internal and external reliability, test-retest and inter-observer reliability, internal and external validity, face, concurrent, ecological, temporal and population validity, demand characteristics and investigator effects, and how to assess and improve each.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Component 2 requires you to understand reliability and validity: what each means, the types of each, the threats (demand characteristics, investigator effects), and how to assess and improve them.
The answer
Reliability
Validity
Improving validity
Control confounds (standardisation, single-blind and double-blind procedures), use covert or naturalistic observation to reduce demand characteristics, use realistic tasks and settings for ecological validity, and check the measure against an established one (concurrent validity).
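As a rough illustration of checking a measure against an established one, the sketch below correlates scores from a new questionnaire with scores from an established test for the same participants. All scores are invented, the function names are hypothetical, and Spearman's rho is only one reasonable choice of coefficient; the simple formula used here assumes there are no tied scores.

```python
def ranks(values):
    """Rank scores from 1 (lowest) upwards. Assumes no tied scores."""
    if len(set(values)) != len(values):
        raise ValueError("this simple sketch does not handle tied scores")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rho using the no-ties formula: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical scores for six participants (invented for illustration).
new_measure = [14, 22, 9, 30, 18, 25]
established_measure = [40, 52, 31, 71, 55, 60]

rho = spearman_rho(new_measure, established_measure)
print(f"Spearman's rho = {rho:.2f}")
```

A strong positive coefficient would suggest that the new measure ranks participants in much the same way as the established one, which is the evidence concurrent validity looks for.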
Examples in context
Example 1. Inter-observer reliability in practice. In an observation of aggression, two observers independently tally the same behavioural categories; if their tallies correlate at about +0.8 or above, the observation is considered reliable. This shows how reliability is quantified, and how it can be improved by clear categories and observer training.
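The check in Example 1 can be sketched in a few lines of Python. The category names and tallies are invented for illustration, Pearson's r is used as one reasonable choice of coefficient, and the +0.8 cut-off is the threshold commonly taught at A-Level, treated here as an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list never varies")
    return cov / sqrt(var_x * var_y)


# Hypothetical tallies per behavioural category (invented for illustration).
categories = ["hitting", "pushing", "shouting", "snatching", "threatening"]
observer_a = [12, 7, 15, 4, 9]
observer_b = [11, 8, 14, 5, 9]

THRESHOLD = 0.8  # assumed conventional cut-off for good agreement

r = pearson_r(observer_a, observer_b)
reliable = r >= THRESHOLD
print(f"r = {r:.2f}; inter-observer reliability acceptable: {reliable}")
```

If the coefficient fell below the threshold, the usual response would be to tighten the category definitions and retrain the observers before recording again.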
Example 2. Ecological validity and lab tasks. Memorising word lists in a lab (as in Bartlett's contemporaries) may not reflect everyday memory, lowering ecological validity. Using realistic material and settings raises it, illustrating the trade-off between control and real-world relevance.
Try this
Q1. Define reliability and validity in one sentence each. [2 marks]
- Cue. Reliability is the consistency of a measure (the same results each time); validity is the accuracy of a measure (it measures what it claims to).
Q2. Explain how inter-observer reliability is assessed. [2 marks]
- Cue. Two or more observers independently record the same behaviour using the same categories, and their results are correlated; a coefficient of about +0.8 or above indicates good reliability.
Q3. Explain what demand characteristics are and how to reduce them. [2 marks]
- Cue. Demand characteristics are cues that let participants guess the aim and change their behaviour; they can be reduced by single-blind procedures, cover stories or covert observation.
Exam-style practice questions
Practice questions written in the style of WJEC Eduqas exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Eduqas 2019. Explain the difference between reliability and validity, using an example. [6 marks]
A knowledge item (AO1) with an example.
Reliability is consistency: whether a measure produces the same results each time it is used under the same conditions. Validity is accuracy: whether a measure actually measures what it claims to measure.
Example: a set of scales that always reads 2 kg too heavy is reliable (consistent every time) but not valid (it does not give the true weight). A psychological test could give the same score on two occasions (reliable) but still fail to measure the trait it claims to (low validity). A measure can be reliable without being valid, but to be valid it must first be reliable.
Markers reward clear definitions of consistency versus accuracy and an example showing reliability without validity.
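The scales example in this answer can be made concrete with a short Python sketch. The true weight, the readings and the tolerance values are all invented for illustration; the point is only that a small spread (consistency) and a small bias (accuracy) are separate checks.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_KG = 70.0  # hypothetical true weight of the object

# Hypothetical repeated readings from scales that read about 2 kg too heavy.
readings = [72.0, 72.1, 71.9, 72.0, 72.0]

spread = pstdev(readings)               # how much the readings vary
bias = mean(readings) - TRUE_WEIGHT_KG  # how far the average is from the truth

# Illustrative tolerances (assumptions, not official criteria).
is_reliable = spread < 0.5    # consistent from one reading to the next
is_valid = abs(bias) < 0.5    # close to the true weight on average

print(f"spread = {spread:.2f} kg, bias = {bias:+.2f} kg")
print(f"reliable: {is_reliable}, valid: {is_valid}")
```

Here the readings barely vary, so the scales count as reliable, yet every reading is about 2 kg too high, so they are not valid.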
Eduqas 2021. Explain how the reliability and the validity of an observation could be improved. [8 marks]
An application item (AO2/AO3).
Reliability: use clear, operationalised behavioural categories so observers record the same events; train observers; use two or more observers and check inter-observer reliability (a correlation of about +0.8 or above indicates good agreement); standardise the procedure.
Validity: use covert observation so participants behave naturally (reducing demand characteristics); observe in a naturalistic setting to raise ecological validity; ensure the behavioural categories genuinely capture the target behaviour (face/content validity); and check against another measure (concurrent validity).
Markers reward practical, named techniques for each (inter-observer reliability and operationalised categories for reliability; covert/naturalistic observation and valid categories for validity).
Related dot points
- The experimental method: types of experiment (laboratory, field, natural, quasi), independent and dependent variables and operationalisation, hypotheses, extraneous and confounding variables and controls, and experimental designs (independent groups, repeated measures, matched pairs).
An Eduqas A-Level Psychology answer to the experimental method in Component 2. Covers laboratory, field, natural and quasi experiments, independent and dependent variables, operationalisation, hypotheses, extraneous and confounding variables, controls, and the three experimental designs with their strengths and weaknesses.
- Non-experimental methods: observation (naturalistic, controlled, participant, non-participant, overt, covert; behavioural categories and sampling) and self-report (questionnaires and interviews; open and closed questions; designing good questions).
An Eduqas A-Level Psychology answer to observation and self-report methods in Component 2. Covers types of observation, behavioural categories, event and time sampling, questionnaires and interviews, open and closed questions, and the strengths and weaknesses of each method.
- Sampling (target population, sample, random, opportunity, volunteer, systematic and stratified sampling; bias and generalisability) and ethics (the BPS principles: informed consent, deception, right to withdraw, protection from harm, confidentiality, and dealing with ethical issues).
An Eduqas A-Level Psychology answer to sampling and ethics in Component 2. Covers target populations and samples, random, opportunity, volunteer, systematic and stratified sampling, sampling bias and generalisability, and the BPS ethical principles with ways of dealing with ethical issues.
- Descriptive statistics: measures of central tendency (mean, median, mode), measures of dispersion (range, standard deviation), levels of measurement (nominal, ordinal, interval), percentages and ratios, and presenting data (tables, bar charts, histograms, scattergrams).
An Eduqas A-Level Psychology answer to descriptive statistics in Component 2. Covers the mean, median and mode, range and standard deviation, levels of measurement, percentages and ratios, and how to present quantitative data in tables, bar charts, histograms and scattergrams, with worked calculations.
- Inferential statistics: probability and significance (the 0.05 level), the null and alternative hypotheses, choosing the correct test (the binomial sign test, Mann-Whitney U, Wilcoxon, Spearman's rho, chi-square) from design and level of measurement, observed versus critical values, and Type I and Type II errors.
An Eduqas A-Level Psychology answer to inferential statistics in Component 2. Covers probability and the 0.05 significance level, the null hypothesis, how to choose between the binomial sign test, Mann-Whitney U, Wilcoxon, Spearman's rho and chi-square, comparing observed and critical values, and Type I and Type II errors.
Sources & how we know this
- Eduqas GCE A Level in Psychology (A290) specification — Eduqas (2015)