Data Analysis and Modelling: study guide to the SQA Advanced Higher Statistics first area

A study guide to the first area of SQA Advanced Higher Statistics, Data Analysis and Modelling. Covers experimental design, exploratory data analysis, probability, discrete and continuous random variables and bivariate regression, with advice on how the topics connect and how to study them.

Generated by Claude Opus 4.8 · 9 min read · Advanced Higher: Data Analysis and Modelling

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What the area covers
  2. How the topics connect
  3. How to study this area
  4. Where to go next

Data Analysis and Modelling is the first of the three areas of SQA Advanced Higher Statistics and the foundation for everything that follows. It teaches you to collect data well, to summarise and picture it, to reason about chance, and to model variables with the standard probability distributions. This guide maps the area and links to the full topic pages.

What the area covers

The area moves from planning a study, through describing data, to modelling it with probability.

  • Experimental design and data collection. Observational studies versus designed experiments, control, randomisation, replication and blocking, and the sources of bias that invalidate conclusions.
  • Exploratory data analysis. Measures of location and dispersion (mean, median, quartiles, IQR, variance and standard deviation), stem-and-leaf plots, boxplots, outliers and skewness.
  • Probability. The addition and multiplication laws, conditional probability, independence and mutual exclusivity, tree diagrams, the total probability rule and Bayes' theorem.
  • Discrete random variables. Expectation and variance, the laws of expectation and variance, and the binomial, Poisson and geometric models.
  • Continuous random variables and the normal distribution. Standardising to find probabilities, combining independent normals, and the normal approximation to the binomial and Poisson with a continuity correction.
  • Bivariate data and regression. Scatter plots, the sums of squares and products, the product-moment correlation coefficient, the least-squares regression line and residual plots.
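As a rough illustration of the continuity correction mentioned under continuous random variables, the sketch below compares an exact binomial probability with its normal approximation. The values n = 50, p = 0.4 and k = 22 are made up for illustration, and the helper names `binomial_cdf` and `normal_approx_cdf` are hypothetical, not taken from any SQA material.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p), summed from the probability function."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using Y ~ N(np, np(1 - p)) with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # The discrete event X <= k is replaced by the continuous event Y < k + 0.5.
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Illustrative values only (assumed, not from the course).
n, p, k = 50, 0.4, 22

exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)

print(f"exact  P(X <= {k}) = {exact:.4f}")
print(f"approx P(X <= {k}) = {approx:.4f}")
```

Changing `k + 0.5` to `k` in the approximation shows how much accuracy the continuity correction recovers for a sample of this size.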

How the topics connect

The topics are layered. Experimental design decides whether the data can answer the question at all. Exploratory data analysis then reveals the centre, spread and shape of the data, which tells you whether a normal model is reasonable. Probability is the engine underneath the random variables: the binomial, Poisson, geometric and normal distributions are all probability models, and expectation and variance describe them. The normal distribution links back to the discrete models through the normal approximation, and forward to the inference area through the distribution of the sample mean. Bivariate regression is built on the sums of squares and products, which reappear when correlation and regression are tested for significance in the hypothesis-testing area. Treat the six topic pages as one connected toolkit.
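One way to see that expectation and variance describe the probability models is to compute them directly from a probability table and compare with the standard results E(X) = np, Var(X) = np(1 - p) for the binomial and E(X) = Var(X) = λ for the Poisson. The sketch below does this; the parameter values and the helper name `mean_and_variance` are assumptions chosen for illustration.

```python
from math import comb, exp, factorial


def mean_and_variance(pmf):
    """Return E(X) and Var(X) = E(X^2) - [E(X)]^2 from a dict {x: P(X = x)}."""
    mean = sum(x * px for x, px in pmf.items())
    mean_of_squares = sum(x * x * px for x, px in pmf.items())
    return mean, mean_of_squares - mean**2


# Binomial model X ~ B(n, p); illustrative parameters only.
n, p = 12, 0.3
binomial = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

# Poisson model X ~ Po(lam); illustrative parameter only.
# The Poisson takes infinitely many values, so the table is cut off at 59,
# beyond which the probabilities are negligible for lam = 2.5.
lam = 2.5
poisson = {x: exp(-lam) * lam**x / factorial(x) for x in range(60)}

b_mean, b_var = mean_and_variance(binomial)
p_mean, p_var = mean_and_variance(poisson)

print(f"binomial: mean {b_mean:.4f}, np = {n * p:.4f}")
print(f"binomial: variance {b_var:.4f}, np(1-p) = {n * p * (1 - p):.4f}")
print(f"Poisson:  mean {p_mean:.4f}, variance {p_var:.4f}, lambda = {lam}")
```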

How to study this area

  1. Get the vocabulary exact. Design and bias questions are marked on precise terms (control, randomisation, replication, blocking, confounding), so learn the definitions.
  2. Drill the standard-deviation and sums-of-squares calculations. They reappear constantly, so make $s^2=\dfrac{\sum (x-\bar{x})^2}{n-1}$ and the $S_{xx}$, $S_{xy}$ formulae automatic.
  3. Practise standardising both ways. Be fluent at $Z=\dfrac{X-\mu}{\sigma}$ to find a probability and at $X=\mu+z\sigma$ to find a value.
  4. Always sketch before naming. A quick sketch of a normal tail or a scatter plot, drawn before you name the tail, the skew or the direction of correlation, stops sign and skew errors.
  5. Use the data booklet. Know which formulae and tables are provided so you spend exam time on method, not recall.
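The calculations in steps 2 and 3 can be checked against a short script while you practise by hand. The sketch below is one possible way to do that, using a tiny made-up data set and an assumed normal model with mean 100 and standard deviation 15; none of these numbers come from the course, and the function name `sums_of_squares` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) computed about the sample means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Made-up bivariate data for practice only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sxx, syy, sxy = sums_of_squares(xs, ys)
s_squared = sxx / (len(xs) - 1)        # sample variance of x, divisor n - 1
r = sxy / sqrt(sxx * syy)              # product-moment correlation coefficient
b = sxy / sxx                          # gradient of the least-squares line y = a + bx
a = sum(ys) / len(ys) - b * sum(xs) / len(xs)   # intercept

print(f"Sxx = {sxx}, Syy = {syy}, Sxy = {sxy}")
print(f"s^2 = {s_squared}, r = {r:.4f}, line: y = {a:.2f} + {b:.2f}x")

# Standardising both ways for an assumed model X ~ N(100, 15^2).
mu, sigma = 100, 15
standard_normal = NormalDist()          # Z ~ N(0, 1)

z = (120 - mu) / sigma                  # value -> z
prob_below_120 = standard_normal.cdf(z)            # P(X < 120)

z_95 = standard_normal.inv_cdf(0.95)    # z with 95% of the area below it
x_95 = mu + z_95 * sigma                # z -> value

print(f"P(X < 120) = {prob_below_120:.4f}")
print(f"95th percentile of X = {x_95:.2f}")
```

Working the same numbers by hand with the data booklet tables first, then running the script, makes it easy to spot a slipped sign or a wrong divisor.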

Where to go next

Work through the six topic pages from this area, then test yourself with the area quiz. After that, move on to the Statistical Inference area, which uses these distributions to estimate population values from samples.
