Pre-test Post-test Design

research-methodology

A research design in which participants are measured on the dependent variable before (pre-test) and after (post-test) a treatment or intervention. The difference between pre-test and post-test scores provides evidence of change. This is the default measurement structure in SLA intervention research, underpinning both true experimental and quasi-experimental studies.

Variations

One-group pre-test post-test

A single group is tested before and after treatment, with no comparison group. This is the weakest design: any improvement could be due to maturation, testing effects, history, or regression to the mean rather than the treatment itself.

Two-group pre-test post-test

A treatment group and a control/comparison group are both pre- and post-tested. With random assignment this is a true experiment; with intact classes it is the standard quasi-experimental arrangement. Comparing group gains controls for maturation and history (assuming both groups experience them equally).
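The comparison of group gains can be sketched in a few lines of Python. This is only an illustration: the scores are invented, and mean_gain and the variable names are hypothetical rather than taken from any study or library.

```python
from statistics import mean

def mean_gain(pre, post):
    """Mean of per-participant gains (post-test minus pre-test)."""
    return mean([b - a for a, b in zip(pre, post)])

# Hypothetical scores, invented purely for illustration.
treatment_pre = [10, 12, 14, 16]
treatment_post = [15, 18, 19, 20]
control_pre = [11, 12, 13, 16]
control_post = [12, 14, 14, 18]

treatment_gain = mean_gain(treatment_pre, treatment_post)
control_gain = mean_gain(control_pre, control_post)

# The control gain estimates change that would have happened anyway
# (maturation, history, testing); the difference is what is left over.
gain_difference = treatment_gain - control_gain
print(treatment_gain, control_gain, gain_difference)
```

The difference in gains is only a descriptive estimate. Whether it is reliable still has to be checked with an inferential test.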

Solomon four-group design

Developed by Solomon (1949). Four groups:

Group	Pre-test	Treatment	Post-test
1	Yes	Yes	Yes
2	Yes	No	Yes
3	No	Yes	Yes
4	No	No	Yes

The design controls for pre-test sensitisation: the possibility that taking the pre-test itself changes how participants respond to the treatment or post-test. Comparing Groups 1 and 3 shows whether pre-testing changed post-test scores among treated participants. If the treatment effect in the pre-tested groups (1 vs 2) differs from the treatment effect in the non-pre-tested groups (3 vs 4), the pre-test interacted with the treatment. Powerful but resource-intensive: it needs four groups, roughly four times the sample of a one-group design, which is why it is rarely used in SLA research.
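One way to look at pre-test sensitisation is as an interaction contrast on the four post-test means: the treatment effect among pre-tested groups minus the treatment effect among non-pre-tested groups. The sketch below assumes invented scores and hypothetical variable names. In practice the contrast would be tested inferentially, for example as the interaction term of a 2 x 2 ANOVA on post-test scores.

```python
from statistics import mean

# Hypothetical post-test scores for the four Solomon groups (invented data).
post = {
    1: [18, 20, 22, 20],  # pre-tested, treated
    2: [12, 14, 13, 13],  # pre-tested, untreated
    3: [17, 19, 18, 18],  # not pre-tested, treated
    4: [12, 13, 14, 13],  # not pre-tested, untreated
}
group_mean = {g: mean(scores) for g, scores in post.items()}

# Treatment effect with and without a pre-test.
effect_pretested = group_mean[1] - group_mean[2]
effect_unpretested = group_mean[3] - group_mean[4]

# A contrast near zero suggests no sensitisation; a clearly non-zero one
# suggests the pre-test changed how participants responded to treatment.
interaction = effect_pretested - effect_unpretested
print(effect_pretested, effect_unpretested, interaction)
```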

Delayed Post-tests

A single post-test immediately after treatment shows whether learning occurred but not whether it endured. Delayed post-tests (administered days, weeks, or months later) assess durability. This matters especially for distinguishing genuine acquisition from short-term performance gains.

Norris & Ortega (2000) found that studies with delayed post-tests showed smaller effect sizes than those with immediate post-tests only, suggesting some reported gains reflect temporary rather than lasting learning.

Analysing Pre-Post Data

Gain scores

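Simple difference: post-test minus pre-test. Intuitive but statistically problematic: gain scores are often less reliable than the scores they are computed from, and they tend to correlate negatively with pre-test scores (students who start lower have more room to improve), so groups that differ at pre-test are not compared on equal terms.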
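The dependence of gains on starting level is easy to see in a small example. The data below are invented so that lower starters improve more, and pearson_r is a hypothetical helper written out by hand rather than a library function.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores in which lower starters improve more.
pre = [8, 10, 12, 14, 16, 18]
post = [15, 15, 17, 17, 19, 19]

gains = [b - a for a, b in zip(pre, post)]
r_gain_pre = pearson_r(pre, gains)

# A strongly negative r means gain is largely predictable from pre-test.
print(gains, round(r_gain_pre, 2))
```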

ANCOVA (Analysis of Covariance)

Uses pre-test scores as a covariate to statistically adjust for initial group differences. Preferred over gain scores because it controls for regression to the mean and handles non-equivalent groups more effectively. The standard recommendation for quasi-experimental classroom studies.
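The adjustment at the heart of ANCOVA can be sketched without a statistics package. Each group's post-test mean is shifted according to how far its pre-test mean lies from the grand pre-test mean, using the pooled within-group regression slope. The sketch below covers only that step. It assumes equal slopes across groups, runs no significance test, and uses invented data and a hypothetical function name (adjusted_means).

```python
from statistics import mean

def adjusted_means(groups):
    """groups maps a label to (pre_scores, post_scores).

    Returns the pooled within-group slope of post on pre and a dict of
    covariate-adjusted post-test means, one per group.
    """
    all_pre = [x for pre, _ in groups.values() for x in pre]
    grand_pre = mean(all_pre)

    # Pool cross-products and squared deviations within each group.
    sxy = sxx = 0.0
    for pre, post in groups.values():
        m_pre, m_post = mean(pre), mean(post)
        sxy += sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
        sxx += sum((x - m_pre) ** 2 for x in pre)
    slope = sxy / sxx

    adjusted = {
        label: mean(post) - slope * (mean(pre) - grand_pre)
        for label, (pre, post) in groups.items()
    }
    return slope, adjusted

# Hypothetical data: the control group starts two points higher on average.
groups = {
    "treatment": ([10, 12, 14, 16], [16, 17, 18, 19]),
    "control": ([12, 14, 16, 18], [13, 14, 15, 16]),
}
slope, adjusted = adjusted_means(groups)
adjusted_difference = adjusted["treatment"] - adjusted["control"]
print(slope, adjusted, adjusted_difference)
```

In this invented data set the raw post-test means differ by 3 points and the mean gains by 5, while the adjusted difference falls between the two. This shows the three approaches can give different answers when groups are not equivalent at pre-test.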

Repeated-measures ANOVA

Treats time (pre, post, delayed post) as a within-subjects factor. Appropriate when the same participants provide data at multiple time points.
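For a single group measured at several time points, the F ratio for the time factor can be computed by partitioning the total variability into time, participant, and residual parts. The following is a minimal sketch with invented scores and a hypothetical function name (rm_anova_f). It does not compute a p-value or check sphericity, both of which a full analysis would need.

```python
from statistics import mean

def rm_anova_f(scores):
    """One-way repeated-measures ANOVA on a participants x time table.

    scores is a list of rows, one per participant, each holding one score
    per time point. Returns (F, df_time, df_error).
    """
    n = len(scores)      # participants
    k = len(scores[0])   # time points
    flat = [v for row in scores for v in row]
    grand = mean(flat)

    time_means = [mean([row[j] for row in scores]) for j in range(k)]
    subject_means = [mean(row) for row in scores]

    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((v - grand) ** 2 for v in flat)
    # Residual: what is left once time and participant differences are removed.
    ss_error = ss_total - ss_time - ss_subjects

    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    f_ratio = (ss_time / df_time) / (ss_error / df_error)
    return f_ratio, df_time, df_error

# Hypothetical scores: columns are pre-test, post-test, delayed post-test.
scores = [
    [10, 15, 14],
    [12, 16, 14],
    [14, 19, 18],
    [12, 18, 18],
]
f_ratio, df_time, df_error = rm_anova_f(scores)
print(f_ratio, df_time, df_error)
```

Removing the between-participant variability from the error term is what makes the within-subjects analysis more sensitive than treating the three time points as independent groups.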

Common Threats

Testing effect: exposure to the pre-test improves post-test performance regardless of treatment
Regression to the mean: extreme pre-test scores naturally move toward the average on re-testing
Attrition: participants who drop out may differ systematically from those who remain
Practice-test congruency: treatment tasks resembling the test inflate apparent gains
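Regression to the mean can be demonstrated with a small simulation in which nothing is taught at all. Each simulated participant has a stable ability, and each test adds independent measurement noise. All parameter values here (sample size, means, standard deviations, the 10% cut-off) are arbitrary assumptions chosen for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Stable ability plus independent noise on each testing occasion.
true_ability = [random.gauss(50, 10) for _ in range(n)]
pre = [t + random.gauss(0, 10) for t in true_ability]
post = [t + random.gauss(0, 10) for t in true_ability]  # no treatment given

# Select the lowest-scoring tenth on the pre-test, as a study might when
# recruiting "weak" learners for an intervention.
cutoff = sorted(pre)[n // 10]
low = [i for i in range(n) if pre[i] <= cutoff]

low_pre_mean = mean([pre[i] for i in low])
low_post_mean = mean([post[i] for i in low])

# The selected group scores higher on re-test despite receiving nothing.
print(round(low_pre_mean, 1), round(low_post_mean, 1))
```

In a one-group design this apparent gain would be indistinguishable from a treatment effect, which is why selecting participants on extreme pre-test scores calls for a comparison group.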

Key References

Campbell & Stanley (1963): original design taxonomy
Solomon (1949): the four-group design
Norris & Ortega (2000): meta-analysis highlighting the role of delayed post-tests
Plonsky & Oswald (2014): field-specific effect size benchmarks for SLA