Standardization
Standardization is the process of training and calibrating examiners/raters so they apply a rating scale consistently. Without it, the same performance can receive different scores from different raters, undermining inter-rater reliability and, by extension, test validity.
The Process
A typical standardization session follows this sequence:
- Familiarization: raters study the rating scale descriptors and assessment criteria
- Benchmarking: raters score sample performances that have been pre-rated by senior examiners; discrepancies are discussed
- Practice rating: raters score additional samples independently, then compare and reconcile
- Certification: raters must achieve acceptable agreement levels (often measured by exact or adjacent agreement rates, or correlation coefficients) before they are approved to rate live assessments
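The agreement measures mentioned in the certification step can be sketched in a few lines of Python. This is a minimal illustration, not any testing body's actual procedure: the score lists, the function names, and the pass thresholds are all hypothetical, and "adjacent agreement" is taken here to mean scores within one band of the benchmark (exact matches included), which is one of several definitions in use.

```python
from math import sqrt
from statistics import mean


def agreement_rates(trainee, benchmark):
    """Return (exact, within_one) agreement rates between two score lists."""
    if len(trainee) != len(benchmark) or not trainee:
        raise ValueError("score lists must be non-empty and of equal length")
    diffs = [abs(t - b) for t, b in zip(trainee, benchmark)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within_one = sum(d <= 1 for d in diffs) / len(diffs)
    return exact, within_one


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: benchmark scores from senior examiners and one
# trainee's independent scores for the same eight sample performances.
benchmark = [6, 5, 7, 4, 6, 8, 5, 6]
trainee = [6, 6, 7, 4, 5, 6, 5, 6]

exact, within_one = agreement_rates(trainee, benchmark)
r = pearson_r(trainee, benchmark)

# Hypothetical thresholds; real programmes set their own criteria.
MIN_EXACT = 0.60
MIN_WITHIN_ONE = 0.90
certified = exact >= MIN_EXACT and within_one >= MIN_WITHIN_ONE

print(f"exact={exact:.3f} within_one={within_one:.3f} r={r:.3f} certified={certified}")
```

Reporting agreement rates alongside a correlation is a deliberate choice: a rater who is consistently one band too harsh can correlate perfectly with the benchmark while never agreeing exactly, so correlation alone can hide severity.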
Key Concepts
- Rater severity/leniency: some raters are systematically harsher or more generous than others; standardization aims to narrow this spread
- Rater drift: even a trained rater's severity and interpretation of the scale can shift over time, so periodic re-standardization is needed (IELTS examiners are re-certified regularly)
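- Many-facet Rasch measurement (MFRM; Linacre 1989), also written "multi-faceted Rasch measurement": a statistical approach that models rater severity as a measurable facet alongside candidate ability and task difficulty, so that scores can be adjusted post hoc for the severity of the rater who awarded them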
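For reference, a commonly cited rating-scale form of the many-facet Rasch model is given below. The symbols are notation assumed for this sketch and do not come from the text above.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here P_{nijk} is the probability that candidate n receives score category k from rater j on task i, B_n is the candidate's ability, D_i is the task's difficulty, C_j is the rater's severity, and F_k is the difficulty of moving from category k-1 to category k. Because C_j enters with a negative sign, a harsher rater lowers the log-odds of the higher category for every candidate, and estimating C_j separately is what makes a severity-adjusted candidate measure possible.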
IELTS as a Case Study
IELTS employs one of the most rigorous standardization systems in language testing. Examiners undergo initial certification training, are monitored through recorded assessments, and must pass periodic re-certification. Double marking is used for Writing. This infrastructure is what allows scores from different test centers worldwide to be meaningfully compared: the scale means the same thing regardless of who rates the performance.
Practical Implication
Any institution using subjective assessment (speaking, writing) needs some form of standardization. Even informal moderation meetings, where teachers score the same student work and discuss differences, substantially improve scoring consistency.
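As a rough sketch of how an informal moderation meeting might choose what to discuss, the snippet below flags pieces of student work on which teachers' scores diverge by more than one band. The teacher labels, the scores, the function name, and the one-band tolerance are all hypothetical.

```python
def flag_for_discussion(scores_by_teacher, max_spread=1):
    """Return (script_index, spread) for scripts whose scores diverge too much.

    scores_by_teacher maps a teacher label to that teacher's scores,
    listed in the same script order for every teacher.
    """
    score_lists = list(scores_by_teacher.values())
    if not score_lists or len({len(s) for s in score_lists}) != 1:
        raise ValueError("every teacher must score the same set of scripts")

    flagged = []
    # zip(*...) regroups the data so each tuple holds all scores for one script.
    for index, script_scores in enumerate(zip(*score_lists)):
        spread = max(script_scores) - min(script_scores)
        if spread > max_spread:
            flagged.append((index, spread))
    return flagged


# Hypothetical scores from three teachers on the same four scripts.
scores = {
    "teacher_a": [6, 5, 7, 4],
    "teacher_b": [6, 7, 7, 5],
    "teacher_c": [5, 4, 6, 5],
}

flagged = flag_for_discussion(scores)
print(flagged)
```

Scripts that are not flagged can be skipped, leaving meeting time for the cases where teachers are reading the scale differently.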