Moderation
Moderation is the quality assurance process of checking, adjusting, and calibrating marking across different raters, sites, or institutions to ensure fairness and consistency. Where rater training prepares examiners before assessment, moderation monitors and corrects marking while it is in progress or after it is complete.
The fundamental question moderation answers: Would this candidate receive the same score regardless of who marked their work?
Types of Moderation
Statistical Moderation
Scores from different raters or groups are adjusted statistically to compensate for systematic differences:
- A rater whose mean score is 0.5 bands below the overall mean is identified as severe, and that rater's scores are adjusted upward
- A school whose internal assessment results are consistently higher than external exam results has its scores scaled down
Statistical moderation preserves the rank order of candidates within each group but adjusts levels to a common standard. It requires sufficient data: with small samples, a group's mean and spread are too unstable to support a reliable adjustment.
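The two adjustments described above can be sketched as follows. This is a minimal illustration, not any exam board's actual procedure: the function names and all scores are hypothetical, and it assumes each group of candidates is of comparable ability (otherwise a low mean may reflect weaker candidates rather than a severe rater).

```python
from statistics import mean, stdev

def mean_shift(scores, target_mean):
    """Shift every score by the same offset so the group mean hits the target."""
    offset = target_mean - mean(scores)
    return [s + offset for s in scores]

def linear_scale(scores, target_mean, target_sd):
    """Rescale scores to a target mean and standard deviation."""
    m, sd = mean(scores), stdev(scores)
    if sd == 0:
        # All scores identical: nothing to spread, so map everyone to the target mean.
        return [float(target_mean) for _ in scores]
    return [target_mean + (s - m) * target_sd / sd for s in scores]

# Hypothetical severe rater: mean is 0.5 bands below the overall mean of 6.25.
severe_rater = [5.0, 5.5, 6.0, 6.5]
adjusted = mean_shift(severe_rater, target_mean=6.25)
print(adjusted)  # [5.5, 6.0, 6.5, 7.0]

# Hypothetical school whose internal marks run higher and wider than its
# external exam results (assumed external mean 65, standard deviation 5).
internal = [70, 80, 90]
scaled = linear_scale(internal, target_mean=65, target_sd=5)
print(scaled)  # [60.0, 65.0, 70.0]
```

Both functions are increasing linear transformations, which is why the rank order within each group is unchanged. With only a handful of scores per group, as in this toy data, the estimated mean and standard deviation would be too noisy to justify a real adjustment.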
Social Moderation (Consensus Moderation)
Raters meet to review and discuss sample scripts/performances together. They:
- Score samples independently
- Compare scores and discuss discrepancies
- Reach consensus on appropriate scores
- Agree on standards for borderline cases
This is the most common form in institutional contexts and closely resembles standardisation meetings used in rater training.
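The comparison step can be supported by a small script that picks out which sample scripts most need discussion. The sketch below assumes band scores and a tolerance of 0.5 bands; the tolerance, the function name, and the data are all illustrative assumptions.

```python
def flag_discrepancies(independent_scores, tolerance=0.5):
    """Return {script_id: spread} for scripts whose scores differ by more than tolerance."""
    flagged = {}
    for script_id, by_rater in independent_scores.items():
        spread = max(by_rater.values()) - min(by_rater.values())
        if spread > tolerance:
            flagged[script_id] = spread
    return flagged

# Hypothetical independent scores collected before the meeting.
samples = {
    "script_A": {"rater_1": 6.0, "rater_2": 6.5, "rater_3": 6.0},
    "script_B": {"rater_1": 5.0, "rater_2": 6.5, "rater_3": 5.5},
}
to_discuss = flag_discrepancies(samples)
print(to_discuss)  # {'script_B': 1.5}
```

A script only identifies where raters disagree. Agreeing on the appropriate score still depends on discussion grounded in the rating scale.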
Expert Moderation
A senior examiner or team leader reviews a sample of marked work to verify that the rater has applied the rating scale correctly. If systematic issues are found (e.g., a rater consistently ignoring one criterion), the rater receives feedback and may have their marks adjusted.
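One way a reviewer might look for a systematic issue is to compare the rater's marks with their own marks on the sampled scripts, criterion by criterion. The sketch below is an assumption about how this could be done, not a documented procedure; the criterion names, the 0.5 threshold, and the marks are hypothetical.

```python
from statistics import mean

def criterion_bias(rater_marks, moderator_marks):
    """Mean signed difference (rater minus moderator) for each criterion."""
    criteria = rater_marks[0].keys()
    return {
        c: mean(r[c] - m[c] for r, m in zip(rater_marks, moderator_marks))
        for c in criteria
    }

# Hypothetical marks on three sampled scripts, one dict per script.
rater = [
    {"task": 6, "coherence": 7, "lexis": 6, "grammar": 5},
    {"task": 5, "coherence": 6, "lexis": 5, "grammar": 5},
    {"task": 7, "coherence": 8, "lexis": 7, "grammar": 6},
]
moderator = [
    {"task": 6, "coherence": 6, "lexis": 6, "grammar": 5},
    {"task": 5, "coherence": 5, "lexis": 5, "grammar": 5},
    {"task": 7, "coherence": 7, "lexis": 7, "grammar": 6},
]

bias = criterion_bias(rater, moderator)
flagged = {c for c, b in bias.items() if abs(b) >= 0.5}
print(flagged)  # {'coherence'}
```

A consistent offset on a single criterion, as here, points to a problem with how that criterion is being applied rather than to general severity or leniency, which would show up across all criteria.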
Moderation in Practice
| Context | Moderation approach |
|---|---|
| IELTS Writing | Double-marking for a proportion of scripts; statistical monitoring of each examiner's patterns |
| Cambridge exams | Team leader reviews sample of each examiner's marking; statistical post-hoc analysis |
| University departmental essays | Social moderation meetings; blind second-marking of borderline scripts |
| Language school end-of-course tests | Team discussion of anchor scripts; spot-checking across markers |
Key Principles
- Blind marking first — Raters should score independently before seeing others' judgments; discussion comes after
- Use anchor scripts — Pre-agreed exemplar performances at each level provide a shared reference point
- Focus on the criteria — Moderation discussions should be grounded in band descriptors and rubric language, not personal preferences
- Act on findings — Moderation that identifies problems but does not lead to score adjustment or further training is performative
- Document decisions — Record the rationale for any score changes for transparency and accountability
Moderation vs Standardisation
The terms are sometimes used interchangeably, but a useful distinction is:
- Standardisation = establishing shared standards before marking (proactive)
- Moderation = checking that those standards were applied during and after marking (reactive)
Both are needed. Standardisation without moderation assumes training was perfectly effective. Moderation without standardisation has nothing to moderate against.
Why It Matters
Without moderation:
- Rater severity/leniency differences directly affect candidates' scores and life outcomes
- Raters who drift from standards over time are not identified
- Institutional certificates and grades lose credibility
- Inter-rater reliability is assumed rather than demonstrated
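Demonstrating inter-rater reliability can start with something as simple as the sketch below, which computes exact agreement and Cohen's kappa for two raters who marked the same scripts. The data and function names are hypothetical. Note that unweighted kappa treats bands as unordered categories, so a one-band disagreement counts the same as a three-band one; a weighted variant is often preferred for ordinal band scores.

```python
from collections import Counter

def exact_agreement(a, b):
    """Proportion of scripts on which both raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    p_observed = exact_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same band independently.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used one identical band throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical band scores from two raters on the same eight scripts.
rater_1 = [5, 6, 6, 7, 5, 6, 7, 7]
rater_2 = [5, 6, 7, 7, 5, 6, 6, 7]

agreement = exact_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(agreement)        # 0.75
print(round(kappa, 2))  # 0.62
```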
Key References
- Weigle, S. C. (2002). Assessing Writing. Cambridge University Press.
- Lumley, T. (2005). Assessing Second Language Writing: The Rater's Perspective. Peter Lang.
- Wyatt-Smith, C., Klenowski, V. & Gunn, S. (2010). The centrality of teachers' judgement practice in assessment. Assessment in Education, 17(1), 59–75.
See Also
- Inter-rater Reliability — moderation aims to maintain and improve it
- Rater Training — the proactive complement to moderation
- Standardization — establishing the standards that moderation checks against
- Band Descriptors — the reference point for moderation discussions