Band Descriptors


Band descriptors are the written descriptions that define what performance looks like at each level of a rating scale. They are the operational heart of any subjective assessment — without clear descriptors, a number on a scale is meaningless.

A band descriptor answers the question: What distinguishes a Band 6 from a Band 7? It must do so with enough precision that different raters, reading the same descriptor and evaluating the same performance, arrive at the same (or very similar) score.

Characteristics of Effective Descriptors

Quality	Description
Observable	Describes what can be seen or heard in the performance, not inferred mental states
Distinguishing	Each band is clearly different from its neighbours — no overlapping or vague boundaries
Positive	Describes what the learner can do at each level, not just deficiencies (especially at lower levels)
Unambiguous	Uses concrete, specific language — avoids terms like "adequate" or "reasonable" without definition
Comprehensive	Covers the full range of the scale without gaps
Calibrated	Reflects genuine differences in proficiency, supported by empirical data from sample performances

IELTS Band Descriptors: A Case Study

The IELTS band descriptors for Writing and Speaking use a 0–9 scale with four criteria (analytic scoring):

Writing Task 2:

Task Response
Coherence and Cohesion
Lexical Resource
Grammatical Range and Accuracy

Each criterion has descriptors for Bands 0–9. The descriptors are publicly available, which enables transparency but also invites formulaic preparation (a washback concern).

Key features of the IELTS descriptors:

They use qualifying language carefully: "a range of" vs "a wide range of" vs "a sophisticated range of"
Lower bands tend to describe limitations; higher bands tend to describe capabilities
The boundary between Band 6 and Band 7 is where the descriptors shift from roughly "adequate" to "good" performance (the overall IELTS scale labels Band 6 a "competent user" and Band 7 a "good user"), a critical threshold for many candidates
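
The analytic structure described above (one band per criterion on a 0-9 scale) can be sketched as a small data structure. This is only an illustration: the `AnalyticScore` class, its method names, and the unweighted mean are assumptions made for the example, not the official IELTS procedure for combining or rounding criterion scores.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Mapping

# The four Writing Task 2 criteria named in the text.
CRITERIA = (
    "Task Response",
    "Coherence and Cohesion",
    "Lexical Resource",
    "Grammatical Range and Accuracy",
)


@dataclass(frozen=True)
class AnalyticScore:
    """One rater's band for each criterion (hypothetical helper type)."""

    bands: Mapping[str, int]

    def __post_init__(self):
        # Every criterion must be scored, and nothing else.
        if set(self.bands) != set(CRITERIA):
            raise ValueError("bands must cover exactly the four criteria")
        # Each band must sit on the 0-9 scale.
        for name, band in self.bands.items():
            if not 0 <= band <= 9:
                raise ValueError(f"{name}: band {band} is outside 0-9")

    def criterion_mean(self) -> float:
        """Unweighted mean of the four bands (an assumption, not IELTS policy)."""
        return mean(self.bands[c] for c in CRITERIA)

    def weakest(self) -> List[str]:
        """Criteria that received the lowest band, in rubric order."""
        low = min(self.bands.values())
        return [c for c in CRITERIA if self.bands[c] == low]


score = AnalyticScore({
    "Task Response": 7,
    "Coherence and Cohesion": 6,
    "Lexical Resource": 6,
    "Grammatical Range and Accuracy": 7,
})
print(score.criterion_mean())  # 6.5
print(score.weakest())         # ['Coherence and Cohesion', 'Lexical Resource']
```

Keeping the criteria separate, rather than storing a single number, is what lets an analytic scale report a profile such as "weakest on cohesion and vocabulary" instead of only an overall band.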

Drafting and Revising Band Descriptors

When writing or revising band descriptors:

Start with anchor performances — collect samples at each level and describe what you see
Use consistent grammatical structure across bands for the same criterion
Scale features incrementally — quantity (few → some → many), quality (simple → varied → sophisticated), control (frequent errors → occasional errors → rare errors)
Avoid double-barrelled descriptors — "uses a range of vocabulary with few errors" conflates two features; separate them
Pilot with raters — have raters apply the descriptors to sample performances and identify where they disagree; revise accordingly
Include benchmark samples — descriptors alone are insufficient; raters need exemplar performances at each level
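
The piloting step above can be sketched in a few lines: collect each rater's band for each sample, measure how often pairs of raters agree, and flag the samples where they diverge most. The function names, the `pilot` data, and the one-band spread threshold are all hypothetical choices made for this illustration.

```python
from itertools import combinations


def pairwise_agreement(ratings, tolerance=0):
    """Share of rater pairs, over all samples, whose bands differ by <= tolerance.

    tolerance=0 gives exact agreement; tolerance=1 gives adjacent agreement.
    """
    agree = 0
    total = 0
    for bands in ratings.values():
        for a, b in combinations(bands, 2):
            total += 1
            if abs(a - b) <= tolerance:
                agree += 1
    return agree / total if total else float("nan")


def flag_disagreements(ratings, max_spread=1):
    """Sample ids whose highest and lowest bands differ by more than max_spread."""
    return sorted(
        sample
        for sample, bands in ratings.items()
        if max(bands) - min(bands) > max_spread
    )


# Made-up pilot data: sample id -> bands awarded by three raters.
pilot = {
    "essay_01": [6, 6, 7],
    "essay_02": [5, 7, 6],
    "essay_03": [7, 7, 7],
    "essay_04": [4, 6, 6],
}

print(pairwise_agreement(pilot, tolerance=1))  # 0.75
print(flag_disagreements(pilot))               # ['essay_02', 'essay_04']
```

The flagged samples are the ones worth discussing with raters: asking which descriptor wording led each rater to a different band usually points to the phrase that needs revising.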

Relationship to Other Assessment Concepts

Rating Scale — the numerical scale; band descriptors give meaning to each point on it
Rubric — the complete scoring instrument, which includes the scale, descriptors, and instructions
Can-Do Statements — similar to band descriptors but oriented toward self-assessment and curriculum planning (e.g., CEFR)
Analytic vs Holistic Scoring — analytic scales have separate descriptors for each criterion; holistic scales have a single combined descriptor per band
Inter-rater Reliability — good descriptors directly improve rater agreement
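
To make "rater agreement" concrete, one statistic often used for ordinal scales is quadratic weighted kappa, which penalises a two-band disagreement more heavily than a one-band disagreement and corrects for agreement expected by chance. The source does not prescribe a particular statistic; the sketch below is one possible choice, written from the standard definition, and the function name and sample ratings are illustrative.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_band=0, max_band=9):
    """Quadratic weighted kappa for two raters scoring the same samples.

    Returns 1.0 for perfect agreement, about 0 for chance-level agreement,
    and NaN when the statistic is undefined (both raters used one identical band).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of samples")

    n = len(rater_a)
    span = max_band - min_band

    # Observed disagreement: mean squared band difference, scaled to 0-1.
    observed = sum(((a - b) / span) ** 2 for a, b in zip(rater_a, rater_b)) / n

    # Expected disagreement if the raters were independent, from their marginals.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(
        count_a[i] * count_b[j] * ((i - j) / span) ** 2
        for i in count_a
        for j in count_b
    ) / (n * n)

    if expected == 0:
        return float("nan")
    return 1 - observed / expected


# Made-up ratings: the raters differ by one band on a single sample.
print(round(quadratic_weighted_kappa([6, 6, 7, 5], [6, 7, 7, 5]), 3))  # 0.8
```

Comparing this value before and after a descriptor revision, on the same sample performances, gives a rough check on whether the revision actually improved agreement.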

Common Problems

Vague qualifiers — "good," "adequate," "some" without anchoring
Negative-only lower bands — describing Band 3 only as what the learner cannot do
Overlapping bands — Band 5 and Band 6 descriptions that could apply to the same performance
Too many features per band — raters cannot hold 8 features in mind simultaneously
Descriptors written by committee without empirical validation — resulting in bands that do not correspond to actual performance differences
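
Some of the problems listed above can be caught mechanically before descriptors ever reach raters. The sketch below is a rough lint pass, not a validated tool: the word lists, the assumption that features are separated by semicolons, and the `max_features` threshold of 5 are all arbitrary choices for illustration. It cannot detect overlapping bands or missing empirical validation, which need rater data.

```python
import re

# Illustrative word lists (assumptions); adapt them to the rubric under review.
VAGUE_QUALIFIERS = ("good", "adequate", "reasonable", "some")
NEGATIVE_MARKERS = ("cannot", "unable", "fails", "lacks", "no", "not")


def _contains_word(text, word):
    """True if `word` occurs in `text` as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE) is not None


def split_features(descriptor):
    """Split a descriptor into feature clauses, assuming ';' separates them."""
    return [part.strip() for part in descriptor.split(";") if part.strip()]


def find_vague_qualifiers(descriptor):
    """Vague qualifiers present in the descriptor, in word-list order."""
    return [w for w in VAGUE_QUALIFIERS if _contains_word(descriptor, w)]


def is_negative_only(descriptor):
    """True if every feature clause contains a negative marker."""
    features = split_features(descriptor)
    return bool(features) and all(
        any(_contains_word(f, m) for m in NEGATIVE_MARKERS) for f in features
    )


def lint_descriptor(descriptor, max_features=5):
    """Return a list of warning strings for one band descriptor."""
    warnings = []
    vague = find_vague_qualifiers(descriptor)
    if vague:
        warnings.append("vague qualifiers: " + ", ".join(vague))
    if is_negative_only(descriptor):
        warnings.append("negative-only wording")
    count = len(split_features(descriptor))
    if count > max_features:
        warnings.append(f"too many features: {count} > {max_features}")
    return warnings


print(lint_descriptor("uses adequate vocabulary; makes some errors in word choice"))
print(lint_descriptor("cannot produce basic sentence forms; lacks control of word order"))
```

A flagged qualifier is a prompt for review rather than an error: "some" may be acceptable if the rubric anchors it elsewhere, for example with benchmark samples.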
