Criterion-referenced Testing
Criterion-referenced tests (CRTs) measure performance against fixed, predetermined criteria — specific descriptions of what a learner can or cannot do. The question is not "How does this learner compare to others?" but "Can this learner do X?"
How It Works
Performance is judged against a standard, not against other test-takers. Every learner who meets the criteria passes; every learner who does not, fails. In theory, 100% of test-takers could pass (or fail).
Example: A criterion-referenced writing test might specify: "The learner can write a formal email of complaint that includes a clear statement of the problem, supporting details, and a request for action, using appropriate register."
A learner either demonstrates this ability or does not. Their score is meaningful on its own — it does not need comparison to a peer group.
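The judgment logic above can be sketched in a few lines. This is a minimal illustration, not a real scoring system: the rubric items and the `meets_all` helper are hypothetical, based on the email-of-complaint example.

```python
# Hypothetical rubric for the email-of-complaint task described above.
CRITERIA = [
    "clear statement of the problem",
    "supporting details",
    "request for action",
    "appropriate register",
]

def meets_all(demonstrated: set) -> bool:
    """Judge pass/fail against the fixed criteria alone --
    no comparison to other test-takers is involved."""
    return all(c in demonstrated for c in CRITERIA)

# Every learner who meets the criteria passes; in principle, all could.
cohort = {
    "learner_a": {"clear statement of the problem", "supporting details",
                  "request for action", "appropriate register"},
    "learner_b": {"clear statement of the problem", "supporting details"},
}
results = {name: meets_all(skills) for name, skills in cohort.items()}
```

Note that each result is interpretable on its own: `learner_a` passes regardless of how anyone else in the cohort performs.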
CRT vs NRT
| Feature | Criterion-referenced | Norm-referenced |
|---|---|---|
| Reference point | Fixed criteria/standards | Other test-takers |
| Score interpretation | "Can do X" / "Cannot do X" | "Better/worse than Y% of peers" |
| Score distribution | Not predetermined | Designed to spread (bell curve) |
| Purpose | Certification, mastery, achievement | Ranking, selection, placement |
| Item design | Items match learning objectives | Items designed to discriminate |
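The difference in reference point can be made concrete with the same raw scores interpreted both ways. The scores, the cut score, and the percentile formula below are all invented for illustration (percentile conventions vary; this one counts the share of peers scoring strictly lower).

```python
# Hypothetical raw scores and an assumed mastery cut score.
scores = {"ana": 78, "ben": 62, "cai": 91, "dia": 78}
CUT = 70

# Criterion-referenced reading: each score stands on its own.
crt = {name: ("pass" if s >= CUT else "fail") for name, s in scores.items()}

# Norm-referenced reading: each score is a rank among peers.
def percentile(name: str) -> int:
    others = [s for n, s in scores.items() if n != name]
    below = sum(1 for s in others if s < scores[name])
    return round(100 * below / len(others))

nrt = {name: percentile(name) for name in scores}
```

Under the CRT reading, three of four learners pass and that result would not change if the cohort changed; under the NRT reading, every learner's number depends entirely on who else took the test.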
Where It Is Used
- Classroom achievement tests — Most teacher-made tests are (or should be) criterion-referenced: "Can learners use the past simple to narrate events?"
- CEFR-aligned assessments — The CEFR descriptors ("Can understand the main point of clear standard input on familiar matters") are criteria. Assessments aligned to CEFR levels are criterion-referenced.
- Competency-based programs — Pass/fail decisions based on demonstrated ability.
- Can-do checklists — Self-assessment and teacher assessment against specific skill descriptors.
Advantages
- Scores are directly interpretable — you know what the learner can actually do
- Aligns assessment with learning objectives (strong content validity)
- Supports Formative Assessment — criteria make it clear what to work on next
- Does not require a comparison group
Challenges
- Setting cut scores. Where is the line between pass and fail? This is a judgment call, and different judges may disagree.
- Defining criteria precisely. Vague criteria ("writes adequately") lead to unreliable assessment. Criteria must be specific and observable.
- Limited discrimination at the top. CRTs do not differentiate well among learners who all exceed the criteria. If the purpose is ranking or selection, Norm-referenced Testing is more appropriate.
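One widely used way to turn disagreeing judges into a defensible cut score is the Angoff method: each judge estimates, per item, the probability that a minimally competent learner would answer correctly, and the cut score is the sum of the per-item mean estimates. The sketch below uses invented numbers purely to show the arithmetic.

```python
# Angoff-style standard setting (minimal sketch, invented estimates).
# Rows: judges; columns: items 1-4. Each value is a judge's estimate of
# the probability that a minimally competent learner gets the item right.
judge_estimates = [
    [0.8, 0.6, 0.9, 0.5],  # judge 1
    [0.7, 0.5, 0.8, 0.6],  # judge 2
    [0.9, 0.7, 0.7, 0.4],  # judge 3
]

n_items = len(judge_estimates[0])
item_means = [
    sum(judge[i] for judge in judge_estimates) / len(judge_estimates)
    for i in range(n_items)
]
# Cut score on a 4-item test: expected score of the borderline learner.
cut_score = sum(item_means)
```

Averaging over judges does not remove the judgment call, but it makes the disagreement visible and the resulting cut score reproducible.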