Lexical Coverage
Lexical coverage is the percentage of running word tokens in a text that a reader or listener already knows. If a learner knows 9,650 of the 10,000 running words in a chapter, the chapter offers 96.5% coverage for that learner. Coverage is a property of a text-learner pairing, not a property of the text alone.
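The definition above reduces to a token-level ratio, which can be sketched in a few lines. This is a minimal illustration rather than a standard tool: the function names `tokenize` and `lexical_coverage`, the regular-expression tokenizer, and the toy known-word set are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Split text into lowercase running word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_coverage(tokens, known_words):
    """Return the percentage of running tokens that appear in known_words."""
    if not tokens:
        raise ValueError("text contains no tokens")
    known = sum(1 for tok in tokens if tok in known_words)
    return 100.0 * known / len(tokens)


# Hypothetical learner: knows six word forms, has not met "heron" or "watched".
text = "The cat sat on the mat while the heron watched the cat"
known_words = {"the", "cat", "sat", "on", "mat", "while"}

tokens = tokenize(text)
coverage = lexical_coverage(tokens, known_words)
print(f"{len(tokens)} tokens, coverage {coverage:.1f}%")
```

Because the count runs over tokens rather than types, a repeated known word such as "the" raises coverage each time it occurs. The same text scored against a different learner's word set yields a different figure, which is the sense in which coverage belongs to the text-learner pairing.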
The construct sits at the centre of materials development for graded input. Once a target coverage figure is set (say 98% for unassisted reading), designers can work backwards to ask which frequency band a text must stay within for a given learner population, and which word families need pre-teaching, glossing, or replacement.
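Working backwards from a target can be sketched as a cumulative count over frequency bands. The sketch below assumes the designer already has a frequency-ranked word list split into bands (most frequent first). The function names, the one-word toy bands, and the token counts are hypothetical and stand in for real 1,000-item bands and a real text.

```python
from collections import Counter


def cumulative_coverage(tokens, bands):
    """Return the cumulative coverage (%) reached after each frequency band."""
    if not tokens:
        raise ValueError("text contains no tokens")
    freq = Counter(tokens)
    seen, covered, result = set(), 0, []
    for band in bands:
        new_words = band - seen              # ignore words listed in an earlier band
        covered += sum(freq[w] for w in new_words)
        seen |= new_words
        result.append(100.0 * covered / len(tokens))
    return result


def bands_needed(tokens, bands, target):
    """Return the smallest number of bands reaching target coverage, or None."""
    for n, cov in enumerate(cumulative_coverage(tokens, bands), start=1):
        if cov >= target:
            return n
    return None


def outside_bands(tokens, bands, n_bands):
    """Count tokens not covered by the first n_bands bands."""
    covered = set().union(*bands[:n_bands])
    return Counter(tok for tok in tokens if tok not in covered)


# Toy data: 100 running tokens and three one-word "bands".
tokens = ["the"] * 90 + ["river"] * 6 + ["heron"] * 3 + ["zygote"] * 1
bands = [{"the"}, {"river"}, {"heron"}]

print(cumulative_coverage(tokens, bands))
print(bands_needed(tokens, bands, 95.0))
print(outside_bands(tokens, bands, 2))
```

The words returned by `outside_bands` are the candidates for pre-teaching, glossing, or replacement. A result of `None` from `bands_needed` signals that the available lists cannot reach the target for this text, so some off-list words would have to be dealt with individually.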
Coverage vs comprehension
Coverage is not comprehension. Comprehension depends on coverage plus syntactic difficulty, background knowledge, propositional density, discourse organisation, and reader skill. Coverage sets a ceiling on the lexical contribution to comprehension, not a guarantee of it. The threshold literature treats coverage as a necessary condition that interacts with, but does not collapse into, overall comprehension.
Standard reference figures
Nation (2006), working with fourteen 1,000-word-family lists derived from the British National Corpus, calculated the vocabulary size needed to reach 95% and 98% coverage of written and spoken English. The headline figures, expressed in word families and assuming that proper nouns and marginal words are also known, are as follows:
| Input type | 95% coverage | 98% coverage |
|---|---|---|
| Written text (novels, newspapers) | 4,000–5,000 word families | 8,000–9,000 word families |
| Spoken text (movies, friends) | 3,000 word families | 6,000–7,000 word families |
| Children's stories | 4,000 word families | 7,000 word families |
Two conventions matter when reading these figures. First, the unit is the word family (a headword plus its inflections and transparent derivations), not the word type, so develop, develops, and development count as one item. Second, proper nouns and marginal words (interjections, letters, exclamations) are typically counted as known, on the assumption that learners can handle them from context without drawing on prior vocabulary knowledge.
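Both conventions change the arithmetic, as the following sketch shows. It assumes a lookup table from word forms to family headwords and a separate set of proper nouns and marginal words. The names `family_coverage`, `family_of`, and `assumed_known`, along with the miniature data, are illustrative assumptions and not part of any published word list.

```python
def family_coverage(tokens, family_of, known_families, assumed_known=frozenset()):
    """Coverage (%) counted by word family, with some tokens counted as known."""
    if not tokens:
        raise ValueError("text contains no tokens")
    known = 0
    for tok in tokens:
        if tok in assumed_known:
            known += 1                       # proper noun or marginal word
        elif family_of.get(tok, tok) in known_families:
            known += 1                       # any member of a known family
    return 100.0 * known / len(tokens)


# Hypothetical family table: each word form maps to its headword.
family_of = {
    "develop": "develop", "develops": "develop",
    "developed": "develop", "development": "develop",
    "plan": "plan", "plans": "plan",
    "the": "the",
}
known_families = {"develop", "plan", "the"}
tokens = ["london", "developed", "the", "plans", "oh",
          "the", "development", "stalled"]

with_convention = family_coverage(tokens, family_of, known_families,
                                  assumed_known={"london", "oh"})
without_convention = family_coverage(tokens, family_of, known_families)
print(with_convention, without_convention)
```

In this toy text, treating "london" and "oh" as known moves coverage from 62.5% to 87.5%. The gap is exaggerated by the tiny sample, but it shows why figures computed under different conventions should not be compared directly.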
Where the figure goes
Coverage is the workhorse measure behind graded reader grading, lexical profile tools, pedagogic corpus design, and frequency-driven syllabus planning. Whether the right target is 95%, 98%, or no single cut-off at all (with comprehension rising roughly linearly as coverage rises) is the territory of the Hu and Nation threshold, the Laufer threshold, and Schmitt et al.'s linear-coverage counter-position.
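The practical difference between candidate targets is easier to see as unknown-word density. The sketch below is plain arithmetic on the coverage percentage. The function name `unknown_density` and the 300-token page length are assumptions chosen for illustration.

```python
def unknown_density(coverage_pct, text_length):
    """Return (expected unknown tokens, running tokens per unknown token)."""
    if not 0 <= coverage_pct <= 100:
        raise ValueError("coverage must be between 0 and 100")
    unknown_pct = 100.0 - coverage_pct
    expected_unknown = text_length * unknown_pct / 100.0
    # At 100% coverage there are no unknown tokens, so spacing is unbounded.
    spacing = float("inf") if unknown_pct == 0 else 100.0 / unknown_pct
    return expected_unknown, spacing


# Compare two commonly discussed targets on a hypothetical 300-token page.
for target in (95.0, 98.0):
    unknown, spacing = unknown_density(target, text_length=300)
    print(f"{target:.0f}% coverage: about {unknown:.0f} unknown tokens, "
          f"one in every {spacing:.0f} running tokens")
```

Under these assumptions, moving from 95% to 98% takes a 300-token page from about 15 unknown tokens to about 6, or from one unknown token in every 20 to one in every 50.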
References
- Hu, M., & Nation, P. (2000). Unknown vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403–430.
- Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners' vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30.
- Nation, I. S. P. (2001). Learning Vocabulary in Another Language. Cambridge University Press.
- Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern Language Review, 63(1), 59–82. https://doi.org/10.3138/cmlr.63.1.59
- Nation, I. S. P. (2013). Learning Vocabulary in Another Language (2nd ed.). Cambridge University Press.
- Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and reading comprehension. The Modern Language Journal, 95(1), 26–43. https://doi.org/10.1111/j.1540-4781.2011.01146.x