Graded Reader Construction

How publishers build graded readers: the craft decisions behind vocabulary control, grammar grading, readability measurement, and the treatment of unknown words.

The Core Vocabulary System

Headwords vs. Lemmas vs. Word Families

Headwords, lemmas, and word families are frequently conflated, but they count vocabulary differently, and all three differ from the raw token count.

Token: any running word in the text. "run", "runs", "running", "ran" = 4 tokens.

Lemma: the base form + its inflections only. "run" covers run/runs/running/ran = 1 lemma, 4 word forms.

Word family: lemma + all derived forms. "run" covers run/runs/ran/running/runner/runners/runnable = 1 family, many forms. The scale discussed in Nation (2016), which originates in Bauer & Nation (1993), grades family breadth on seven levels, from every form counted separately (Level 1) up to classical roots and affixes (Level 7). The AWL and much vocabulary research count families at Level 6.

Headword: publisher terminology for a counted unit. Oxford Bookworms and most British publishers use headwords defined approximately as word families (covering inflections + common derivations). So "run" at 400 headwords means a learner needs to recognise runner, running, and ran, but publishers vary in how strictly they apply this.

Key implication for graded reader authors: a 400-headword level does not mean 400 tokens. Each headword represents a family cluster. The author controls the number of headword families deployed, not the raw token count.
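
The difference between these counting units can be made concrete with a short sketch. The family table below is a hypothetical toy mapping invented for illustration, not any publisher's headword list, and the tokeniser is deliberately naive.

```python
import re

# Hypothetical toy mapping from word form to family headword.
# A real list would be far larger and publisher-specific.
FAMILY_OF = {
    "run": "run", "runs": "run", "ran": "run", "running": "run",
    "runner": "run", "runners": "run",
    "the": "the", "to": "to", "shop": "shop", "every": "every", "day": "day",
}

def tokenize(text):
    """Split text into lower-case running words (tokens)."""
    return re.findall(r"[a-z']+", text.lower())

def count_units(text):
    """Return (token count, distinct form count, distinct family count)."""
    tokens = tokenize(text)
    forms = set(tokens)
    # Forms missing from the table are counted as their own family.
    families = {FAMILY_OF.get(form, form) for form in forms}
    return len(tokens), len(forms), len(families)

sample = "The runner runs to the shop. The runners ran to the shop every day."
print(count_units(sample))  # (14, 9, 6)
```

The sample has 14 tokens and 9 distinct forms but draws on only 6 families, which is the figure a headword limit constrains.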

Headword Lists by Publisher

Publisher	Series	Levels	Headword Range	CEFR Approx
Oxford	Oxford Bookworms Library	Starter–Stage 6	250–2,500	A1–B2
Penguin/Pearson	Penguin Readers	Starter–Level 7	200–3,000	Pre-A1–C1
Cambridge	Cambridge English Readers	Starter–Level 6	250–3,800	A1–C1
Macmillan	Macmillan Readers	Starter–Upper	300–2,200	A1–B2
National Geographic	Footprint Reading Library	8 levels	800–3,000	A2–B2

Note: Claridge (2012) found significant inter-series inconsistency. A Bookworms text at 1,400 word families is labelled B1/B2; a Macmillan text at the same count is labelled A2/B1. These mismatches matter when building multi-series libraries.

The NGSL-GR: A Modern Alternative

The New General Service List, Graded Reader edition (NGSL-GR 1.0), was designed specifically for graded reader production. It has 11 bands in 400-word increments (bands 1–8) then 600-word increments (bands 9–11), providing finer-grained control than traditional publisher levels. A text at NGSL-GR Band 4 has 1,600 high-frequency words available. This list is corpus-derived from a 273-million-word corpus and is increasingly used by researchers for profiling reader vocabulary.
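
The band arithmetic can be checked with a few lines of code. This sketch assumes the bands are cumulative, as the Band 4 = 1,600 figure implies; the function name is illustrative and is not part of any NGSL-GR tooling.

```python
def ngsl_gr_cumulative(band):
    """Cumulative words available at a band, using the increments described above."""
    if not 1 <= band <= 11:
        raise ValueError("band must be between 1 and 11")
    lower = min(band, 8) * 400       # bands 1-8 add 400 words each
    upper = max(band - 8, 0) * 600   # bands 9-11 add 600 words each
    return lower + upper

for band in (1, 4, 8, 11):
    print(band, ngsl_gr_cumulative(band))
```

Under that assumption, Band 8 gives 3,200 words and Band 11 gives 5,000.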

Vocabulary Control Techniques

The 98% Coverage Threshold

Nation (2001) and Hu & Nation (2000) established the foundational finding: a reader needs to know approximately 98% of running words for independent, pleasurable comprehension. At 95% (1 unknown in 20), comprehension becomes effortful and acquisition drops sharply.

Implications:

A 400-headword text that introduces 50 new words must recycle those words heavily so that any given page maintains 98% known coverage
Coverage ≠ comprehension; a learner may know 98% of words but fail to comprehend due to syntactic complexity or background knowledge gaps
Coverage is calculated over running words (tokens), but whether a token counts as known is judged at the word-family level; knowing "run" is assumed to give coverage for "runner" at most proficiency levels
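
A page-by-page coverage check along these lines might look like the sketch below. It assumes the caller supplies `known` as a set of word forms the reader is taken to know, with family members already expanded; both that set and the page splitting are hypothetical inputs.

```python
import re

def tokenize(text):
    """Split text into lower-case running words."""
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, known):
    """Share of running words that the reader is assumed to know."""
    if not tokens:
        return 1.0
    return sum(1 for t in tokens if t in known) / len(tokens)

def low_coverage_pages(pages, known, threshold=0.98):
    """Return (page index, coverage) for pages that fall below the threshold."""
    flagged = []
    for index, page in enumerate(pages):
        score = coverage(tokenize(page), known)
        if score < threshold:
            flagged.append((index, round(score, 3)))
    return flagged

known = {"the", "cat", "sat", "on", "mat"}
pages = ["The cat sat on the mat.", "The cat sat on the veranda."]
print(low_coverage_pages(pages, known))  # [(1, 0.833)]
```

Checking each page rather than the whole book catches local clusters of new words that a book-level average would hide.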

Coverage thresholds for vocabulary size at 98%:

Text type	Vocabulary needed (word families)
Graded readers (controlled)	300–2,500 depending on level
General fiction	~8,000–9,000 (Nation, 2006)
Newspapers	~9,000–10,000
Academic text	~8,000 + AWL

Introducing, Recycling, and Glossing New Words

Introduction: New headwords in well-designed graded readers appear first in a high-support context. The meaning is recoverable from surrounding text (semantic transparency), illustration, or a glossary entry. Cold introduction (using a word without contextual scaffolding) is an authoring error.

Recycling: Nation's research suggests a word needs approximately 10–15 meaningful encounters for productive acquisition. For graded readers, this means a new headword introduced on page 1 should reappear naturally at least 8–12 more times by the end. Waring & Takaki (2003) confirmed that single-encounter learning from graded readers is modest; repeated encounters drive retention.
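
A recycling audit can be sketched as a simple frequency count. The function name and the default of 10 encounters (the lower end of the range above) are assumptions made for illustration, and the sketch matches exact forms only; a production check would fold inflected forms into their headword first.

```python
import re
from collections import Counter

def recycling_report(text, new_headwords, min_encounters=10):
    """Map each under-recycled new headword to its encounter count."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {
        word: counts[word]
        for word in new_headwords
        if counts[word] < min_encounters
    }

draft = "lantern " * 10 + "harbour " * 3
print(recycling_report(draft, ["lantern", "harbour"]))  # {'harbour': 3}
```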

Glossing: Three mechanisms:

Marginal/interlinear gloss: brief L1 or L2 definition in the margin. Research (Hulstijn, Hollander & Greidanus, 1996) shows glosses increase noticing and short-term retention but may reduce reading flow.
End-of-chapter glossary: lower interference with reading flow; learners must actively retrieve.
Running footnote: used by Oxford Bookworms for cultural and proper-noun items outside the headword list.

The Claridge (2012) study noted that Cambridge English Readers deliberately exclude glossaries and support materials, positioning their texts as adult leisure reading. Oxford and Penguin include notes and glossaries. Neither approach is demonstrably superior for acquisition; the choice reflects design philosophy.

Handling Proper Nouns, Names, and Cultural References

Proper nouns present a structural problem: character names, place names, brand names, and cultural references fall outside any headword list but are unavoidable in narrative. Publisher practice varies:

Not counted: Most publishers do not count proper nouns in the headword total. "London", "Maria", "Toyota" are treated as transparent additions.
Glossed: Culturally opaque references (e.g., "the National Health Service" in a British story) receive footnotes or brief in-text definition.
Adapted: In simplified versions of classics, culturally embedded references may be rewritten or modernised. Oxford Bookworms Guidelines require that cultural references be explained or replaced if they are likely to be opaque to international audiences.
Invented names: Original graded reader authors often choose phonologically simple, internationally recognisable names (Anna, Tom, Carlos) to reduce decoding load for diverse learners.
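
The "not counted" convention can be expressed as an off-list word check that exempts a supplied set of proper nouns. This is a minimal sketch: the headword forms and the proper-noun list are hypothetical inputs an editor would maintain by hand, and no automatic name detection is attempted.

```python
import re

def off_list_words(text, headword_forms, proper_nouns):
    """Return sorted off-list word forms, ignoring exempt proper nouns."""
    exempt = {name.lower() for name in proper_nouns}
    allowed = {form.lower() for form in headword_forms}
    found = set()
    for token in re.findall(r"[A-Za-z']+", text):
        word = token.lower()
        if word not in allowed and word not in exempt:
            found.add(word)
    return sorted(found)

text = "Maria drove her Toyota to London and saw a cathedral."
forms = {"drove", "her", "to", "and", "saw", "a"}
names = {"Maria", "Toyota", "London"}
print(off_list_words(text, forms, names))  # ['cathedral']
```

Words the check reports are candidates for glossing, rewriting, or deliberate introduction as new vocabulary.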

Grammar Grading

Grammar is graded in parallel with vocabulary. Each publisher uses a grammar syllabus that defines which structures are permissible at each level. The Oxford Bookworms graded grammar syllabus (publicly available) is the most widely cited model:

Oxford Bookworms Grammar Progression

Stage	CEFR	Headwords	Key Grammar
Starter	A1	250	Present simple, basic modals (can/can't), imperatives, simple coordination
Stage 1	A1–A2	400	Past simple, coordination (and/but/or), subordination (before, after, when, because, so)
Stage 2	A2–B1	700	Present perfect, will (future), have to/must/could, comparative adjectives, simple if-clauses, past continuous, tag questions, ask/tell + infinitive
Stage 3	B1	1,000	Should/may, present perfect continuous, used to, past perfect, causative, relative clauses
Stage 4	B1–B2	1,400	Future perfect/continuous, past modals (might have, should have), more complex passives, reported speech
Stage 5	B2	1,800	Full range of conditionals, complex noun phrases, embedded clauses
Stage 6	B2–C1	2,500	Near-native grammar range; archaic forms acceptable in classics

The key design principle is no structures above the level. An author writing a Stage 2 text cannot use the past perfect even once. This discipline is harder than vocabulary control because English grammar is recursive; sophisticated meaning often requires complex syntax.
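
The "no structures above the level" rule amounts to a lookup against the syllabus. The sketch below assumes the structures in a draft have already been labelled, by an editor or a parser; the labels and the partial mapping are illustrative, with stages taken from a few rows of the table above.

```python
STAGES = ["Starter", "Stage 1", "Stage 2", "Stage 3",
          "Stage 4", "Stage 5", "Stage 6"]

# Stage at which a few structures first appear, per the table above.
# Partial and illustrative; a real syllabus lists many more structures.
INTRODUCED_AT = {
    "present simple": "Starter",
    "present perfect": "Stage 2",
    "past continuous": "Stage 2",
    "past perfect": "Stage 3",
    "relative clause": "Stage 3",
    "reported speech": "Stage 4",
}

def violations(structures_used, target_stage):
    """Return the structures introduced later than the target stage."""
    limit = STAGES.index(target_stage)
    return sorted(
        s for s in structures_used
        if STAGES.index(INTRODUCED_AT[s]) > limit
    )

used = {"present simple", "present perfect", "past perfect"}
print(violations(used, "Stage 2"))  # ['past perfect']
```

The hard part in practice is the labelling step, which this sketch takes as given.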

Sentence Length Norms

Empirical studies of graded readers (Claridge, 2005; Grabowski, 2015) yield approximate sentence length norms:

Level	CEFR	Mean sentence length (words)	Max clause depth
Starter	A1	7–10	1 (simple/coordinate)
Stage 1–2	A1–A2	10–13	2 (one subordinate)
Stage 3–4	B1	13–17	2–3
Stage 5–6	B2	16–22	3–4
Authentic adult fiction	N/A	18–25	Unrestricted

Syntactic depth (number of clause embeddings per sentence) is as important as raw sentence length. A long sentence built from flat coordination ("She ran and fell and cried and looked up") is more accessible than a shorter one with heavy embedding ("The man she'd once trusted had gone").
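
Both measures can be approximated without a parser. The sketch below computes mean sentence length and uses a count of overt subordinating words as a crude stand-in for clause depth; the word list is a small illustrative sample, and the sentence splitter is naive.

```python
import re

# Small, illustrative set of subordinating words; not an exhaustive list.
SUBORDINATORS = {"because", "when", "before", "after", "that", "who",
                 "which", "although", "if", "while"}

def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sentence_profile(text):
    """Return (mean sentence length in words, max subordinators in one sentence)."""
    lengths, depths = [], []
    for sentence in split_sentences(text):
        words = re.findall(r"[a-z']+", sentence.lower())
        lengths.append(len(words))
        depths.append(sum(1 for w in words if w in SUBORDINATORS))
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return mean_length, max(depths, default=0)

text = ("She ran and fell and cried and looked up. "
        "He left because the man who called was late.")
print(sentence_profile(text))  # (9.0, 2)
```

The two sentences have the same length but different subordinator counts. A proxy of this kind misses embeddings with no overt marker, such as the reduced relative clause in "The man she'd once trusted had gone", so it supplements editorial judgement rather than replacing it.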

Readability Measures

Formula-Based Measures and Their Limits

Flesch Reading Ease (Flesch, 1948): Based on average sentence length (ASL) and average syllables per word (ASW). Score 0–100; higher = easier. Designed for native English readers. Not calibrated to CEFR. A Flesch score of 60–70 corresponds roughly to plain English for adult native readers; EFL texts typically target higher scores (70–80+) even at B2.

Flesch-Kincaid Grade Level: Uses the same two inputs (ASL and ASW), reweighted to give a US school grade. Grade 5 ≈ a 10- to 11-year-old. Again, it is calibrated for L1 readers and should be used cautiously for L2 material. A graded reader at 400 headwords will often score Grade 3–5 FK, but this does not mean it is appropriate for 8-year-olds.
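
Both Flesch measures are linear functions of ASL and ASW and can be computed directly. The coefficients below are the standard published ones; the syllable counter is a rough vowel-group heuristic assumed for illustration, so scores will differ slightly from tools that use a pronunciation dictionary.

```python
import re

def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def text_stats(text):
    """Return (average sentence length, average syllables per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    asl = len(words) / len(sentences)
    asw = sum(count_syllables(w) for w in words) / len(words)
    return asl, asw

def flesch_reading_ease(asl, asw):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * asl - 84.6 * asw

def flesch_kincaid_grade(asl, asw):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * asl + 11.8 * asw - 15.59

asl, asw = text_stats("The cat sat. The dog ran.")
print(round(flesch_reading_ease(asl, asw), 2))
print(round(flesch_kincaid_grade(asl, asw), 2))
```

Very short, simple sentences such as these push the raw formulas outside their nominal ranges (a Reading Ease above 100, a negative grade), which is one reason starter-level texts are poorly discriminated by these measures.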

Lexile Framework: Used primarily in US education. A Lexile score is derived from word frequency and sentence length. Lexile 500–700 = roughly A2–B1; 1000+ = C1. Lexile is increasingly used by ELT publishers but was designed for L1 comprehension, so it systematically underestimates difficulty for L2 readers who lack background vocabulary even at "easy" Lexile levels.

CEFR-J Readability Index (CEFR-J Rater): Specifically designed for EFL contexts. Developed at Tokyo University of Foreign Studies for Japanese learners but applicable cross-linguistically. It integrates lexical frequency bands, syntactic complexity measures, and text length. More aligned with L2 reading behaviour than Flesch or Lexile.

Limitations of all formula measures: Readability formulas are proxy measures. They do not directly assess:

Background knowledge requirements
Discourse coherence complexity
Cultural load
Pragmatic inference demands

For graded reader QA, formula measures are used as a screen, not a verdict. A text that passes a formula check still needs expert editorial review.

The "Plus-One" / i+1 Principle in Reader Design

Krashen's Input Hypothesis (1982) states acquisition occurs when learners encounter input at i+1, one level above current competence. Graded readers operationalise this by:

Ensuring 95–98% of the text is composed of known words (the "i" base)
Embedding approximately 2–5% new vocabulary (the "+1" layer) in high-support contexts
Structuring grammar slightly above the learner's current production level but still comprehensible

The practical implication: a good graded reader at Bookworms Stage 2 should be genuinely comfortable for someone who just completed Stage 1, not someone who aspires to Stage 2. The unknown words should feel like discoveries, not obstacles.
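
The 2–5% band for new vocabulary can be turned into a simple screening rule. In this sketch the function name and the three labels are illustrative, and the token counts are assumed to come from a profiling step done elsewhere; integer arithmetic is used so that texts sitting exactly on a boundary are classified consistently.

```python
def lexical_load(known_tokens, total_tokens):
    """Classify a text by its share of unknown running words."""
    if total_tokens <= 0 or not 0 <= known_tokens <= total_tokens:
        raise ValueError("need 0 <= known_tokens <= total_tokens, total_tokens > 0")
    unknown = total_tokens - known_tokens
    # Compare unknown/total against 2% and 5% without floating-point division.
    if unknown * 100 < 2 * total_tokens:
        return "below the +1 band"
    if unknown * 100 <= 5 * total_tokens:
        return "within the +1 band"
    return "above the +1 band"

for known in (990, 970, 950, 900):
    print(known, lexical_load(known, 1000))
```

A text that falls below the band offers fluency practice but little new vocabulary; one that falls above it is likely to read as an obstacle course for the target learner.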

Publisher Design Philosophies Compared

Oxford Bookworms Library

Strictest grammar syllabus adherence of the major series (Hill, 1997 ELT Journal review)
Includes activities, comprehension questions, and cultural notes
Headword counts publicly documented in detail
Both simplified classics and original stories
Most widely used in IELTS/EAP preparation contexts due to authentic topic range

Penguin Readers (Pearson)

More market-oriented: heavy use of film tie-ins, celebrity biographies, contemporary culture
Asian market editions include more exercises and glossaries
Now CEFR-mapped but with less rigorous grammar syllabus documentation
Broader level range (Pre-A1 starter exists); vocabulary definitions looser

Cambridge English Readers

Original fiction only; no simplified classics
Deliberately exclude support materials: no glossary, no activities. Texts look like "real" books
Online support available separately
Most consistently positioned as adult leisure reading
Headword range extends to 3,800 (most advanced of the major series)
Design philosophy closest to authentic text; some argue this makes them less pedagogically scaffolded

Macmillan Readers

Mid-range positioning between Oxford's rigour and Penguin's commercial orientation
Strong non-fiction and factual reader strand
Clear CEFR labelling but with acknowledged level inconsistencies vs. Oxford (Claridge, 2012)

References

Day, R.R. & Bamford, J. (1998). Extensive Reading in the Second Language Classroom. Cambridge University Press.
Nation, I.S.P. (2001). Learning Vocabulary in Another Language. Cambridge University Press.
Nation, I.S.P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review, 63(1), 59–82.
Nation, I.S.P. (2015). Principles guiding vocabulary learning through extensive reading. Reading in a Foreign Language, 27(1), 136–145.
Hu, M. & Nation, I.S.P. (2000). Unknown vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403–430.
Waring, R. & Takaki, M. (2003). At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language, 15(2), 130–163.
Claridge, G. (2005). Simplification in graded readers: Measuring the authenticity of graded texts. Reading in a Foreign Language, 17(2), 144–158.
Claridge, G. (2012). Graded readers: How the publishers make the grade. Reading in a Foreign Language, 24(1), 106–119.
Krashen, S. (1982). Principles and Practice in Second Language Acquisition. Pergamon Press.
