vocd-D

Assessment, Language Analysis, vocd, VOCD, D measure, Malvern-Richards D

A measure of Lexical Diversity developed theoretically by David Malvern and Brian Richards in the late 1990s and implemented in software by Gerald McKee, Malvern, and Richards in 2000. The output parameter is conventionally written D and the program is called vocd; the measure is now usually written vocd-D to distinguish the score from the software that computes it. It is one of the two diversity indices Text Inspector reports by default, alongside MTLD.

How it works

vocd-D treats the decay of TTR across text length as a curve to be fit, not a defect to be repaired. It samples the text repeatedly at increasing token counts (canonically 35 to 50 tokens, 100 random samples per length), computes the average TTR at each length, and fits a theoretical curve to the resulting points. The curve has a single free parameter, D; the value that gives the best fit is reported as the lexical-diversity score.
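The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the vocd source: it assumes the model curve takes the form published by Malvern and Richards, TTR = (D/N)(sqrt(1 + 2N/D) - 1), and it fits D with a simple ternary search rather than whatever optimiser a given tool uses. All function names are hypothetical.

```python
import math
import random


def model_ttr(n, d):
    """TTR predicted at sample size n for a given D (assumed curve form)."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def mean_sampled_ttr(tokens, n, samples, rng):
    """Average TTR over `samples` random draws of n tokens, without replacement."""
    total = 0.0
    for _ in range(samples):
        total += len(set(rng.sample(tokens, n))) / n
    return total / samples


def fit_d(points, lo=1.0, hi=500.0, iterations=100):
    """Least-squares D for (n, ttr) points.

    Ternary search; assumes the squared error has a single minimum in [lo, hi].
    """
    def error(d):
        return sum((ttr - model_ttr(n, d)) ** 2 for n, ttr in points)

    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if error(a) < error(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2


def vocd_d(tokens, sizes=range(35, 51), samples=100, seed=None):
    """One pass of the sample-then-fit procedure; needs at least 50 tokens."""
    rng = random.Random(seed)
    points = [(n, mean_sampled_ttr(tokens, n, samples, rng)) for n in sizes]
    return fit_d(points)
```

Calling `vocd_d(tokens)` on a list of word tokens returns one estimate of D. Leaving `seed` unset gives a slightly different value on each call, because the sampling step is random; passing a fixed seed makes a run repeatable.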

The sampled points themselves follow the hypergeometric distribution: the expected number of types in a random sample of n tokens is the probability of drawing each word type at least once, summed across all types in the text. D is the value at which the theoretical curve best matches this empirical TTR-vs-length curve. A higher D means the curve falls more slowly as the sample grows, which means the text keeps introducing new types more readily.
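Written out, the hypergeometric expectation described above takes the following form. The notation is an assumption introduced here for illustration: N is the number of tokens in the text, V the number of types, f_i the frequency of type i, and n the sample size.

```latex
\mathbb{E}[\mathrm{TTR}(n)] = \frac{1}{n} \sum_{i=1}^{V} \left[ 1 - \frac{\binom{N - f_i}{n}}{\binom{N}{n}} \right]
```

As a small check, take the four-token text "a a a b" and n = 2. Type a can never be missed, so it contributes 1; type b is missed in 3 of the 6 possible pairs, so it contributes 0.5. The expected TTR is (1 + 0.5) / 2 = 0.75.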

Typical ranges

vocd-D scores cluster between roughly 30 and 120 for academic and journalistic prose. Native-speaker writing typically scores 80–110; learner writing at B1–B2 typically scores 50–80; very simple controlled language sits below 50. The numbers are not directly comparable to MTLD scores (the scales are unrelated), but the rankings of texts on the two measures correlate strongly.

What McCarthy and Jarvis found

The 2010 validation study, building on the authors' 2007 evaluation, highlighted two issues with vocd-D in its standard implementation. First, because the score depends on a random-sampling routine, repeated runs on the same text give different scores; this stochastic noise is small in long texts and large in short ones. Second, vocd-D drifts on texts much shorter than 200 tokens, where the 35-to-50 sampling range starts to dominate the score. As a remedy they put forward HD-D, a more direct, deterministic implementation of the same underlying construct that computes the hypergeometric probabilities analytically rather than by random sampling. McCarthy and Jarvis recommend reporting HD-D in research contexts where reproducibility matters and reserving vocd-D for legacy compatibility with the large CHILDES-era literature that uses it.
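To make the contrast with random sampling concrete, here is a minimal sketch of an HD-D-style calculation. It assumes the 42-token sample size commonly cited for HD-D and scales each type's contribution by 1/42; the function name and signature are hypothetical, and published tools may differ in detail.

```python
from collections import Counter
from math import comb


def hdd(tokens, sample_size=42):
    """Sum over types of P(type appears at least once in a random draw of
    `sample_size` tokens), each scaled by 1 / sample_size.

    The default of 42 is an assumption, not taken from the text above.
    """
    total = len(tokens)
    if total < sample_size:
        raise ValueError("text is shorter than the sample size")
    all_draws = comb(total, sample_size)
    score = 0.0
    for freq in Counter(tokens).values():
        # Hypergeometric probability that none of this type's tokens is drawn.
        p_absent = comb(total - freq, sample_size) / all_draws
        score += (1 - p_absent) / sample_size
    return score
```

Because nothing here is random, the same token list always yields the same score, which is the reproducibility property the paragraph above describes.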

Why both vocd-D and HD-D are still in tools

vocd-D predates HD-D by roughly a decade and is embedded in CLAN (the CHILDES analysis suite), so a substantial child-language and aphasia-research literature reports vocd-D scores. Tools that compute lexical diversity for research use keep vocd-D for backward comparison even though HD-D has superseded it methodologically. Text Inspector follows this pattern.

Use in ELT and test design

For IELTS-style essay assessment, vocd-D's stochastic noise is a real limitation: a single 250-word essay can produce vocd-D scores that differ by 5–10 points across runs of the same tool. For published research and aggregated learner-corpus work, the random noise averages out and vocd-D remains useful. For real-time scoring or single-text feedback, MTLD or HD-D is the better choice.

References

Malvern, D. & Richards, B. (1997). A new measure of lexical diversity. In A. Ryan & A. Wray (Eds.), Evolving Models of Language. Multilingual Matters.
McKee, G., Malvern, D. & Richards, B. (2000). Measuring vocabulary diversity using dedicated software. Literary and Linguistic Computing, 15(3), 323–337.
Malvern, D., Richards, B., Chipere, N. & Durán, P. (2004). Lexical Diversity and Language Development: Quantification and Assessment. Palgrave Macmillan.
McCarthy, P. M. & Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24(4), 459–488.
McCarthy, P. M. & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392.
