Item Bank
An item bank is a structured store of test items along with the metadata required to assemble valid test forms from them: item statistics, calibration parameters, content tags, format type, exposure history, and provenance. A bank is not just a list of questions; it is a managed asset whose value comes from what is recorded about each item.
Item banking emerged from psychometrics in the 1970s as Item Response Theory (IRT) and computer-adaptive testing gained ground. Both rely on stable, sample-independent item parameters, and a properly maintained bank is where those parameters are stored and kept current. Cambridge English, ETS, the IELTS partnership, and many national high-stakes testing programmes now run on item banks.
What a properly tagged item carries
Operational item banks tag each item with most or all of the following:
- Construct tags: the grammar leaf, vocabulary tier, reading sub-skill, or other sub-skill the item targets.
- Format tags: multiple choice, gap fill, matching, short answer, essay.
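- Calibration data: difficulty (classical p-value, i.e. the proportion of candidates answering correctly, or IRT b-parameter), discrimination (classical discrimination index or IRT a-parameter), distractor analysis, sample size, administration date.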
- Provenance: author, source passage, review history, edit log.
- Exposure data: how often the item has appeared in live forms, how recently, and to which cohorts. These figures drive item retirement.
- Quality flags: items under review, items withdrawn, items needing recalibration.
Without these fields a bank is a corpus of questions, not a testing asset.
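The field groups listed above can be pictured as a single record per item. The following is a minimal sketch in Python, not a description of any operational system: the class name `BankItem`, every field name, and the `max_exposure` threshold are illustrative assumptions, and the rule that any quality flag blocks live use is a simplification.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class BankItem:
    """One bank record. All field names are hypothetical."""
    item_id: str
    construct_tags: list[str]                 # e.g. ["reading/inference"]
    format_tag: str                           # e.g. "multiple_choice"
    p_value: Optional[float] = None           # classical facility (proportion correct)
    b_param: Optional[float] = None           # IRT difficulty
    a_param: Optional[float] = None           # IRT discrimination
    sample_size: int = 0                      # candidates behind the statistics
    administered_on: Optional[date] = None
    author: str = ""                          # provenance
    exposure_count: int = 0                   # appearances in live forms
    flags: set[str] = field(default_factory=set)  # e.g. {"under_review"}

    def is_calibrated(self) -> bool:
        # Calibrated means a difficulty estimate backed by real responses.
        has_difficulty = self.p_value is not None or self.b_param is not None
        return self.sample_size > 0 and has_difficulty

    def is_usable(self, max_exposure: int) -> bool:
        # Assumption: any quality flag keeps the item out of live forms,
        # and an item is retired once it reaches max_exposure appearances.
        return (
            bool(self.construct_tags)
            and bool(self.format_tag)
            and self.is_calibrated()
            and not self.flags
            and self.exposure_count < max_exposure
        )
```

An item with tags, a difficulty estimate, and low exposure passes `is_usable`; the same item stops passing as soon as a flag such as "under_review" is added or its exposure count reaches the threshold.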
Why banks matter for fair, comparable testing
Three properties depend on a bank: parallel-form equivalence, longitudinal score comparability, and adaptive testing.
Parallel forms must cover the same construct at the same difficulty, which is achievable only when the bank holds enough calibrated items in each cell of the test specification to assemble every form to spec. Longitudinal comparability, the claim that a Band 7 in 2019 means the same as a Band 7 in 2026, rests on calibration data combined with secure anchor items that recur across administrations. Adaptive testing is impossible without a deeply calibrated bank, because the algorithm needs known item parameters to choose each next item given the candidate's running ability estimate.
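To make the adaptive-testing dependency concrete, here is a small sketch of next-item selection. It assumes a two-parameter logistic (2PL) IRT model and a maximum-information selection rule. Both are common choices, but neither is specified by the text above, and operational systems typically add exposure control and content balancing. The function names and the three-item pool are hypothetical.

```python
import math


def prob_correct(theta: float, a: float, b: float) -> float:
    """2PL probability that a candidate of ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta: float, pool: dict[str, tuple[float, float]], used: set[str]) -> str:
    """Return the unused item that is most informative at the current estimate.

    pool maps item id -> (a, b). These are exactly the calibrated
    parameters a bank has to supply; without them nothing can be ranked.
    """
    candidates = {k: v for k, v in pool.items() if k not in used}
    if not candidates:
        raise ValueError("no unused calibrated items left in the pool")
    return max(candidates, key=lambda k: information(theta, *candidates[k]))


# Hypothetical pool: item id -> (discrimination a, difficulty b).
pool = {"easy": (1.0, -1.5), "medium": (1.2, 0.0), "hard": (1.0, 1.5)}
```

With a running estimate near zero, the rule selects the item whose difficulty sits closest to the candidate's ability, weighted by discrimination. An item with no stored parameters simply cannot take part in the ranking.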
Implications for AI-assisted item generation
The bottleneck in AI-assisted item development is not generation; it is calibration and tagging. A generator can produce thousands of plausible items overnight, but each item entering the bank needs construct tags, format tags, and either a pre-test calibration or a confidence-weighted prior on its parameters. Pipelines that skip these steps produce a corpus, not a bank, and will not support the test forms the developer wants to assemble.
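The admission requirement described above can be sketched as a gate between the generator and the bank. This is an illustration under stated assumptions rather than a reference pipeline: `GeneratedItem`, its fields, the `min_confidence` cut-off of 0.5, and the 0 to 1 confidence scale are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneratedItem:
    """A machine-generated candidate item. Field names are hypothetical."""
    item_id: str
    construct_tags: list[str]
    format_tag: str
    pretest_b: Optional[float] = None   # difficulty from a pre-test calibration
    prior_b: Optional[float] = None     # estimated difficulty used as a prior
    prior_confidence: float = 0.0       # assumed 0..1 weight on that prior


def admission_problems(item: GeneratedItem, min_confidence: float = 0.5) -> list[str]:
    """List the reasons an item cannot enter the bank (empty list means admit)."""
    problems = []
    if not item.construct_tags:
        problems.append("missing construct tags")
    if not item.format_tag:
        problems.append("missing format tag")
    has_pretest = item.pretest_b is not None
    has_prior = item.prior_b is not None and item.prior_confidence >= min_confidence
    if not (has_pretest or has_prior):
        problems.append("no pre-test calibration or sufficiently confident prior")
    return problems


def split_batch(items: list[GeneratedItem], min_confidence: float = 0.5):
    """Separate a generated batch into admitted items and rejected ids with reasons."""
    admitted: list[GeneratedItem] = []
    rejected: dict[str, list[str]] = {}
    for item in items:
        problems = admission_problems(item, min_confidence)
        if problems:
            rejected[item.item_id] = problems
        else:
            admitted.append(item)
    return admitted, rejected
```

Returning the reasons for rejection, rather than a bare pass or fail, keeps the untagged and uncalibrated items visible as a work queue instead of letting them drift into the bank as an unmanaged corpus.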
Key References
- Bachman, L. F. & Palmer, A. S. (1996). Language Testing in Practice. Oxford University Press.
- Wright, B. D. & Bell, S. R. (1984). Item banks: What, why, how. Journal of Educational Measurement, 21(4), 331–345.
- Vale, C. D. (2006). Computerized item banking. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of Test Development. Lawrence Erlbaum.
- Alderson, J. C., Clapham, C. & Wall, D. (1995). Language Test Construction and Evaluation. Cambridge University Press.
See Also
- Item Analysis: the source of the calibration data the bank stores
- Test Specifications: the spec a bank is sized and tagged to serve
- Adaptive Testing: the deployment that makes a calibrated bank essential
- Reading Comprehension Test Design: where the bank's tagging discipline is felt