Standard Error of Measurement

The standard error of measurement (SEM) estimates the spread of observed scores expected around an examinee's true score on repeated testing. In classical test theory it is the standard deviation of the error component of an observed score and is computed as

SEM = SD × √(1 − r)

where SD is the standard deviation of observed scores in the sample and r is the test's reliability coefficient. The formula makes the trade-off explicit: a perfectly reliable test (r = 1) has zero measurement error, whereas a test with zero reliability has an SEM equal to the full standard deviation of observed scores.
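
The formula can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name and the input checks are assumptions added for the example.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SEM = SD * sqrt(1 - r) for a reliability r between 0 and 1."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# The two limiting cases: perfect reliability and zero reliability.
print(standard_error_of_measurement(12.0, 1.0))  # no measurement error
print(standard_error_of_measurement(12.0, 0.0))  # SEM equals the SD
```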

Interpretation

SEM is reported in the test's own score units, which makes it directly useful for confidence-band reporting around individual scores. Under normality, an observed score plus or minus one SEM contains the true score with about 68 percent probability; plus or minus 1.96 SEMs gives a 95 percent band. A vocabulary test with SD = 12 and reliability of .91 yields SEM ≈ 3.6, so a learner scoring 70 has a 95 percent confidence band of roughly 63 to 77.
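
The vocabulary-test example can be reproduced with a short Python sketch. The function name and the default multiplier of 1.96 are choices made for this illustration; the band assumes normally distributed errors, as in the text.

```python
import math


def confidence_band(observed: float, sd: float, reliability: float,
                    z: float = 1.96) -> tuple[float, float]:
    """Return (lower, upper) limits of observed +/- z * SEM."""
    sem = sd * math.sqrt(1.0 - reliability)
    half_width = z * sem
    return observed - half_width, observed + half_width


# Worked example from the text: SD = 12, reliability = .91, observed score = 70.
lower, upper = confidence_band(70, 12, 0.91)
print(f"{lower:.1f} to {upper:.1f}")  # prints "62.9 to 77.1"
```

Rounded to whole score points, this gives the band of roughly 63 to 77 quoted above.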

For high-stakes decisions, the SEM around a cut-score determines classification consistency: examinees within one SEM of a passing line are particularly vulnerable to misclassification, which is why borderline candidates are often re-examined or double-marked.
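
One rough way to gauge that vulnerability is to ask how likely it is that an examinee's true score falls below the cut, given the observed score. The sketch below assumes normally distributed errors centred on the observed score, the same simplification used for confidence bands; the function name and the example scores are hypothetical.

```python
import math


def prob_true_score_below_cut(observed: float, cut: float, sem: float) -> float:
    """Approximate P(true score < cut), treating the true score as
    normally distributed around the observed score with SD equal to SEM."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    z = (cut - observed) / sem
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical candidates near a cut-score of 70 on a test with SEM = 3.6.
for score in (70.0, 72.0, 73.6, 80.0):
    p = prob_true_score_below_cut(score, cut=70.0, sem=3.6)
    print(score, round(p, 3))
```

Under these assumptions a candidate exactly at the cut has an even chance of a true score below it, and the risk falls off quickly once the observed score is more than one or two SEMs clear of the line.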

Modern alternatives

IRT and Rasch analyses provide a conditional standard error that varies along the ability scale rather than assuming a single value across all examinees. Tests are typically most precise, and the conditional standard error smallest, where item difficulties cluster around examinee ability, and least precise at the extremes. The test information function expresses this precision directly: the conditional standard error at a given ability θ is the reciprocal of the square root of the test information at that point, SE(θ) = 1 / √I(θ). Inspecting information along the ability scale therefore shows where adding items would most improve measurement, for example at the cut-score.
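
For the Rasch model, the idea can be sketched as follows. Each item contributes information p(1 − p), where p is the probability of a correct response, and the conditional standard error is the reciprocal square root of the summed information. The function names and item difficulties are hypothetical, and the code assumes the item difficulties are already known rather than estimated.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def rasch_information(theta: float, difficulties: list[float]) -> float:
    """Test information at theta: sum of p * (1 - p) over all items."""
    total = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        total += p * (1.0 - p)
    return total


def conditional_se(theta: float, difficulties: list[float]) -> float:
    """Conditional standard error: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(rasch_information(theta, difficulties))


# Hypothetical item difficulties (in logits) clustered around zero.
item_difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(conditional_se(theta, item_difficulties), 2))
```

With these illustrative items, the standard error is smallest near the centre of the difficulty range and grows toward the extremes, which is the pattern described above.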

