Stratified Sampling
A probability-sampling technique that divides the target population into mutually exclusive and jointly exhaustive subgroups (strata) defined by characteristics relevant to the research question, then draws a random sample independently within each stratum. Strata in language research typically correspond to proficiency band, year level, L1 background, programme type, or institution. Stratification guarantees that every subgroup is represented in the final sample, and it reduces sampling variance for estimates of outcomes that are associated with the stratifying characteristics.
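The basic procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `stratified_sample`, the band labels, and the toy population of 200 learners are all hypothetical, and a fixed number of units is drawn per stratum purely for simplicity.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of up to n_per_stratum units in each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # each unit falls in exactly one stratum
    sample = {}
    for label, members in strata.items():
        k = min(n_per_stratum, len(members))   # never ask for more than the stratum holds
        sample[label] = rng.sample(members, k)  # sampling without replacement
    return sample

# Hypothetical population of (learner_id, proficiency_band) pairs:
# 100 learners at A2, 60 at B1, 40 at C1.
population = [
    (i, "A2" if i % 10 < 5 else ("B1" if i % 10 < 8 else "C1"))
    for i in range(200)
]
sample = stratified_sample(population, lambda u: u[1], n_per_stratum=10)
```

Every band appears in `sample` by construction, which is the representation guarantee described above.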
Proportional vs Disproportional
In proportional stratified sampling, the number drawn from each stratum is proportional to its share of the population: a population that is 60% female and 40% male yields a sample with the same split. Because every unit then has the same probability of selection, unweighted sample statistics estimate population values without bias. In disproportional stratified sampling (oversampling; optimum allocation is the variant in which stratum sample sizes are set according to within-stratum variability), smaller or higher-variance strata are deliberately oversampled to obtain stable subgroup estimates; weights are then applied at analysis to recover unbiased population-level statistics. Test-development research often oversamples low-proficiency strata because that is where item-difficulty estimates are most uncertain.
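The arithmetic behind the two designs can be illustrated as follows. The stratum sizes, sample sizes, and stratum means are invented for the example, and the helper names `proportional_allocation` and `design_weights` are hypothetical. The weight for a stratum is taken to be its population size divided by its sample size, which is the usual inverse-probability weight when units are drawn at random within strata.

```python
def proportional_allocation(stratum_sizes, n):
    """Split n draws across strata in proportion to size (largest-remainder rounding)."""
    total = sum(stratum_sizes.values())
    exact = {s: n * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    leftover = n - sum(alloc.values())
    # Hand any remaining draws to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def design_weights(stratum_sizes, allocation):
    """Weight per sampled unit: stratum population size / stratum sample size."""
    return {s: stratum_sizes[s] / allocation[s] for s in stratum_sizes}

# Hypothetical population of 1000 test takers in three proficiency strata.
sizes = {"low": 100, "mid": 600, "high": 300}
alloc = proportional_allocation(sizes, 50)

# Disproportional design: the small low-proficiency stratum is oversampled.
oversampled = {"low": 20, "mid": 15, "high": 15}
weights = design_weights(sizes, oversampled)

# Hypothetical mean scores observed in each stratum's sample.
stratum_means = {"low": 40.0, "mid": 55.0, "high": 70.0}

n_total = sum(oversampled.values())
unweighted = sum(oversampled[s] * stratum_means[s] for s in sizes) / n_total
weighted = (
    sum(weights[s] * oversampled[s] * stratum_means[s] for s in sizes)
    / sum(weights[s] * oversampled[s] for s in sizes)
)
```

With these invented numbers the unweighted mean of the oversampled design is pulled toward the low stratum (53.5), while the weighted mean recovers the value implied by the population shares (58.0).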
Use in Language Testing
Large-scale language-test calibration studies (Cambridge English, IELTS, TOEFL) routinely use stratified sampling to ensure adequate representation across proficiency levels, L1 groups, and test centres. National L2 assessment programmes stratify by region, school type, and grade so that subgroup norms are estimable. In SLA outside large-scale testing, stratification is most often applied within an already-existing convenience frame: a researcher with access to multiple intact classes may stratify by proficiency band before randomly assigning conditions. Strictly speaking this is stratified (blocked) random assignment rather than stratified sampling from a population, so it balances the conditions on proficiency but does not make the participants representative of a wider population.
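The small-scale SLA case, random assignment within proficiency bands, might look like the sketch below. The participant list, band labels, and condition names are hypothetical, and the function `assign_within_strata` is an illustration rather than an established routine.

```python
import random
from collections import defaultdict

def assign_within_strata(participants, conditions, seed=0):
    """Shuffle participants inside each band, then deal conditions out in rotation."""
    rng = random.Random(seed)
    by_band = defaultdict(list)
    for pid, band in participants:
        by_band[band].append(pid)
    assignment = {}
    for band, ids in by_band.items():
        rng.shuffle(ids)  # random order within the band
        for i, pid in enumerate(ids):
            # Rotating through the conditions keeps group sizes within one
            # of each other inside every band.
            assignment[pid] = conditions[i % len(conditions)]
    return assignment

# Hypothetical convenience frame: 20 learners, 7 low-band and 13 high-band.
participants = [(i, "low" if i < 7 else "high") for i in range(20)]
conditions = ["treatment", "control"]
assignment = assign_within_strata(participants, conditions)
```

Each band contributes near-equal numbers to every condition, so proficiency cannot be confounded with treatment by an unlucky draw.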
Conditions and Trade-offs
Stratification requires that strata be definable in advance and that stratum membership be known for every unit in the sampling frame. When the stratifying variable is well chosen, that is, strongly associated with the outcome, stratified estimates have lower variance than estimates from a simple random sample of the same size. When the stratifying variable is weakly related to the outcome, stratification adds administrative cost with little or no gain in precision. Strata that are too small for stable within-stratum estimates may need to be collapsed.
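The precision gain from a well-chosen stratifier can be checked with a small simulation. Everything here is assumed for illustration: three equal-sized bands of 300 learners, band means of 40, 55, and 70, a within-band standard deviation of 5, and 500 repeated samples of 30.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population in which proficiency band strongly predicts score.
band_means = {"low": 40.0, "mid": 55.0, "high": 70.0}
population = {
    band: [rng.gauss(mu, 5.0) for _ in range(300)]
    for band, mu in band_means.items()
}
all_scores = [x for scores in population.values() for x in scores]

def srs_mean(n):
    """Mean of a simple random sample of n scores from the whole population."""
    return statistics.mean(rng.sample(all_scores, n))

def stratified_mean(n_per_band):
    """Mean of per-band sample means; bands are equal-sized, so this is the
    proportional stratified estimate of the population mean."""
    band_sample_means = [
        statistics.mean(rng.sample(scores, n_per_band))
        for scores in population.values()
    ]
    return statistics.mean(band_sample_means)

reps = 500
var_srs = statistics.variance([srs_mean(30) for _ in range(reps)])
var_strat = statistics.variance([stratified_mean(10) for _ in range(reps)])
```

Under these assumptions `var_strat` comes out well below `var_srs`, because the between-band differences no longer contribute to sampling error. Setting the three band means equal removes the association with the outcome, and the two variances then become roughly comparable.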
References
- Dörnyei, Z. (2007). Research Methods in Applied Linguistics: Quantitative, Qualitative, and Mixed Methodologies. Oxford: Oxford University Press.
- Mackey, A., & Gass, S. M. (2016). Second Language Research: Methodology and Design (2nd ed.). New York: Routledge.
- Cohen, L., Manion, L., & Morrison, K. (2018). Research Methods in Education (8th ed.). London: Routledge.