Cohen's d
A standardised effect size for the difference between two means, expressed in pooled standard-deviation units: d = (M₁ − M₂) / SD_pooled. A d of 0.50 means the group means differ by half a standard deviation. Because the metric is scale-free, d values can be compared across studies and aggregated in meta-analysis.
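The definition above can be illustrated with a minimal sketch. It assumes two independent groups of raw scores and takes SD_pooled to be the square root of the df-weighted average of the two sample variances; the function name cohens_d and the example scores are hypothetical and serve only as illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled within-group SD."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)


# Made-up scores, for illustration only.
treatment = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]
print(round(cohens_d(treatment, control), 3))  # 0.316
```

Here the means differ by 1 point and the pooled SD is about 3.16, so the groups differ by roughly a third of a standard deviation.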
Formula and Variants
Cohen (1988) defined d using the pooled within-group standard deviation. Variants in current use differ in their denominator or in a correction factor: Glass's Δ divides by the control-group SD alone (appropriate when the treatment may alter variance); Cohen's d divides by the pooled SD without bias correction; Hedges' g applies a small-sample bias correction to d. Most modern software, including the effectsize R package and SPSS extensions, reports d together with a confidence interval.
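As a rough sketch of how the variants differ, the two functions below compute Glass's Δ and Hedges' g from raw scores. The function names are hypothetical, and the correction factor 1 − 3/(4·df − 1) is the commonly used approximation to the exact small-sample correction, not the only form in use.

```python
import math
from statistics import mean, stdev, variance


def glass_delta(treatment, control):
    """Glass's delta: mean difference divided by the control-group SD only."""
    return (mean(treatment) - mean(control)) / stdev(control)


def hedges_g(group1, group2):
    """Hedges' g: Cohen's d multiplied by a small-sample correction factor."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df)
    d = (mean(group1) - mean(group2)) / pooled_sd
    # Approximate correction; it approaches 1 as the samples grow.
    correction = 1 - 3 / (4 * df - 1)
    return d * correction
```

With five scores per group the correction factor is 1 − 3/31 ≈ 0.90, so g is about 10% smaller than d; with 50 per group it is about 0.99 and the two are nearly identical.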
Cohen's General Benchmarks
Cohen (1988) suggested rough thresholds for the behavioural sciences: 0.20 small, 0.50 medium, 0.80 large. These were proposed as conventions for when no field-specific norms exist, not as universal rules.
L2-Specific Benchmarks
Plonsky and Oswald (2014) re-derived field benchmarks empirically from a large corpus of published L2 studies, distinguishing between-group from within-group designs. For between-group contrasts they propose 0.40 small, 0.70 medium, 1.00 large; for within-group contrasts, 0.60, 1.00, and 1.40. The L2 benchmarks are higher than Cohen's because effects reported in typical SLA studies run larger than behavioural-science baselines, a pattern often attributed to small samples and selective reporting.
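The practical consequence of the two sets of benchmarks can be sketched with a small lookup. The scheme names and the label_effect helper are hypothetical, and treating each threshold as a lower bound for its label is an assumption made for this illustration; the sources present the values as rough guides rather than cut-offs.

```python
# Thresholds (small, medium, large) as given in the text above.
BENCHMARKS = {
    "cohen_general": (0.20, 0.50, 0.80),
    "l2_between": (0.40, 0.70, 1.00),
    "l2_within": (0.60, 1.00, 1.40),
}


def label_effect(d, scheme):
    """Return a verbal label for |d| under the chosen benchmark scheme."""
    small, medium, large = BENCHMARKS[scheme]
    size = abs(d)  # the sign only encodes the direction of the difference
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"


for scheme in BENCHMARKS:
    print(scheme, label_effect(0.65, scheme))
```

Under these assumptions a d of 0.65 is labelled medium by Cohen's general conventions but only small under both L2 schemes.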
Use and Cautions
A d value should always be reported with its confidence interval and the descriptive statistics (means, SDs, and group sizes) it was computed from. In small samples, which are characteristic of classroom quasi-experiments, d is upwardly biased and Hedges' g is preferred. The metric assumes roughly equal variances; under heteroscedasticity, Glass's Δ is the safer choice.
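One simple way to attach a confidence interval to d is the large-sample normal approximation sketched below. The function name d_confidence_interval is hypothetical, and the variance formula (n₁ + n₂)/(n₁n₂) + d²/(2(n₁ + n₂)) is a standard approximation for two independent groups; dedicated software may use more exact methods and so report slightly different limits.

```python
import math
from statistics import NormalDist


def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate CI for d from two independent groups (normal approximation)."""
    # Large-sample approximation to the standard error of d.
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


lower, upper = d_confidence_interval(0.50, n1=20, n2=20)
print(round(lower, 2), round(upper, 2))  # -0.13 1.13
```

With 20 participants per group, a d of 0.50 has an interval running from roughly −0.13 to 1.13, which is why a point estimate alone says little in classroom-sized samples.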
References
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
- Plonsky, L., & Oswald, F. L. (2014). How big is "big"? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912.
- Larson-Hall, J. (2016). A Guide to Doing Statistics in Second Language Research Using SPSS and R (2nd ed.). New York: Routledge.