Confidence Interval
A range of values, computed from sample data, that would contain the true population parameter in a stated proportion of repeated samples drawn from the same population. A 95% confidence interval — the conventional default in applied-linguistics reporting — is constructed so that, across many hypothetical replications of the study with the same procedure, 95% of the intervals would capture the parameter being estimated.
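The "95% of replications" reading can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation (so a z-based interval applies), and the function name `coverage_rate` and all parameter values are hypothetical, not taken from any cited study.

```python
import random
from statistics import NormalDist


def coverage_rate(true_mean=50.0, sigma=10.0, n=30, level=0.95,
                  reps=2000, seed=1):
    """Share of simulated intervals that capture the true mean.

    Assumes a normal population with KNOWN sigma (hypothetical values),
    so every replication uses the same z-based half-width.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # One hypothetical replication of the study: draw a fresh sample.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / reps


# The printed proportion should be close to the nominal level of 0.95.
print(coverage_rate())
```

Each simulated interval either contains the true mean or does not; the 95% figure describes the proportion of hits across replications.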
Interpretation
The interval is a property of the procedure, not of the single sample. A correct reading of "the 95% CI for the mean gain is [2.1, 5.7]" is that the procedure used to build the interval has a 95% long-run capture rate. It is not correct to say there is a 95% probability that the parameter lies inside this particular interval: in the frequentist framework the parameter is fixed and the interval is the random object. Interval width depends on three things: it grows with sample variability (Standard Deviation), shrinks as sample size increases, and grows with the chosen confidence level.
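How the three determinants of width interact can be sketched with the usual half-width formula for a mean, critical value × SD / √n. The example assumes a normal critical value, which is a large-sample approximation (small samples would normally use a t critical value instead); `half_width` is a hypothetical helper name and the inputs are invented.

```python
from math import sqrt
from statistics import NormalDist


def half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation CI for a mean.

    Large-sample sketch: uses a z critical value, not a t value.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd / sqrt(n)


base = half_width(sd=10, n=100)                   # reference case
more_spread = half_width(sd=20, n=100)            # doubling SD doubles the width
bigger_sample = half_width(sd=10, n=400)          # quadrupling n halves the width
higher_level = half_width(sd=10, n=100, level=0.99)  # 99% is wider than 95%

print(base, more_spread, bigger_sample, higher_level)
```

Because n enters through its square root, halving the width requires roughly four times as many participants.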
Why Preferred Over p-value Alone
A confidence interval carries the information of a null-hypothesis test (if a null value, typically zero difference, lies outside the 95% CI, the corresponding two-sided test rejects at α = .05) but adds magnitude and precision. Wide intervals signal that the data are compatible with many parameter values, including substantively different ones; a narrow interval around a small effect tells a different story than a narrow interval around a large effect. The American Statistical Association's 2016 statement on p-values (Wasserstein & Lazar, 2016) and reviews of L2 reporting practice both push for routine CI reporting alongside Effect Size estimates.
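The correspondence between the interval and the two-sided test can be illustrated as follows. This is a sketch that assumes both the interval and the test are built from the same normal (z) approximation; the estimate of 3.9 and standard error of 0.92 are hypothetical values chosen only so that the interval roughly matches the [2.1, 5.7] example, and the function names are invented for illustration.

```python
from statistics import NormalDist


def z_interval(estimate, se, level=0.95):
    """Normal-approximation CI around an estimate with standard error se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


def two_sided_p(estimate, se, null_value=0.0):
    """Two-sided p-value from the matching z test."""
    z_stat = abs(estimate - null_value) / se
    return 2 * (1 - NormalDist().cdf(z_stat))


# Hypothetical estimates sharing one standard error.
for estimate in (3.9, 1.0):
    low, high = z_interval(estimate, se=0.92)
    null_outside = not (low <= 0.0 <= high)
    rejects = two_sided_p(estimate, se=0.92) < 0.05
    # The two decisions agree; the interval also shows size and precision.
    print(estimate, (round(low, 2), round(high, 2)), null_outside, rejects)
```

The test yields only the reject/retain decision, while the interval additionally shows which effect magnitudes remain plausible.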
In SLA Reporting
Plonsky and Oswald (2014) recommend reporting confidence intervals around effect sizes such as Cohen's d and r, both for primary studies and as inputs to meta-analysis. CIs around effect sizes make it visible when an apparently "significant" finding rests on imprecise estimation, a common situation in small classroom-based quasi-experiments.
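A rough interval around Cohen's d can be computed from summary statistics. The sketch below is not the procedure prescribed by Plonsky and Oswald: it uses a common large-sample approximation to the standard error of d together with a normal critical value, whereas exact intervals are based on the noncentral t distribution. The group means, SDs, and sample sizes are invented, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)


def d_interval(d, n1, n2, level=0.95):
    """Approximate CI for d (large-sample SE; an approximation, not exact)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# Hypothetical classroom quasi-experiment: 20 learners per group.
d = cohens_d(mean1=77, mean2=70, sd1=10, sd2=10, n1=20, n2=20)
low, high = d_interval(d, n1=20, n2=20)
print(d, low, high)
```

With these invented numbers d is 0.7 and the approximate interval runs from roughly 0.06 to 1.34: it excludes zero, so the result counts as "significant", yet the interval spans effects from negligible to very large.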
References
- Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29.
- Larson-Hall, J. (2016). A Guide to Doing Statistics in Second Language Research Using SPSS and R (2nd ed.). New York: Routledge.
- Plonsky, L., & Oswald, F. L. (2014). How big is "big"? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912.
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA's statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133.