Scripted Listening Text
A scripted listening text is dialogue or monologue written in advance and read aloud by voice actors for coursebook audio. The script is drafted to a target level, edited for pedagogic load, and recorded under studio conditions. Almost every mainstream ELT coursebook from Pre-A1 through B2 relies on scripted audio for its core listening syllabus.
Characteristics
Scripted texts share a recognisable surface profile. Sentences are full and grammatically intact; turns are tidy, with one speaker yielding to the next without overlap or interruption; information density is low because every utterance is keyed to a teaching point; and the Features of Unplanned Spoken Discourse (false starts, fillers, self-repair, ellipsis, repetition, vague language, backchannels) are largely absent. Voice actors deliver lines with exaggeratedly clear pronunciation, a measured pace, and minimal connected-speech reduction. The result reads, when transcribed, much like written prose split across speakers.
Why publishers use scripted audio
Publishers favour scripts for editorial control. A script can be pitched precisely at a CEFR band, seeded with target grammar and lexis, timed to fit a workbook page, and re-recorded if a structure changes between editions. Scripts are reusable assets, adaptable to companion video, exam practice, and digital platforms, and they protect against the unpredictable messiness of spontaneous speech, which would otherwise force frequent re-recording. Beginner audio in particular leans on scripts because spontaneous speech that stays within an A1–A2 information load is, in practice, almost impossible to capture.
The critique
Scripted audio is the central target of the authenticity debate in listening materials. Gilmore (2007) reviewed studies comparing coursebook dialogue with corpus data and found systematic gaps: under-representation of vague language, hedges, discourse markers, ellipsis, and high-frequency spoken lexis. Wagner (2014a) and Wagner and Toth (2014b) ran the empirical comparison directly. Learners trained on scripted texts scored higher on tests built from scripted texts, but learners trained on unscripted recordings transferred more reliably to genuine listening. The diagnosis is consistent across the work of Brian Tomlinson, John Field, and Gary Buck: a steady diet of scripted input prepares learners for coursebook listening, not real-world listening.
Field (2008) frames the problem as a mismatch in the listening signal itself. Scripted speech presents a clean phonological surface that lets learners decode lexically without engaging the perceptual repair strategies (handling reductions, weak forms, intonation cues) that real spoken English demands. Buck (2001) makes the parallel testing argument: a listening test built on scripted audio measures comprehension of a register that does not exist outside ELT.
Counter-considerations
Scripted texts retain a defensible role at lower levels and for targeted skill work. They expose target structures cleanly, support intensive Bottom-Up Listening Repair practice on identified features, and give beginners a manageable entry point. The materials-development position now widely endorsed is a phased blend: scripted audio early, Semi-scripted Listening Text in the transitional band, Unscripted Listening Text from B1 upwards.
References
- Buck, G. (2001). Assessing Listening. Cambridge University Press.
- Field, J. (2008). Listening in the Language Classroom. Cambridge University Press.
- Gilmore, A. (2007). Authentic materials and authenticity in foreign language learning. Language Teaching, 40(2), 97–118. https://doi.org/10.1017/S0261444807004144
- Wagner, E. (2014a). Using unscripted spoken texts in the teaching of second language listening. TESOL Journal, 5(2), 288–311. https://doi.org/10.1002/tesj.120
- Wagner, E., & Toth, P. D. (2014b). Teaching and testing L2 Spanish listening using scripted vs. unscripted texts. Foreign Language Annals, 47(3), 404–422. https://doi.org/10.1111/flan.12091