Connected Speech

Connected speech refers to the way pronunciation changes when words are produced in natural, fluent sequences rather than spoken in isolation. A word's dictionary pronunciation (its citation form) is often dramatically different from how it sounds inside a phrase. "Want to" becomes /wɒnə/, "him" becomes /ɪm/, "next please" loses its /t/. These changes are not sloppy speech; they are systematic, rule-governed processes that all fluent speakers use.

Why It Matters

Connected speech is the single biggest reason learners struggle with listening comprehension. They learn words in isolation, then cannot recognize those same words in a stream of natural speech. The mismatch between what they expect to hear and what actually reaches their ears causes comprehension breakdown — not because the vocabulary is unknown, but because the phonological shape has changed.

For production, connected speech is what makes a speaker sound fluent rather than robotic. A learner who says each word separately with equal stress and full vowels sounds unnatural and is harder for native speakers to process, because the listener's brain expects the rhythmic compression that connected speech provides.

Core Processes

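Assimilation: a sound changes to become more like a neighboring sound. "Ten boys" → /tem bɔɪz/, where /n/ takes on the lip position of the following /b/; "don't you" → /dəʊntʃuː/, where /t/ and /j/ merge into /tʃ/. In the first case the mouth anticipates the next sound and adjusts early; in the second, two adjacent sounds fuse into one.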
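
The "ten boys" pattern can be sketched as a small rule. This is only a toy illustration: the function name `assimilate_final_n` and the consonant sets are assumptions made for the example, and the velar case ("ten girls" → /teŋ gɜːlz/) is added alongside the bilabial one for comparison. It handles only a word-final /n/, not assimilation in general.

```python
# Toy model of place assimilation on a word-final /n/.
# The consonant sets are simplified assumptions, not a full phonology of English.
BILABIALS = {"p", "b", "m"}
VELARS = {"k", "g", "ɡ"}


def assimilate_final_n(first: str, second: str) -> str:
    """Join two phonemic strings, adjusting a final /n/ to the next consonant."""
    if first.endswith("n") and second:
        next_sound = second[0]
        if next_sound in BILABIALS:
            first = first[:-1] + "m"  # /n/ -> /m/ before a lip consonant
        elif next_sound in VELARS:
            first = first[:-1] + "ŋ"  # /n/ -> /ŋ/ before a velar consonant
    return f"{first} {second}"


print(assimilate_final_n("ten", "bɔɪz"))   # tem bɔɪz
print(assimilate_final_n("ten", "gɜːlz"))  # teŋ gɜːlz
print(assimilate_final_n("ten", "deɪz"))   # ten deɪz (no change before /d/)
```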

Elision — A sound is deleted entirely. "Last night" → /lɑːs naɪt/ (loss of /t/); "comfortable" → /kʌmftəbl/ (syllable reduction). Particularly common with consonant clusters and unstressed syllables.

Linking: words are joined without a pause. The main types are consonant-to-vowel linking ("turn off" → /tɜː_nɒf/), intrusive /r/ ("law and order" → /lɔːr_ənd_ɔːdə/), and glide insertion ("go out" → /gəʊ_w_aʊt/).

Weak forms: function words (articles, prepositions, auxiliaries, pronouns) reduce to shorter, schwa-dominated pronunciations in unstressed positions. "Can" → /kən/, "for" → /fə/, "was" → /wəz/. English has over 40 common weak forms. They are the default; strong forms occur mainly when a function word carries emphatic or contrastive stress, stands at the end of a phrase, or is cited in isolation.
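
The default-weak behavior can be sketched as a lookup with two exceptions. The `WEAK_FORMS` table below contains only the three examples given above, and the `weaken` helper and its `stressed` parameter are hypothetical names chosen for illustration. The phrase-final rule is applied bluntly here, although in real speech it does not hold for every function word.

```python
# Minimal sketch: swap function words for weak forms unless they are
# stressed or phrase-final. The table is deliberately tiny (assumption).
WEAK_FORMS = {"can": "kən", "for": "fə", "was": "wəz"}


def weaken(words, stressed=frozenset()):
    """Return the words with unstressed, non-final function words reduced.

    `stressed` holds the indices of words that carry contrastive stress.
    Weak forms are wrapped in slashes to mark them as phonemic.
    """
    last = len(words) - 1
    result = []
    for index, word in enumerate(words):
        weak = WEAK_FORMS.get(word.lower())
        if weak is not None and index not in stressed and index != last:
            result.append(f"/{weak}/")
        else:
            result.append(word)  # strong form or content word: left as spelled
    return result


print(weaken(["I", "can", "wait", "for", "you"]))
# ['I', '/kən/', 'wait', '/fə/', 'you']
print(weaken(["I", "can", "wait", "for", "you"], stressed={1}))
# ['I', 'can', 'wait', '/fə/', 'you']
print(weaken(["Yes", "I", "can"]))
# ['Yes', 'I', 'can']
```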

Contractions — Grammaticalized reductions: "I will" → "I'll" /aɪl/, "cannot" → "can't" /kɑːnt/. Contractions are the written representation of what connected speech does naturally.

Teaching Connected Speech

Receptive work should come first. Learners need to be able to recognize connected forms in listening before they are asked to produce them. Effective techniques:

Dictation and dictogloss: expose learners to natural-speed speech and ask them to reconstruct it, noticing what they missed
Transcript comparison: learners listen, then compare what they heard with the written transcript, marking where connected speech features occur
Shadowing: mimicking natural speech in real time forces adoption of connected features
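
For transcript comparison, a teacher working with typed learner responses could automate the first pass. The sketch below is one possible approach, not an established tool: it uses Python's standard `difflib` to align the transcript with what the learner wrote, and the function name `find_mismatches` is an assumption made for the example. Word-level alignment is rough, so the output is a prompt for discussion rather than a diagnosis.

```python
from difflib import SequenceMatcher


def find_mismatches(transcript: str, heard: str):
    """Return (transcript_words, heard_words) pairs where the two differ."""
    reference = transcript.lower().split()
    attempt = heard.lower().split()
    matcher = SequenceMatcher(a=reference, b=attempt, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep both sides so the learner can see what was said
            # and what they wrote down in its place.
            mismatches.append((" ".join(reference[i1:i2]), " ".join(attempt[j1:j2])))
    return mismatches


print(find_mismatches("I want to go home", "I wanna go home"))
# [('want to', 'wanna')]
```

Each pair marks a spot where a connected speech feature may have hidden the citation form, which is where the follow-up listening should focus.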

For production, focus on high-frequency chunks where connected speech is most consistent: "want to" /wɒnə/, "going to" /gʌnə/, "have to" /hæftə/. These are better taught as fixed pronunciation patterns than as a grammar point and a pronunciation point handled separately.

Connected speech features are driven by rhythm: the stress-timed nature of English compresses unstressed syllables, which triggers weak forms, elision, and assimilation. Teaching these features in isolation from rhythm misses the underlying cause.
