ESL Saigon

Linking in the English language

In another article I mentioned a mechanical speech machine that produces words as separate units. Speech produced that way sounds strange even to a learner of English. In the real speech of native speakers of English, words are linked together, and the result is natural-sounding speech that such a machine cannot produce and that learners of English find difficult to reproduce.

Linking r

The most common case of linking is “linking r”. In some varieties of English the phoneme /r/ cannot occur in syllable-final position. However, when a word’s spelling suggests a final “r” and the next word begins with a vowel, the usual pronunciation includes the “r”. This is very often the case in British English, though not only there.

Here /hɪə(r)/ (some speakers of English pronounce the “r”, others do not)
Here are /hɪər ər/ (the final “r” of “here” is pronounced because the next word starts with a vowel)
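
The rule above can be written out as a small Python sketch. This is only a rough illustration, not a real pronunciation tool: it looks at spelling alone, so it assumes that a word "suggests a final r" when it ends in the letters "r" or "re", and that a word "begins with a vowel" when its first letter is a, e, i, o or u. Both assumptions fail for some words (for example "hour" begins with a vowel sound but a consonant letter). The function names and the linking mark are my own choices for the example.

```python
VOWEL_LETTERS = set("aeiou")
LINK = "\u203f"  # the undertie character, used here to mark a link between words


def has_linking_r(word: str, next_word: str) -> bool:
    """Spelling-based guess: is a linking r pronounced between the two words?"""
    w = word.lower()
    # Assumption: a final "r" or "re" in the spelling (here, far, more)
    # stands for a word whose spelling suggests a final r.
    ends_in_r = w.endswith("r") or w.endswith("re")
    # Assumption: a vowel letter at the start stands for a vowel sound.
    starts_with_vowel = next_word[:1].lower() in VOWEL_LETTERS
    return ends_in_r and starts_with_vowel


def mark_linking_r(phrase: str) -> str:
    """Join the words of a phrase, marking each place where a linking r is expected."""
    words = phrase.split()
    pieces = []
    for i, word in enumerate(words):
        pieces.append(word)
        if i + 1 < len(words):
            pieces.append(LINK if has_linking_r(word, words[i + 1]) else " ")
    return "".join(pieces)


print(mark_linking_r("here are"))     # the two words are joined by the linking mark
print(mark_linking_r("here we are"))  # no linking mark: "we" starts with a consonant
```

A learner could use a list marked in this way as a reminder of where to expect an "r" sound that is silent when the word is said alone.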

There is another case, one that I find strange. Many native speakers of English use “r” in the same way to link words even when the spelling of the first word has no final “r”, as shown below.

Media event: the spelling suggests /ˈmidiə ɪˈvɛnt/, but it is often pronounced /ˈmidiər ɪˈvɛnt/.

The example above shows what is called “intrusive r”. It is widespread among native speakers of English, although some speakers and teachers of English still regard it as sub-standard pronunciation.


“Linking r” and “intrusive r” are special cases of juncture. “Juncture” refers to the relationship between one sound and the sound (or sounds) that immediately precede and follow it. I find the two words “my turn” to be a good example of juncture.

My turn /maɪ tɜrn/

The problem is deciding what the relationship is between /aɪ/ and /t/. Speakers do not usually pause between words, so there is no silence to indicate the word division or to justify the space left in the transcription.

Learners of English might hear “might earn” /maɪt ɜrn/ instead of “my turn” /maɪ tɜrn/. What makes the difference between /maɪt ɜrn/ and /maɪ tɜrn/ perceptible?

The answer is that in “my turn” the /t/ is in initial position and is aspirated, while in “might earn” the /t/ is in final position and is not aspirated. We can conclude that the position of a word boundary has some effect on the realization of the /t/ phoneme.
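
The same point can be shown with a short Python sketch that turns the broad transcription into a narrower one. It is a deliberate simplification: it assumes that the space in the transcription marks the word boundary, and it aspirates a /t/ only when it is the first sound of a word, ignoring every other factor that affects aspiration (for example, it would wrongly treat the “t” of /tʃ/ as a plain /t/). The function name is my own.

```python
def realize_t(transcription: str) -> str:
    """Mark word-initial /t/ as aspirated; leave every other /t/ unchanged."""
    narrow_words = []
    for word in transcription.split():
        # Assumption: only a /t/ at the very start of a word is aspirated.
        if word.startswith("t"):
            word = "tʰ" + word[1:]
        narrow_words.append(word)
    return " ".join(narrow_words)


print(realize_t("maɪ tɜrn"))  # "my turn": the /t/ is word-initial
print(realize_t("maɪt ɜrn"))  # "might earn": the /t/ is word-final
```

The two phrases contain the same phonemes in the same order, yet the sketch gives them different outputs purely because the boundary falls in a different place.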

Of course, the context in which the words occur usually makes clear where the boundary is, but the context is not always clear to learners of English. There is clearly a great difference between words pronounced in isolation and words pronounced in connected speech.

Perhaps the most important consequence of what is described in this article is that learners of English must be aware of the problems they will meet when listening to colloquial, connected speech.
