Hearing by seeing – The McGurk Effect

Humans use vision as well as hearing to understand speech. There is no better example of this than the McGurk effect.

Imagine a recording of a person saying “bah” over and over again. What sound are you going to hear? “Bah”, obviously.

However, if this sound is accompanied by a video of the same person saying “vah”, you will start hearing “vah” or “fah” instead, even though the audio has not changed. Just have a look at the video:

Because sound and vision are combined so early in the process of speech recognition, the illusion persists even after you know how it works. The brain cannot tell whether it is hearing or seeing the sound, because these senses usually work together subtly to let us comprehend what other people are saying.

Interestingly, most people are quite bad at lip-reading, yet they are easily fooled by the McGurk effect.

The strength of the McGurk effect also seems to vary with the language a person speaks. For Italian, German and English speakers the effect is very strong; for Chinese and Japanese speakers it is quite weak. This is likely due to the different syllabic structure and tone of Chinese and Japanese (compared to Indo-European languages) and possibly due to the culture of face avoidance in Japan (for example, children are instructed to look at a teacher’s Adam’s apple or tie knot instead of the eyes).


