ISCA Archive Interspeech 2012

The intelligibility of Lombard speech: communicative setting matters

Michael Fitzpatrick, Jeesun Kim, Chris Davis

Recently we reported that talkers modify their speech production strategies in noise as a function of whether their interlocutor could or could not be seen, i.e., in face-to-face (FTF) or non-visual (NV) conditions. Participants made greater auditory speech modifications (e.g., in amplitude and F0) in the NV condition, and greater visual speech modifications (in inter-lip area) in the FTF condition. The current study examined whether such modifications lead to corresponding differences in speech intelligibility in the different settings. In the experiment, participants were presented with a set of consonant-vowel-consonant (CVC) phonemes in noise at a fixed signal-to-noise ratio (SNR) in auditory-only, visual-only and auditory-visual (AV) conditions. The CVC stimuli were drawn from speech recorded in quiet and in noise, and in both the NV and FTF conditions. The results showed that speech-in-noise tokens produced in the FTF conditions had a greater AV benefit than tokens produced in the NV conditions. In addition, the AV benefit was greater for speech tokens produced in noise than for those produced in quiet. The results are discussed in terms of efficient talker and listener strategies.

Index Terms: Lombard speech, AV speech, speech production.