ISCA Archive AVSP 2003

Electrophysiology of auditory-visual speech integration

Virginie van Wassenhove, Ken W. Grant, David Poeppel

Abstract

Twenty-six native English speakers identified auditory (A), visual (V), and congruent and incongruent auditory-visual (AV) syllables while undergoing electroencephalography (EEG) in three experiments. In Experiment 1, unimodal (A, V) and bimodal (AV) stimuli were presented in separate blocks. In Experiment 2, the same stimuli were pseudo-randomized within the same blocks, providing a replication of Experiment 1 while testing the effect of participants' expectancy on the AV condition. In Experiment 3, McGurk fusion (audio /pa/ dubbed onto visual /ka/, eliciting the percept /ta/) and combination (audio /ka/ dubbed onto visual /pa/) stimuli were tested under visual attention.

EEG recordings show early effects of visual influence on auditory event-related potentials (P1/N1/P2 complex). Specifically, a robust amplitude reduction of the N1/P2 complex was observed (Experiments 1 and 2) that could not be accounted for solely by attentional effects (Experiment 3). The N1/P2 reduction was accompanied by a temporal facilitation (approximately 20 ms) of the P1/N1 and N1/P2 transitions in AV conditions. Additionally, incongruent syllables showed a different profile from congruent AV /ta/ over a large latency range (~50 to 350 ms post-auditory onset), which was influenced by the accuracy of identification of the visual stimuli presented unimodally.

Our results suggest that (i) auditory processing is modulated early on by visual speech inputs, in agreement with an early locus of AV speech interaction, (ii) natural precedence of visual kinematics facilitates auditory speech processing in the time domain, and (iii) the degree of temporal gain is a function of the saliency of visual speech inputs.
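
As an illustration of the two measures discussed above (N1/P2 amplitude reduction and latency facilitation), the following is a minimal sketch of how such quantities might be extracted from averaged waveforms. It is not the authors' analysis pipeline. The function names, the search windows (70-150 ms for N1, 150-250 ms for P2), and the synthetic Gaussian waveforms are all assumptions made for the example; the numbers are arbitrary and are not data from the study.

```python
import math

def gaussian(t, center, width, amp):
    """Single Gaussian deflection, used only to build synthetic waveforms."""
    return amp * math.exp(-((t - center) ** 2) / (2 * width ** 2))

def peak_in_window(times_ms, erp_uv, lo_ms, hi_ms, polarity):
    """Return (latency_ms, amplitude_uv) of the most negative (polarity=-1)
    or most positive (polarity=+1) sample inside [lo_ms, hi_ms]."""
    candidates = [(t, v) for t, v in zip(times_ms, erp_uv) if lo_ms <= t <= hi_ms]
    return max(candidates, key=lambda tv: polarity * tv[1])

def n1_p2_summary(times_ms, erp_uv, n1_window=(70, 150), p2_window=(150, 250)):
    """Peak latencies and N1-to-P2 peak-to-peak amplitude.
    The window bounds are illustrative assumptions, not values from the paper."""
    n1_lat, n1_amp = peak_in_window(times_ms, erp_uv, *n1_window, polarity=-1)
    p2_lat, p2_amp = peak_in_window(times_ms, erp_uv, *p2_window, polarity=+1)
    return {
        "n1_latency_ms": n1_lat,
        "p2_latency_ms": p2_lat,
        "n1_p2_amplitude_uv": p2_amp - n1_amp,
    }

# Synthetic averaged waveforms sampled every 1 ms (hypothetical values).
times_ms = list(range(-100, 401))
erp_a = [gaussian(t, 100, 20, -5.0) + gaussian(t, 200, 20, 4.0) for t in times_ms]
erp_av = [gaussian(t, 90, 20, -3.5) + gaussian(t, 190, 20, 2.8) for t in times_ms]

a_summary = n1_p2_summary(times_ms, erp_a)
av_summary = n1_p2_summary(times_ms, erp_av)

# Positive values mean the AV peak occurs earlier / is smaller than the A peak.
n1_latency_gain_ms = a_summary["n1_latency_ms"] - av_summary["n1_latency_ms"]
amplitude_reduction_uv = (
    a_summary["n1_p2_amplitude_uv"] - av_summary["n1_p2_amplitude_uv"]
)

print(a_summary)
print(av_summary)
print(n1_latency_gain_ms, amplitude_reduction_uv)
```

Picking the extreme sample within a fixed window is the simplest choice; with real, noisy recordings one would typically filter first or use a mean amplitude around the peak, and compare conditions statistically across participants.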