ISCA Archive Interspeech 2009

Audio-visual prosody of social attitudes in Vietnamese: building and evaluating a tones balanced corpus

Dang-Khoa Mac, Véronique Aubergé, Albert Rilliard, Eric Castelli

This paper presents the construction and a first evaluation of a tone-balanced audio-visual corpus of social affect in the Vietnamese language. This under-resourced tonal language has specific glottalization and co-articulation phenomena, whose interactions with the prosody of attitudes raise an interesting research question. A well-controlled recording methodology was designed to build a large, representative audio-visual corpus covering 16 attitudes produced by one speaker. A perception experiment was carried out to evaluate the speaker’s perceived performance and to study the role and integration of audio, visual, and audio-visual information in listeners’ perception of the speaker’s attitudes. The results reveal characteristics of Vietnamese prosodic attitudes and allow us to investigate such social affect in the Vietnamese language.