ISCA Archive SpeechProsody 2006

F0 characteristics of yes-no question intonation in Arabic and English: disambiguation techniques for use in ASR

Leslie Barrett, Kazue Hata

This paper presents preliminary research into the possibility of using F0 (fundamental frequency) information to enhance the performance of speech-to-speech translation engines and speech recognition software for Arabic and English. Specifically, we aim to find factors that differentiate yes-no questions from other sentence types in both languages. Although previous research using cross-linguistic question data has shown F0 rise to be the main indicator of yes-no questions, the particular F0 characteristics that listeners use as perceptual cues have varied. Using comparative language data, this study aimed to find reliable question indicators that can be detected by automated means. In an experiment with short sentences read by a native speaker of each language, we examined aspects of the F0 contours in the two languages to find reliable recognition thresholds. Results indicate that reliable indicators of yes-no questions do exist for both languages and occur within the sentence-final 50 centiseconds.
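
To illustrate the kind of automated check the abstract describes, the following is a minimal sketch that measures the F0 change within the sentence-final 50 centiseconds (0.5 s) of an utterance and compares it with a threshold. It is not the authors' method: the function names, the 10 ms frame period, the convention of marking unvoiced frames as None or 0, and the 20 Hz rise threshold are all hypothetical placeholders chosen for illustration. The thresholds reported in the paper may differ and may be language-specific.

```python
from typing import Optional, Sequence


def final_f0_rise(
    f0_hz: Sequence[Optional[float]],
    frame_s: float = 0.01,   # assumed frame period (10 ms); not from the paper
    window_s: float = 0.5,   # sentence-final 50 centiseconds, as in the abstract
) -> Optional[float]:
    """Return the F0 change in Hz between the first and last voiced frame
    of the final window, or None if fewer than two voiced frames exist."""
    n_frames = max(1, round(window_s / frame_s))
    # Keep only voiced frames; unvoiced frames are assumed to be None or 0.
    tail = [f for f in f0_hz[-n_frames:] if f is not None and f > 0]
    if len(tail) < 2:
        return None
    return tail[-1] - tail[0]


def looks_like_yes_no_question(
    f0_hz: Sequence[Optional[float]],
    rise_threshold_hz: float = 20.0,  # hypothetical threshold, for illustration only
    frame_s: float = 0.01,
    window_s: float = 0.5,
) -> bool:
    """Flag an utterance as a candidate yes-no question when the final-window
    F0 rise meets or exceeds the threshold."""
    rise = final_f0_rise(f0_hz, frame_s, window_s)
    return rise is not None and rise >= rise_threshold_hz


if __name__ == "__main__":
    # Synthetic contours (1 s at 10 ms frames), not data from the study.
    falling = [200.0 - 0.5 * i for i in range(100)]
    rising = [180.0] * 50 + [180.0 + 1.0 * i for i in range(50)]
    print(looks_like_yes_no_question(falling))
    print(looks_like_yes_no_question(rising))
```

A single end-minus-start difference is the simplest possible measure of a final rise. A practical system would more likely smooth the contour, work in semitones rather than Hz to normalize across speakers, and tune the threshold separately for each language.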