ISCA Archive Interspeech 2012

Less errors with TTS? a dictation experiment with foreign language learners

Thomas Pellegrini, Ângela Costa, Isabel Trancoso

This article reports a contrastive study of the use of Text-To-Speech (TTS) synthesis instead of pre-recorded utterances in a dictation exercise given to students of European Portuguese as a second language (PSL). Forty sentences were extracted from a PSL student book. Twenty of them were synthesized and the other twenty were taken directly from the book's pre-recorded audio documents. The learners were asked to orthographically transcribe the audio sentences, which were presented in random order. The synthetic utterances proved easier to transcribe than the human ones, with word error rates of 26.6% and 33.9% respectively. This result was somewhat surprising since the synthetic voice was not built for learning purposes. Possible explanatory factors were the lower speech rate and the less-reduced pronunciation that characterized the TTS voice.
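
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference sentence and a transcription, divided by the number of reference words. The abstract does not describe the scoring procedure actually used, so the following is only a minimal sketch of the conventional definition; the function name `word_error_rate`, the whitespace tokenization, and the example sentences are illustrative assumptions, not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping); the study's own
    normalization rules are not known.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of an extra word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical dictation item: one substituted and one omitted word.
reference = "o gato dorme no sofá"
transcription = "o pato dorme sofá"
wer = word_error_rate(reference, transcription)
```

Under this definition, a rate such as 26.6% would mean that roughly one reference word in four was substituted, omitted, or accompanied by an inserted word in the learners' transcriptions.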

Index Terms: Computer-Assisted Language Learning, speech synthesis, dictation, European Portuguese