ISCA Archive Interspeech 2011

Pronunciation learning from continuous speech

Ibrahim Badr, Ian McGraw, James Glass

This paper explores the use of continuous speech data to learn stochastic lexicons. Building on previous work in which we augmented graphones with acoustic examples of isolated words, we extend our pronunciation mixture model framework to two domains containing spontaneous speech: a weather information retrieval spoken dialogue system and the academic lectures domain. We find that our learned lexicons outperform expert, hand-crafted lexicons in each domain.