ISCA Archive Eurospeech 1999

Acoustics-based baseform generation with pronunciation and/or phonotactic models

Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah

In this paper, we describe a method to derive a phonetic pronunciation of a word using only an acoustic utterance of that word, without a priori knowledge of its spelling. In [5] and [6], we used a pronunciation model based on bigram statistics. Bigram statistics constrain only the left-neighbor phone and therefore yield phone sequences that are only pairwise appropriate. Here, we apply a pronunciation model in combination with a phonotactic model, which serves as a language model to constrain the phone sequences produced. Error rates with and without the phonotactic model are presented.
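
To illustrate what "only pairwise appropriate" means, the following is a minimal sketch of a phone bigram scorer. It is not the model used in the paper: the toy phone sequences, the add-one smoothing, and the function names `train_bigram` and `bigram_logprob` are all assumptions made for illustration. Each phone is scored given its left neighbor only, so no wider context enters the score.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count phone bigrams, padding each sequence with start/end markers."""
    pair_counts = Counter()   # counts of (left phone, right phone)
    left_counts = Counter()   # counts of each phone as a left neighbor
    vocab = set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for left, right in zip(padded, padded[1:]):
            pair_counts[(left, right)] += 1
            left_counts[left] += 1
    return pair_counts, left_counts, vocab

def bigram_logprob(seq, model):
    """Log-probability of a phone sequence under an add-one smoothed bigram."""
    pair_counts, left_counts, vocab = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for left, right in zip(padded, padded[1:]):
        # Each phone depends only on its immediate left neighbor.
        num = pair_counts[(left, right)] + 1
        den = left_counts[left] + len(vocab)
        total += math.log(num / den)
    return total

# Hypothetical training pronunciations (toy data, not from the paper).
training = [["K", "AE", "T"], ["K", "AA", "R"], ["B", "AE", "T"]]
model = train_bigram(training)

seen = bigram_logprob(["K", "AE", "T"], model)
scrambled = bigram_logprob(["T", "AE", "K"], model)
print(seen > scrambled)
```

Because the score is a sum of independent pairwise terms, a sequence can score well whenever each adjacent pair is common, even if the sequence as a whole is implausible. That limitation is what the additional phonotactic model is meant to address.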