ISCA Archive Eurospeech 1997

Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition

Michael Finke, Alex Waibel

In spontaneous conversational speech there is a large amount of variability due to accents, speaking styles and speaking rates (also known as the speaking mode) [3]. Because current recognition systems usually use only a relatively small number of pronunciation variants for the words in their dictionaries, the amount of variability that can be modeled is limited. Increasing the number of variants per dictionary entry is the obvious solution. Unfortunately, this also increases the confusability between dictionary entries, and thus often leads to an actual decrease in performance. In this paper we present a framework for speaking mode dependent pronunciation modeling. The probability of encountering a pronunciation variant is defined to be a function of the speaking style. This probability function is learned with decision trees from pronunciation variants that are generated by rules and observed on the Switchboard corpus. Applying the framework significantly improves the performance of our state-of-the-art Janus Recognition Toolkit Switchboard recognizer.
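
To make the central idea concrete, the sketch below estimates the probability of a pronunciation variant as a function of one speaking mode feature. It is a toy illustration under stated assumptions, not the authors' implementation: the single feature (speaking rate), the one-split tree, the variant labels `full` and `reduced`, and all sample values are hypothetical, and the abstract does not specify the features or the tree-growing procedure that were actually used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of variant labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def distribution(labels):
    """Relative frequency of each variant label."""
    total = len(labels)
    return {v: c / total for v, c in Counter(labels).items()}


def fit_stump(samples):
    """Fit a one-split decision tree for a single word.

    samples: list of (speaking_rate, variant) pairs (hypothetical data).
    Returns (threshold, slow_distribution, fast_distribution).
    """
    rates = sorted({r for r, _ in samples})
    if len(rates) < 2:
        # No split possible: both leaves share the overall distribution.
        dist = distribution([v for _, v in samples])
        return math.inf, dist, dist
    best = None
    for lo, hi in zip(rates, rates[1:]):
        thr = (lo + hi) / 2
        left = [v for r, v in samples if r <= thr]
        right = [v for r, v in samples if r > thr]
        # Size-weighted average entropy of the two leaves.
        cost = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(samples)
        if best is None or cost < best[0]:
            best = (cost, thr, left, right)
    _, thr, left, right = best
    return thr, distribution(left), distribution(right)


def variant_probability(stump, rate, variant):
    """P(variant | speaking rate) read off the matching leaf."""
    thr, slow, fast = stump
    leaf = slow if rate <= thr else fast
    return leaf.get(variant, 0.0)


# Hypothetical observations for one word: (phones per second, variant).
samples = [
    (8.0, "full"), (9.0, "full"), (10.0, "full"), (10.5, "reduced"),
    (14.0, "reduced"), (15.0, "reduced"), (15.5, "full"), (16.0, "reduced"),
]
stump = fit_stump(samples)
p_slow_full = variant_probability(stump, 9.0, "full")
p_fast_reduced = variant_probability(stump, 15.0, "reduced")
```

In this toy setting the reduced variant receives a high probability only at fast speaking rates, so it is less likely to compete with other dictionary entries when the speech is slow. That is the intuition behind making variant probabilities depend on the speaking mode rather than adding variants with fixed weights.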


doi: 10.21437/Eurospeech.1997-625

Cite as: Finke, M., Waibel, A. (1997) Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2379-2382, doi: 10.21437/Eurospeech.1997-625

@inproceedings{finke97_eurospeech,
  author={Michael Finke and Alex Waibel},
  title={{Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2379--2382},
  doi={10.21437/Eurospeech.1997-625},
  issn={1018-4074}
}