ISCA Archive Interspeech 2019
ISCA Archive Interspeech 2019

Unified Verbalization for Speech Recognition & Synthesis Across Languages

Sandy Ritchie, Richard Sproat, Kyle Gorman, Daan van Esch, Christian Schallhart, Nikos Bampounis, Benoît Brard, Jonas Fromseier Mortensen, Millie Holt, Eoin Mahon

We describe a new approach to converting written tokens to their spoken form, which can be shared by automatic speech recognition (ASR) and text-to-speech synthesis (TTS) systems. Both ASR and TTS need to map from the written to the spoken domain, and we present an approach that enables us to share verbalization grammars between the two systems while exploiting linguistic commonalities to provide simple default verbalizations. We also describe improvements to an induction system for number names grammars. Between these shared ASR/TTS verbalizers and the improved induction system for number names grammars, we achieve significant gains in development time and scalability across languages.