ISCA Archive ICSLP 1992

A lexicon for a text-to-speech system

Leon Gulikers, Rijk Willemse

This paper describes the development of a full-form lexicon combined with an algorithm for quasi-morphological decomposition, aimed at improving grapheme-to-phoneme conversion, word stress assignment, syllabification and word class assignment in a text-to-speech system. We explain how the optimal size of the lexicon was determined. We also describe a deterministic algorithm that decomposes words not found in the lexicon into a sequence of lexicon entries, prefixes, suffixes and infixes. The performance of the combined lexicon and decomposition system is evaluated on a newspaper corpus of approximately 100,000 words. The system appears to handle more than 95% of the regular words in the test corpus correctly. It still needs to be extended with a module that handles proper names.
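
The abstract does not spell out the decomposition algorithm, so the following is only a minimal sketch of one way a deterministic decomposition into lexicon entries, prefixes, suffixes and infixes could work. Everything in it is an assumption made for illustration: the toy inventories, the state names, the ordering rules (prefixes first, infixes only between entries, suffixes last) and the longest-match preference are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Toy inventories, invented for illustration only (not from the paper).
LEXICON = {"werk", "dag", "boek", "handel"}
PREFIXES = {"ver", "be"}
SUFFIXES = {"ing", "en", "s"}
INFIXES = {"s", "en"}

Part = Tuple[str, str]  # (kind, string)

# Assumed ordering rules: for each state, the part kinds that may come next,
# listed in a fixed order so that the search is deterministic.
TRANSITIONS = {
    "start": [("prefix", PREFIXES, "start"), ("entry", LEXICON, "entry")],
    "entry": [("entry", LEXICON, "entry"),
              ("infix", INFIXES, "infix"),
              ("suffix", SUFFIXES, "suffix")],
    "infix": [("entry", LEXICON, "entry")],
    "suffix": [("suffix", SUFFIXES, "suffix")],
}
ACCEPTING = {"entry", "suffix"}  # a word may end after an entry or a suffix


def _parse(word: str, pos: int, state: str) -> Optional[List[Part]]:
    """Return the first complete analysis of word[pos:], or None."""
    if pos == len(word):
        return [] if state in ACCEPTING else None
    for kind, inventory, next_state in TRANSITIONS[state]:
        # Prefer the longest matching piece; backtrack if the rest fails.
        for end in range(len(word), pos, -1):
            piece = word[pos:end]
            if piece in inventory:
                rest = _parse(word, end, next_state)
                if rest is not None:
                    return [(kind, piece)] + rest
    return None


def decompose(word: str) -> Optional[List[Part]]:
    """Look the word up in the lexicon first; decompose it only on a miss."""
    word = word.lower()
    if word in LEXICON:
        return [("entry", word)]
    return _parse(word, 0, "start")


if __name__ == "__main__":
    for w in ["werk", "werkdag", "handelsdag", "verwerking", "xyz"]:
        print(w, decompose(w))
```

Because the candidate kinds and lengths are always tried in the same fixed order, the same input always yields the same analysis. A word that cannot be covered returns `None`, which is where a separate component such as the proper-name module mentioned above would have to take over.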