ISCA Archive ISCSLP 2008

A Three-stage Text Normalization Strategy For Mandarin Text-to-speech Systems

Tao Zhou, Yuan Dong, De-zhi Huang, Wu Liu, Hai-la Wang

Text normalization is an important component of Mandarin Text-to-Speech (TTS) systems. This paper develops a taxonomy of Non-Standard Words (NSWs) based on a large-scale Chinese corpus and proposes a three-stage text normalization strategy: Finite State Automata (FSA) for initial classification, a Maximum Entropy (ME) classifier combined with rules for further classification, and general rules for standard word conversion. In experiments, the three-stage approach achieves a precision of 96.02%, which is 5.21% higher than that of a simple rule-based approach and 2.21% higher than that of a simple machine learning method. Experimental results show that the three-stage disambiguation strategy considerably improves text normalization and works well in a real TTS system.

Index Terms: Text-to-Speech, Text Normalization, Finite State Automata (FSA), Maximum Entropy (ME) Classifier, Standard Word Conversion
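The three stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: regular expressions stand in for the finite state automata, a pair of handwritten context rules stands in for the trained ME classifier, and the NSW classes (`date`, `percent`, `digit_string`, `cardinal`) as well as all function names are hypothetical choices made for illustration, not the paper's taxonomy.

```python
import re

DIGITS = "零一二三四五六七八九"

# Stage 1 (assumption): regexes stand in for the FSA's coarse classification.
COARSE_PATTERNS = [
    ("date", re.compile(r"[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}")),
    ("percent", re.compile(r"[0-9]+(?:\.[0-9]+)?%")),
    ("number", re.compile(r"[0-9]+")),
]

def detect_nsw(text):
    """Scan left to right and return (start, end, token, coarse_class) tuples."""
    found, pos = [], 0
    while pos < len(text):
        for label, pattern in COARSE_PATTERNS:
            m = pattern.match(text, pos)
            if m:
                found.append((m.start(), m.end(), m.group(), label))
                pos = m.end()
                break
        else:
            pos += 1
    return found

def refine(text, end, token, coarse):
    """Stage 2 (assumption): context rules in place of a trained ME classifier."""
    if coarse != "number":
        return coarse
    # A number followed by 年 (year) or longer than four digits is read digit by digit.
    if text[end:end + 1] == "年" or len(token) > 4:
        return "digit_string"
    return "cardinal"

def read_digits(token):
    return "".join(DIGITS[int(c)] for c in token)

def read_cardinal(n):
    """Read an integer in the range 0..9999 as a Mandarin cardinal."""
    if n == 0:
        return "零"
    units = ["", "十", "百", "千"]
    digits = str(n)
    out, pending_zero = "", False
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 0:
            pending_zero = True
            continue
        if pending_zero and out:
            out += "零"  # one 零 covers any run of internal zeros
        pending_zero = False
        out += DIGITS[d] + units[len(digits) - 1 - i]
    # 10..19 are read 十, 十一, ... rather than 一十, 一十一, ...
    return out[1:] if 10 <= n <= 19 else out

def convert(token, fine):
    """Stage 3: general rules that turn a classified NSW into standard words."""
    if fine == "digit_string":
        return read_digits(token)
    if fine == "cardinal":
        return read_cardinal(int(token))
    if fine == "percent":
        whole, _, frac = token[:-1].partition(".")
        spoken = "百分之" + read_cardinal(int(whole))
        return spoken + ("点" + read_digits(frac) if frac else "")
    if fine == "date":
        year, month, day = token.split("-")
        return (read_digits(year) + "年" + read_cardinal(int(month)) + "月"
                + read_cardinal(int(day)) + "日")
    return token

def normalize(text):
    out, last = [], 0
    for start, end, token, coarse in detect_nsw(text):
        out.append(text[last:start])
        out.append(convert(token, refine(text, end, token, coarse)))
        last = end
    out.append(text[last:])
    return "".join(out)
```

For example, `normalize("2008年有25人")` reads the year digit by digit and the count as a cardinal, while `normalize("精度96.02%")` expands the percentage. Keeping classification (stages 1 and 2) separate from conversion (stage 3) means the same conversion rules can be reused whichever classifier assigns the label.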