ISCA Archive Eurospeech 1999

A novel language model based on self-organized learning

Taiyi Huang, Langzhou Chen

Statistical language models are very important to speech recognition. For a system targeting a specific topic, a domain-dependent language model performs much better than a general model. Traditional methods of training topic-dependent models have two problems: (1) the corpus for a specific topic is much smaller than a general corpus, and (2) an individual article often relates to more than one topic, a phenomenon that traditional methods do not take into account. This paper tries to solve these two problems. We present a new method of organizing the corpus, based on fuzzy training subsets, and train the domain-dependent models on these fuzzy subsets. In addition, a self-organized learning approach is introduced into the training process to improve the models' predictive ability. Self-organized learning markedly improves the performance of the models.
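
The idea of a fuzzy training subset can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how membership degrees are obtained or which model order is used, so the sketch assumes that each article already carries a membership degree in [0, 1] for every topic it relates to, and that a bigram model is trained per topic. The names `train_fuzzy_bigram_counts` and `bigram_prob` are hypothetical. Each article contributes fractional counts to every topic it belongs to, so topics with little dedicated text can still draw on partially related articles.

```python
from collections import defaultdict


def train_fuzzy_bigram_counts(articles):
    """Accumulate membership-weighted bigram counts per topic.

    articles: list of (tokens, memberships) pairs, where memberships maps
    a topic name to an assumed membership degree in [0, 1].
    """
    bigram = defaultdict(lambda: defaultdict(float))   # topic -> (prev, word) -> weight
    history = defaultdict(lambda: defaultdict(float))  # topic -> prev -> weight
    for tokens, memberships in articles:
        for topic, degree in memberships.items():
            # An article adds `degree` (not 1) to each bigram it contains.
            for prev, word in zip(tokens, tokens[1:]):
                bigram[topic][(prev, word)] += degree
                history[topic][prev] += degree
    return bigram, history


def bigram_prob(bigram, history, topic, prev, word):
    """Unsmoothed relative-frequency estimate of P(word | prev) for one topic."""
    total = history[topic][prev]
    if total == 0.0:
        return 0.0
    return bigram[topic][(prev, word)] / total


if __name__ == "__main__":
    # Hypothetical toy corpus: the first article belongs to two topics.
    articles = [
        (["stock", "price", "rises"], {"finance": 0.8, "sports": 0.2}),
        (["stock", "market", "falls"], {"finance": 1.0}),
    ]
    bigram, history = train_fuzzy_bigram_counts(articles)
    print(bigram_prob(bigram, history, "finance", "stock", "price"))
```

The estimate is left unsmoothed to keep the sketch short; a practical model would add smoothing or back-off, and the self-organized learning step mentioned in the abstract is not modeled here.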