ISCA Archive Eurospeech 1997

Chinese language model adaptation based on document classification and multiple domain-specific language models

Sung-Chien Lin, Chi-Lung Tsai, Lee-Feng Chien, Ker-Jiann Chen, Lin-Shan Lee

Adapting language models to specific subject domains is important for real speech recognition applications. This paper presents a Chinese language model adaptation approach based mainly on document classification and multiple domain-specific language models. The proposed document classification method uses the perplexity value and the word bigram coverage value as its primary measures, so it can model word associations and syntactic behavior when classifying documents into clusters, and thus creates more effective domain-specific language models. Language model adaptation in speech recognition can therefore be achieved effectively by selecting the most appropriate domain-specific language model. Preliminary tests applying the approach to Mandarin speech recognition have shown exciting performance for creating real applications.
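The two measures named in the abstract, perplexity and word bigram coverage, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their formulas, so the add-one smoothing, the coverage definition (fraction of a document's bigrams seen in a domain's training text), the tie-breaking rule, and all names (`BigramModel`, `select_domain`, the toy corpora) are assumptions made for illustration. Input text is assumed to be already segmented into words.

```python
import math
from collections import Counter


def bigrams(words):
    """Return the list of adjacent word pairs in a word sequence."""
    return list(zip(words, words[1:]))


class BigramModel:
    """Word bigram model with add-one smoothing (an assumed smoothing choice)."""

    def __init__(self, sentences):
        self.vocab = set()
        self.history = Counter()  # how often each word starts a bigram
        self.bigram = Counter()   # how often each word pair occurs
        for words in sentences:
            self.vocab.update(words)
            for prev, word in bigrams(words):
                self.history[prev] += 1
                self.bigram[(prev, word)] += 1
        self.vocab_size = len(self.vocab) + 1  # +1 slot for unseen words

    def prob(self, prev, word):
        """Smoothed estimate of P(word | prev)."""
        return (self.bigram[(prev, word)] + 1) / (self.history[prev] + self.vocab_size)

    def perplexity(self, words):
        """Per-bigram perplexity of a word sequence under this model."""
        pairs = bigrams(words)
        if not pairs:
            return float("inf")
        log_sum = sum(math.log(self.prob(prev, word)) for prev, word in pairs)
        return math.exp(-log_sum / len(pairs))

    def coverage(self, words):
        """Fraction of the sequence's bigrams that occur in the training text."""
        pairs = bigrams(words)
        if not pairs:
            return 0.0
        return sum(1 for pair in pairs if pair in self.bigram) / len(pairs)


def select_domain(models, words):
    """Pick the domain with the lowest perplexity; higher coverage breaks ties."""
    return min(
        models,
        key=lambda name: (models[name].perplexity(words), -models[name].coverage(words)),
    )


# Hypothetical toy corpora, already segmented into words.
models = {
    "finance": BigramModel([["股票", "市場", "上漲"], ["股票", "價格", "下跌"]]),
    "sports": BigramModel([["球隊", "贏得", "比賽"], ["球員", "進球", "得分"]]),
}
document = ["股票", "市場", "下跌"]
print(select_domain(models, document))
```

In this sketch the same scoring serves both steps described in the abstract: assigning a training document to a cluster and choosing which domain-specific model to use at recognition time. How the original method combines the two measures is not stated in the abstract, so the lexicographic ordering used here is only one possible choice.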


doi: 10.21437/Eurospeech.1997-424

Cite as: Lin, S.-C., Tsai, C.-L., Chien, L.-F., Chen, K.-J., Lee, L.-S. (1997) Chinese language model adaptation based on document classification and multiple domain-specific language models. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1463-1466, doi: 10.21437/Eurospeech.1997-424

@inproceedings{lin97c_eurospeech,
  author={Sung-Chien Lin and Chi-Lung Tsai and Lee-Feng Chien and Ker-Jiann Chen and Lin-Shan Lee},
  title={{Chinese language model adaptation based on document classification and multiple domain-specific language models}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1463--1466},
  doi={10.21437/Eurospeech.1997-424},
  issn={1018-4074}
}