ISCA Archive Eurospeech 1997

Variable-length language modeling integrating global constraints

Shoichi Matsunaga, Shigeki Sagayama

This paper proposes a novel variable-length class-based language model that integrates local and global constraints. In this model, the classes are iteratively recreated by grouping consecutive words and by splitting initial part-of-speech (POS) clusters into finer clusters (word-classes). The main characteristic of this modeling is that these grouping and splitting operations are carried out selectively, taking into account global constraints between noncontiguous words on the basis of a minimum entropy criterion. To capture the global constraints, the model takes into account the sequences of function words and of content words, which are expected to represent the syntactic and semantic relationships between words, respectively. Experiments showed that the proposed model achieves lower perplexity on the test corpus than conventional models while requiring only a small number of statistical parameters, demonstrating its effectiveness.
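To make the entropy criterion concrete, the following is a minimal sketch of how splitting a class can be judged by the perplexity it yields. It is not the authors' algorithm: it uses a generic, unsmoothed class bigram model, P(w_i | w_{i-1}) = P(c_i | c_{i-1}) P(w_i | c_i), and it ignores word grouping and the global function-word and content-word constraints. The toy corpus, the class labels, and the function names `train_class_bigram` and `perplexity` are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_class_bigram(sentences, word2class):
    """Collect maximum-likelihood counts for an unsmoothed class bigram model."""
    class_bigrams = Counter()   # counts of (previous class, class)
    class_context = Counter()   # counts of previous class
    word_counts = Counter()     # counts of each word
    class_counts = Counter()    # counts of each class
    for sent in sentences:
        prev = "<s>"            # sentence-start symbol (assumed convention)
        for w in sent:
            c = word2class[w]
            class_bigrams[(prev, c)] += 1
            class_context[prev] += 1
            word_counts[w] += 1
            class_counts[c] += 1
            prev = c
    return class_bigrams, class_context, word_counts, class_counts

def perplexity(sentences, word2class, model):
    """Perplexity 2^H, where H is the per-word cross-entropy in bits."""
    class_bigrams, class_context, word_counts, class_counts = model
    log_prob, n = 0.0, 0
    for sent in sentences:
        prev = "<s>"
        for w in sent:
            c = word2class[w]
            p_class = class_bigrams[(prev, c)] / class_context[prev]
            p_word = word_counts[w] / class_counts[c]
            log_prob += math.log2(p_class * p_word)
            n += 1
            prev = c
    return 2 ** (-log_prob / n)

# Hypothetical toy corpus; evaluated on the training data because the
# model is unsmoothed and would assign zero probability to unseen events.
corpus = [["the", "cat", "sleeps"], ["the", "cat", "sleeps"],
          ["a", "dog", "runs"], ["a", "dog", "runs"]]

# Coarse POS-like classes versus a finer split with one class per word.
coarse = {"the": "DET", "a": "DET", "cat": "NOUN", "dog": "NOUN",
          "sleeps": "VERB", "runs": "VERB"}
fine = {w: w.upper() for w in coarse}

pp_coarse = perplexity(corpus, coarse, train_class_bigram(corpus, coarse))
pp_fine = perplexity(corpus, fine, train_class_bigram(corpus, fine))
print(pp_coarse, pp_fine)
```

On this toy data the finer classes give the lower perplexity, so a selective procedure would accept the split. A practical version would need smoothing and a held-out corpus, and would weigh the entropy gain against the growth in the number of parameters.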


doi: 10.21437/Eurospeech.1997-686

Cite as: Matsunaga, S., Sagayama, S. (1997) Variable-length language modeling integrating global constraints. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2719-2722, doi: 10.21437/Eurospeech.1997-686

@inproceedings{matsunaga97_eurospeech,
  author={Shoichi Matsunaga and Shigeki Sagayama},
  title={{Variable-length language modeling integrating global constraints}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2719--2722},
  doi={10.21437/Eurospeech.1997-686},
  issn={1018-4074}
}