ISCA Archive ICSLP 2002

Backoff hierarchical class n-gram language modelling for automatic speech recognition systems

Imed Zitouni, Olivier Siohan, Hong-Kwang Jeff Kuo, Chin-Hui Lee

In this paper, we propose an extension of the backoff word n-gram language model that gives better likelihood estimates for unseen events. Instead of using the (n-1)-gram to estimate the probability of an unseen n-gram, the proposed approach uses a class hierarchy to define a context that is more general than the unseen n-gram but more specific than the (n-1)-gram. Each node in the hierarchy is a class containing all the words of its descendant nodes (classes). Hence, the closer a node is to the root, the more general the corresponding class. Performance is evaluated in terms of both test perplexity and word error rate (WER) on a simplified WSJ database. Experiments show an improvement of more than 26% in perplexity on unseen events.
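
The backoff idea described in the abstract can be illustrated with a minimal Python sketch for the bigram case. This is not the authors' estimator: the toy hierarchy `PARENT`, the helper names `ancestors`, `train` and `score`, and the fixed backoff weight `alpha` are all hypothetical, and the scores are not normalized probabilities (no discounting is applied). The sketch also assumes that it is the context word that gets generalized to its classes. It only shows the order in which contexts are tried: the word itself, then increasingly general classes, then the unigram.

```python
from collections import Counter

# Hypothetical toy hierarchy: each word or class maps to its parent class.
# A class implicitly contains every word below it; ROOT contains everything.
PARENT = {
    "monday": "DAY", "tuesday": "DAY",
    "DAY": "TIME", "june": "TIME",
    "TIME": "ROOT", "on": "ROOT", "meet": "ROOT",
}


def ancestors(word):
    """Return the classes containing `word`, most specific first, excluding ROOT."""
    chain = []
    node = PARENT.get(word)
    while node is not None and node != "ROOT":
        chain.append(node)
        node = PARENT.get(node)
    return chain


def train(sentences):
    """Count unigrams, and bigrams whose context is a word or any class above it."""
    unigrams = Counter()
    context_counts = Counter()
    bigrams = Counter()
    for sent in sentences:
        unigrams.update(sent)
        for prev, word in zip(sent, sent[1:]):
            for ctx in [prev] + ancestors(prev):
                bigrams[(ctx, word)] += 1
                context_counts[ctx] += 1
    return unigrams, context_counts, bigrams


def score(prev, word, model, alpha=0.4):
    """Unnormalized backoff score for `word` following `prev`.

    Tries the word context first, then each class from specific to general.
    Each backoff step multiplies the score by `alpha` (an assumed constant,
    standing in for properly estimated backoff weights).
    """
    unigrams, context_counts, bigrams = model
    weight = 1.0
    for ctx in [prev] + ancestors(prev):
        if bigrams[(ctx, word)] > 0:
            return weight * bigrams[(ctx, word)] / context_counts[ctx]
        weight *= alpha
    # Root of the hierarchy: the context carries no information (unigram).
    return weight * unigrams[word] / sum(unigrams.values())
```

With training sentences `["meet", "on", "monday"]` and `["on", "tuesday", "meet"]`, the bigram ("monday", "meet") is unseen, but the class context ("DAY", "meet") has been observed through "tuesday", so `score("monday", "meet", model)` is taken from the class DAY after one backoff step instead of falling all the way back to the unigram.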