ISCA Archive Interspeech 2015

Multiscale recurrent neural network based language model

Tsuyoshi Morioka, Tomoharu Iwata, Takaaki Hori, Tetsunori Kobayashi

We describe a novel recurrent neural network-based language model (RNNLM) that handles multiple time-scales of context. The RNNLM has become a de facto standard in language modeling because it can retain context over some span of preceding words. However, a conventional RNNLM handles context at only a single time-scale, regardless of the upcoming words or the topic of the spoken utterance, even though the optimal time-scale of context can vary with these conditions. In contrast, our multiscale RNNLM flexibly exploits several time-scales of context simultaneously, weighting them appropriately when predicting the next word. Experimental comparisons on large-vocabulary spontaneous speech recognition demonstrate that introducing multiple time-scales of context into the RNNLM yields improvements over existing RNNLMs in both perplexity and word error rate.
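The abstract does not specify the architecture, so the following is only a minimal sketch of one plausible reading: several simple recurrent networks, each updating its hidden state at a different rate, whose states are mixed with learned weights before predicting the next word. All names, dimensions, and the update-skipping scheme here are illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the paper does not give these values.
vocab_size, emb_dim, hid_dim = 1000, 32, 64
scales = [1, 2, 4]  # each sub-RNN updates its state every `scale` steps

# One simple (Elman) RNN per time-scale, plus scalar mixing logits.
E = rng.normal(0, 0.1, (vocab_size, emb_dim))
W_in = [rng.normal(0, 0.1, (emb_dim, hid_dim)) for _ in scales]
W_rec = [rng.normal(0, 0.1, (hid_dim, hid_dim)) for _ in scales]
W_out = rng.normal(0, 0.1, (hid_dim, vocab_size))
mix = rng.normal(0, 0.1, len(scales))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def predict(word_ids):
    """Return a next-word distribution after consuming `word_ids`."""
    h = [np.zeros(hid_dim) for _ in scales]
    for t, w in enumerate(word_ids):
        x = E[w]
        for k, s in enumerate(scales):
            # Slower scales skip updates, so their states retain older context.
            if t % s == 0:
                h[k] = np.tanh(x @ W_in[k] + h[k] @ W_rec[k])
    # Weight the per-scale states and project to the vocabulary.
    a = softmax(mix)
    combined = sum(a_k * h_k for a_k, h_k in zip(a, h))
    return softmax(combined @ W_out)

probs = predict([3, 17, 42, 8])
print(probs.shape, probs.sum())  # (1000,) 1.0
```

In this reading, the softmax over the mixing logits plays the role of the "proper weights" over time-scales mentioned in the abstract; in the actual model these weights would be trained jointly with the recurrent parameters.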