ISCA Archive Interspeech 2014

Feed forward pre-training for recurrent neural network language models

Siva Reddy Gangireddy, Fergus McInnes, Steve Renals

The recurrent neural network language model (RNNLM) has been demonstrated to consistently reduce perplexities and automatic speech recognition (ASR) word error rates across a variety of domains. In this paper we propose a pre-training method for the RNNLM, by sharing the output weights of a feed forward neural network language model (NNLM) with the RNNLM. This is accomplished by first fine-tuning the weights of the NNLM, which are then used to initialise the output weights of an RNNLM with the same number of hidden units. We have carried out text-based experiments on the Penn Treebank Wall Street Journal data, and ASR experiments on the TED talks data used in the International Workshop on Spoken Language Translation (IWSLT) evaluation campaigns. Across the experiments, we observe small improvements in perplexity and ASR word error rate.
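The initialisation scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy sizes, the `rand_matrix` helper, and the weight names (`W_in`, `W_rec`, `W_out`) are assumptions. The key step is that the hidden-to-output matrix of a trained NNLM is copied into an RNNLM with the same number of hidden units, while the input and recurrent weights keep their usual random initialisation.

```python
import random

random.seed(0)
vocab_size, hidden_size = 50, 8  # toy sizes, for illustration only


def rand_matrix(rows, cols):
    """Small random weight matrix (stand-in for trained parameters)."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]


# Step 1 (sketch): the feed-forward NNLM is trained first; its
# hidden-to-output weight matrix is the part that is shared.
nnlm_W_out = rand_matrix(vocab_size, hidden_size)

# Step 2: build an RNNLM with the same number of hidden units and
# initialise its output weights from the NNLM; the input and
# recurrent weights start from random values as usual.
rnnlm = {
    "W_in":  rand_matrix(hidden_size, vocab_size),   # word -> hidden
    "W_rec": rand_matrix(hidden_size, hidden_size),  # hidden -> hidden
    "W_out": [row[:] for row in nnlm_W_out],         # pre-trained output weights
}
```

After this initialisation, the RNNLM would be trained as normal; only the starting point of its output layer differs from a randomly initialised model.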


doi: 10.21437/Interspeech.2014-561

Cite as: Gangireddy, S.R., McInnes, F., Renals, S. (2014) Feed forward pre-training for recurrent neural network language models. Proc. Interspeech 2014, 2620-2624, doi: 10.21437/Interspeech.2014-561

@inproceedings{gangireddy14_interspeech,
  author={Siva Reddy Gangireddy and Fergus McInnes and Steve Renals},
  title={{Feed forward pre-training for recurrent neural network language models}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2620--2624},
  doi={10.21437/Interspeech.2014-561},
  issn={2308-457X}
}