ISCA Archive Interspeech 2014

Parallel deep neural network training for LVCSR tasks using blue gene/Q

Tara N. Sainath, I-hsin Chung, Bhuvana Ramabhadran, Michael Picheny, John Gunnels, Brian Kingsbury, George Saon, Vernon Austel, Upendra Chaudhari

While Deep Neural Networks (DNNs) have achieved tremendous success for LVCSR tasks, training these networks is slow. To date, the most common approach to training DNNs is stochastic gradient descent (SGD), run serially on a single GPU machine. Serial training, coupled with the large number of training parameters and the size of speech data sets, makes DNN training very slow for LVCSR tasks. While second-order, data-parallel methods have also been explored, these methods are not always faster on CPU clusters due to the large communication cost between processors. In this work, we explore a specialized hardware/software approach, utilizing a Blue Gene/Q (BG/Q) system, which has thousands of processors and excellent inter-processor communication. We explore using the second-order Hessian-free (HF) algorithm for DNN training with BG/Q, for both cross-entropy and sequence training of DNNs. Results on three LVCSR tasks indicate that using HF with BG/Q offers up to an 11x speedup, as well as an improved word error rate (WER), compared to SGD on a GPU.
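To illustrate the data-parallel step at the heart of such second-order training, the following is a minimal sketch (not the paper's implementation): each worker computes a gradient on its shard of the batch, and an allreduce-style sum reconstructs the full-batch gradient, which is the communication pattern whose cost the abstract refers to. The function names and the toy least-squares loss are assumptions for illustration only.

```python
import numpy as np

def local_gradient(w, X, y):
    # Gradient of the toy least-squares loss 0.5 * ||X w - y||^2 on one data shard.
    return X.T @ (X @ w - y)

def allreduce_sum(grads):
    # Stand-in for the inter-processor communication step (e.g. an MPI allreduce),
    # which sums the per-worker gradients so every worker holds the global gradient.
    return np.sum(grads, axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 4))
y = rng.standard_normal(64)
w = np.zeros(4)

# Split the batch across 4 simulated workers and aggregate their gradients.
shards = zip(np.array_split(X, 4), np.array_split(y, 4))
g = allreduce_sum([local_gradient(w, Xi, yi) for Xi, yi in shards])

# The sharded sum matches the full-batch gradient computed in one piece.
assert np.allclose(g, local_gradient(w, X, y))
```

In Hessian-free training the same aggregation pattern is applied repeatedly inside conjugate-gradient iterations (for curvature-vector products as well as gradients), which is why fast inter-processor communication, as on BG/Q, matters so much.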

doi: 10.21437/Interspeech.2014-272

Cite as: Sainath, T.N., Chung, I.-h., Ramabhadran, B., Picheny, M., Gunnels, J., Kingsbury, B., Saon, G., Austel, V., Chaudhari, U. (2014) Parallel deep neural network training for LVCSR tasks using blue gene/Q. Proc. Interspeech 2014, 1048-1052, doi: 10.21437/Interspeech.2014-272

@inproceedings{sainath14_interspeech,
  author={Tara N. Sainath and I-hsin Chung and Bhuvana Ramabhadran and Michael Picheny and John Gunnels and Brian Kingsbury and George Saon and Vernon Austel and Upendra Chaudhari},
  title={{Parallel deep neural network training for LVCSR tasks using blue gene/Q}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={1048--1052},
  doi={10.21437/Interspeech.2014-272}
}