This paper describes the development of our automatic speech recognition
(ASR) system submitted to the VOiCES from a Distance Challenge
2019. In this challenge, we focused on the fixed condition, where the
task is to recognize reverberant and noisy speech based on a limited
amount of clean training data. In our system, the mismatch between
the training and testing conditions was reduced by using multi-style
training where the training data was artificially contaminated with
different reverberation and noise sources. In addition, the Weighted Prediction
Error (WPE) algorithm was used to reduce reverberation in
the evaluation data. To boost the system performance, acoustic models
with different neural network architectures were trained, and the respective
systems were fused to give the final output. Moreover, an LSTM language
model was used to rescore the lattices to compensate for the weak n-gram
model trained only on the transcription text. Evaluated on the development
set, our system showed an average word error rate (WER) of 27.04%.
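
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed for a single utterance: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the scoring tool used for the challenge and the way the average over the development set is formed are not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from the challenge data):
# one substitution ("sat" -> "sit") and one deletion ("the").
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.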
This paper also appears in session Wed-SS-7-3.
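
The multi-style training described above can be illustrated with a minimal sketch of how a clean utterance might be artificially contaminated: the signal is convolved with a room impulse response and noise is then added at a chosen signal-to-noise ratio. This is a sketch under stated assumptions, not the pipeline used for the submission. The helper names `add_reverb` and `mix_at_snr`, the toy impulse response, and the 10 dB SNR are hypothetical; the actual reverberation and noise sources and their levels are not given here.

```python
import numpy as np


def add_reverb(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve speech with a room impulse response, keeping the input length."""
    return np.convolve(speech, rir)[: len(speech)]


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Add noise, scaled so the speech-to-noise power ratio equals snr_db."""
    # Repeat or trim the noise so that it covers the whole utterance.
    repeats = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, repeats)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0.0:
        raise ValueError("noise signal has zero power")

    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise


# Synthetic stand-ins for real recordings (assumed values for illustration).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)       # 1 s of "speech" at 16 kHz
noise = rng.standard_normal(4000)        # shorter noise clip, tiled as needed
rir = np.array([1.0, 0.0, 0.5, 0.25])    # toy impulse response

reverberant = add_reverb(clean, rir)
contaminated = mix_at_snr(reverberant, noise, snr_db=10.0)
```

In practice, each training utterance would typically be paired with a randomly drawn impulse response, noise clip, and SNR, so that the acoustic model sees a range of conditions closer to the test data.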