ISCA Archive Eurospeech 1995

An RNN based speech recognition system with discriminative training

Tan Lee, P. C. Ching, L. W. Chan

In our previous work [1], we proposed a novel method that uses a set of fully connected recurrent neural networks (RNNs) for speech modeling. Although the RNN model is effective in characterizing individual speech units, the system performs less satisfactorily in speech recognition because of poor discrimination between models. In this paper, an efficient discriminative training procedure is developed for the RNN-based recognition system. With discriminative training, each RNN speech model is adjusted to reduce its distance from the designated speech unit while increasing its distances from the others. In addition, a duration-screening process is introduced to enhance the discriminating power of the recognition system. Speaker-dependent recognition experiments have been carried out for 1) 11 isolated Cantonese digits, 2) 58 highly confusable Cantonese CV syllables, and 3) 20 isolated English words. The recognition rates attained are 90.9%, 86.7% and 93.5%, respectively.
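The two ideas in the abstract, pulling each model closer to its own speech unit while pushing it away from the others, and screening candidates by duration, can be illustrated with a minimal sketch. The abstract does not give the training criterion or the screening rule, so everything here is an assumption: the hinge-style margin loss stands in for the paper's discriminative criterion, and the names `discriminative_loss`, `recognize`, `margin`, and `duration_ranges` are hypothetical.

```python
def discriminative_loss(distances, target, margin=1.0):
    """Hinge-style loss (an assumed stand-in, not the paper's criterion).

    distances: dict mapping model label -> distance between that model
               and the input token (smaller means a better match).
    target:    label of the speech unit the token actually belongs to.
    The loss is zero once the target model is closer than the nearest
    competing model by at least `margin`.
    """
    d_target = distances[target]
    # Distance of the closest competing (non-target) model.
    d_rival = min(d for label, d in distances.items() if label != target)
    return max(0.0, margin + d_target - d_rival)


def recognize(distances, duration, duration_ranges):
    """Pick the closest model among those that pass duration screening.

    duration_ranges: dict mapping label -> (min_frames, max_frames),
                     a hypothetical per-unit range of plausible durations.
    If no model admits the observed duration, fall back to all models.
    """
    candidates = {
        label: d
        for label, d in distances.items()
        if duration_ranges[label][0] <= duration <= duration_ranges[label][1]
    }
    if not candidates:
        candidates = distances
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    distances = {"a": 2.0, "b": 2.5, "c": 4.0}
    ranges = {"a": (10, 20), "b": (25, 40), "c": (5, 50)}
    print(discriminative_loss(distances, "a"))
    print(recognize(distances, duration=30, duration_ranges=ranges))
```

In this sketch, minimizing the loss over training tokens lowers the target distance and raises the nearest rival's distance, and the screening step removes models whose expected duration does not fit the token before the closest model is chosen.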