ISCA Archive Eurospeech 1993

Recognition of noisy speech by composition of hidden Markov models

Franck Martin, Kiyohiro Shikano, Yasuhiro Minami

This paper proposes an algorithm for recognizing noisy speech that avoids the tedious training of noisy-speech HMMs. HMM composition combines a noise-source HMM and a phoneme HMM into a single noise-added phoneme HMM. The speech recognizer is based on LPC cepstrum analysis. In the first set of speaker-dependent experiments, which consisted of recognizing 23 Japanese phonemes under a variety of stationary and nonstationary noises at signal-to-noise ratios (SNRs) ranging from 0 dB to 20 dB, the algorithm reduced the phoneme recognition error rate by more than 75%. In the second set of speaker-dependent experiments, which consisted of recognizing continuous speech sentences, the composed HMMs were obtained very rapidly and gave recognition rates similar to those of phoneme HMMs trained on a large noise-added speech database. The efficiency and flexibility of the algorithm, together with its adaptability to new noises and to various SNRs, make it a suitable basis for a real-time, noise-resistant speech recognizer.
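To make the idea of HMM composition concrete, here is a minimal sketch of how a phoneme HMM and a noise HMM might be merged into one model. It is not the authors' procedure; it rests on several simplifying assumptions. The two chains are assumed to evolve independently, so the composite transition matrix is the Kronecker product of the two. The state means are assumed to be already in the log-spectral domain (the paper works with LPC cepstra), variances are ignored, and the two signals are assumed to add in the linear spectral domain. The function names and the `gain` parameter, a stand-in for SNR scaling, are hypothetical.

```python
import math


def compose_transitions(speech_A, noise_A):
    """Kronecker product of two row-stochastic transition matrices.

    Composite state (i, j) pairs speech state i with noise state j and is
    stored at index i * len(noise_A) + j. Assumes independent chains.
    """
    n_s, n_n = len(speech_A), len(noise_A)
    size = n_s * n_n
    A = [[0.0] * size for _ in range(size)]
    for i in range(n_s):
        for j in range(n_n):
            for k in range(n_s):
                for l in range(n_n):
                    A[i * n_n + j][k * n_n + l] = speech_A[i][k] * noise_A[j][l]
    return A


def compose_log_spectral_means(speech_mean, noise_mean, gain=1.0):
    """Approximate noisy-speech mean per frequency bin.

    Assumption: additive noise in the linear spectral domain, so each bin
    becomes log(gain * exp(s) + exp(n)). Variances are ignored here.
    """
    return [
        math.log(gain * math.exp(s) + math.exp(n))
        for s, n in zip(speech_mean, noise_mean)
    ]


# Illustrative values only: a 3-state left-to-right phoneme HMM
# and a 2-state ergodic noise HMM.
speech_A = [[0.6, 0.4, 0.0],
            [0.0, 0.7, 0.3],
            [0.0, 0.0, 1.0]]
noise_A = [[0.9, 0.1],
           [0.2, 0.8]]

composed_A = compose_transitions(speech_A, noise_A)  # 6 x 6 matrix
composed_mean = compose_log_spectral_means([0.0, 1.0], [0.0, -1.0])
```

Because the composite model is built directly from the parameters of the two source models, changing the noise or the SNR only requires recomposing, not retraining on noise-added speech, which is consistent with the speed and adaptability reported above.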

Keywords: noisy speech, HMM composition