ISCA Archive Eurospeech 1999

Robust speaker adaptation of continuous density HMMs using multilayer perceptron network

Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen

This paper compares the performance of global affine and nonlinear transformations for speaker adaptation in a hidden Markov model (HMM) speech recognition system. The nonlinear transformation was obtained with a multilayer perceptron (MLP) network, which was trained during the adaptation process to transform the mean vectors of the HMMs such that the output probabilities of the HMMs for the adaptation utterances were maximized. The performance of the MLP adaptation method was compared to that of the maximum likelihood linear regression (MLLR) adaptation procedure. Both methods were tested in a connected digit speech recognition system using multi-environment models. The results show that the nonlinear MLP transformation clearly outperforms MLLR in terms of adaptation speed. Moreover, with larger amounts of adaptation data, the performance of MLP adaptation was comparable to that of MLLR.
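The two kinds of mean transformation described above can be illustrated with a toy sketch. It is not the authors' implementation, and it rests on strong simplifying assumptions that the abstract does not state: every Gaussian has unit variance, each adaptation frame is hard-aligned to one Gaussian, and the MLP has a single tanh hidden layer with a residual connection. Under those assumptions, maximizing the likelihood of the adaptation frames reduces to minimizing squared error between the frames and the transformed means. All function names, sizes, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_affine(means, frames, align):
    """Global affine mean transform (MLLR-style) by least squares.
    This equals the ML estimate only under the unit-variance assumption."""
    ext = np.hstack([np.ones((len(align), 1)), means[align]])  # (T, D+1)
    W, *_ = np.linalg.lstsq(ext, frames, rcond=None)           # (D+1, D)
    return W

def apply_affine(W, means):
    """Apply the affine transform to every mean vector."""
    return np.hstack([np.ones((len(means), 1)), means]) @ W

def fit_mlp(means, frames, align, hidden=8, steps=500, lr=0.05, seed=0):
    """Nonlinear mean transform m -> m + tanh(m W1 + b1) W2 + b2,
    trained by gradient descent on the mean squared error (assumed setup)."""
    rng = np.random.default_rng(seed)
    D = means.shape[1]
    W1 = rng.normal(scale=0.1, size=(D, hidden))
    b1 = np.zeros(hidden)
    W2 = np.zeros((hidden, D))   # zero output weights: starts as the identity map
    b2 = np.zeros(D)
    X = means[align]             # mean aligned to each frame, shape (T, D)
    n = len(frames)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)
        err = (X + H @ W2 + b2) - frames     # residual of the transformed means
        gH = (err @ W2.T) * (1.0 - H ** 2)   # backprop through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * (X.T @ gH) / n
        b1 -= lr * gH.mean(axis=0)

    def transform(m):
        return m + np.tanh(m @ W1 + b1) @ W2 + b2
    return transform

# Synthetic adaptation data: the "new speaker" shifts every mean by a constant.
rng = np.random.default_rng(1)
means = rng.normal(size=(10, 2))             # 10 Gaussians, 2-dim features
shift = np.array([1.0, -0.5])
align = rng.integers(0, 10, size=200)        # hard frame-to-Gaussian alignment
frames = means[align] + shift + 0.1 * rng.normal(size=(200, 2))
target = means + shift                       # ideal adapted means

def mse(adapted):
    return float(np.mean((adapted - target) ** 2))

W = fit_affine(means, frames, align)
mlp = fit_mlp(means, frames, align)
err_before = mse(means)
err_affine = mse(apply_affine(W, means))
err_mlp = mse(mlp(means))
print(err_before, err_affine, err_mlp)
```

In a real recognizer the alignments would come from the HMM state occupancies and the covariances would enter the objective, so this sketch only conveys the shape of the two transforms, not the adaptation speeds reported in the paper.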