ISCA Archive Interspeech 2011

Online speaker adaptation with pre-computed FMLLR transformations

Volker Fischer, Siegfried Kunzmann

This paper presents a memory-efficient single-pass speech recognizer that uses pre-computed FMLLR transformations for online speaker adaptation. To this end, we apply unsupervised segment clustering to the training corpus, create a transformation matrix for each cluster, and train a text-independent Gaussian mixture classifier for cluster selection at runtime. We use the RWTH Aachen University open source speech recognition toolkit for evaluation and compare the results to a standard speaker-adaptive two-pass decoding strategy. Results indicate that the method improves single-pass recognition in VTLN feature space with almost no overhead from cluster selection, and show a relative improvement of up to 15 percent over speaker-adaptive decoding when only little data is available for unsupervised online adaptation.
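
The runtime step described above (pick a cluster with a Gaussian mixture classifier, then apply that cluster's pre-computed transformation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation and does not use the RWTH toolkit API. It assumes one diagonal-covariance GMM per cluster, selection by maximum accumulated log-likelihood over the available frames, and the usual affine FMLLR form x' = A x + b. All function names and the data layout are hypothetical.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM (assumed model form).

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                     # (T, M)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)         # (M,)
    )
    log_weighted = log_comp + np.log(weights)                     # (T, M)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))

def select_cluster(frames, cluster_gmms):
    """Return the index of the cluster whose GMM scores the frames highest.

    cluster_gmms: list of (weights, means, variances) tuples, one per cluster.
    Selection by summed log-likelihood is an assumption of this sketch.
    """
    scores = [gmm_frame_loglik(frames, *gmm).sum() for gmm in cluster_gmms]
    return int(np.argmax(scores))

def apply_fmllr(frames, A, b):
    """Apply an affine feature-space transform x' = A x + b to every frame."""
    return frames @ A.T + b

def adapt_online(frames, cluster_gmms, transforms):
    """Select a cluster and apply its pre-computed (A, b) pair (hypothetical helper)."""
    k = select_cluster(frames, cluster_gmms)
    A, b = transforms[k]
    return k, apply_fmllr(frames, A, b)
```

Because the transformations are computed offline, the only per-utterance cost in this sketch is scoring the frames against the cluster GMMs, which is consistent with the low overhead the abstract reports for cluster selection.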