ISCA Archive ICSLP 1998

Eigenvoices for speaker adaptation

Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua, Lloyd Goldwasser, Nancy Niedzielski, Steven Fincke, Ken Field, Matteo Contolini

We have devised a new class of fast adaptation techniques for speech recognition. These techniques are based on prior knowledge of speaker variation, obtained by applying Principal Component Analysis (PCA) or a similar technique to T vectors of dimension D derived from T speaker-dependent models. This offline step yields T basis vectors called "eigenvoices". We constrain the model for a new speaker S to be located in the space spanned by the first K eigenvoices. Speaker adaptation involves estimating the K eigenvoice coefficients for the new speaker; typically, K is very small compared to D. We conducted mean adaptation experiments on the Isolet database. With a large amount of supervised adaptation data, most eigenvoice techniques performed slightly better than MAP or MLLR; with small amounts of supervised adaptation data or for unsupervised adaptation, some eigenvoice techniques performed much better. We believe that the eigenvoice approach would yield rapid adaptation for most speech recognition systems.
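
As an illustration of the idea summarized in the abstract, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes each speaker-dependent model has already been flattened into a D-dimensional vector, obtains the eigenvoices from an SVD of the mean-centered vectors (one common way to compute PCA), and estimates the K coefficients for a new speaker by a plain least-squares projection. The abstract does not specify the estimator, so the projection step, the function names `train_eigenvoices` and `adapt`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def train_eigenvoices(supervectors, k):
    """Offline step: PCA on T speaker vectors of dimension D.

    supervectors: array of shape (T, D), one row per speaker-dependent model.
    Returns the mean vector (D,) and the first k eigenvoices, shape (k, D).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by
    # decreasing singular value (i.e. decreasing explained variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def adapt(mean, eigenvoices, observed):
    """Adaptation step (assumed estimator): least-squares projection.

    Constrains the new speaker's vector to mean + span(eigenvoices) and
    returns the k coefficients together with the constrained vector.
    """
    coeffs = eigenvoices @ (observed - mean)
    adapted = mean + coeffs @ eigenvoices
    return coeffs, adapted


# Synthetic stand-in data (hypothetical, for illustration only):
# T training speakers whose vectors lie near a K-dimensional subspace.
rng = np.random.default_rng(0)
T, D, K = 20, 50, 3
basis = rng.normal(size=(K, D))
train = rng.normal(size=(T, K)) @ basis + 0.01 * rng.normal(size=(T, D))

mean, eigenvoices = train_eigenvoices(train, K)

# A noisy estimate of a new speaker's vector, e.g. from little data.
noisy_estimate = rng.normal(size=K) @ basis + 0.01 * rng.normal(size=D)
coeffs, adapted = adapt(mean, eigenvoices, noisy_estimate)
```

Only K numbers are estimated for the new speaker rather than D, which is the property the abstract relies on for rapid adaptation when K is much smaller than D.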