ISCA Archive Eurospeech 2003

Learning intra-speaker model parameter correlations from many short speaker segments

Anne K. Kienappel

Very rapid speaker adaptation algorithms, such as eigenvoices or speaker clustering, typically rely on learning intra-speaker correlations of model parameters from the training data. Based on this a priori knowledge, many model parameters can be adapted successfully from only a few observations. However, eigenvoice training and speaker clustering are non-trivial for training databases that contain many short speaker segments, because the data available per speaker for detecting intra-speaker correlations is sparse. We have trained eigenvoices that yield a small but significant word error rate reduction in on-line adaptation (i.e. self-adaptation) on a telephony database with, on average, only 5 seconds of speech per speaker in both training and test data.
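
As an illustration of the general eigenvoice idea only, the following minimal sketch shows how parameter correlations learned from training speakers allow all model parameters of a new speaker to be estimated from a few observed ones. It is not the training procedure of this paper: it assumes a reliable parameter vector ("supervector") for every training speaker, which is precisely what is unavailable when each speaker contributes only a few seconds of speech. The data is synthetic, plain PCA and least squares stand in for the actual estimation methods, and all names (`train_eigenvoices`, `adapt`, the dimensions) are hypothetical.

```python
import numpy as np


def train_eigenvoices(supervectors, n_eigenvoices):
    """PCA on speaker supervectors (one row per training speaker)."""
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_eigenvoices]


def adapt(mean, eigenvoices, observed_idx, observed_values):
    """Estimate ALL parameters from a few observed ones via least squares."""
    basis = eigenvoices[:, observed_idx].T      # (n_observed, n_eigenvoices)
    target = observed_values - mean[observed_idx]
    weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return mean + weights @ eigenvoices


# Synthetic training set (assumption): speakers vary in a 3-dim subspace.
rng = np.random.default_rng(0)
dim, n_speakers, rank = 50, 200, 3
true_basis = rng.normal(size=(rank, dim))
train = rng.normal(size=(n_speakers, rank)) @ true_basis
train += 0.01 * rng.normal(size=(n_speakers, dim))

mean, eigenvoices = train_eigenvoices(train, rank)

# New speaker: only 8 of the 50 parameters are observed.
new_speaker = rng.normal(size=rank) @ true_basis
observed_idx = np.arange(8)
adapted = adapt(mean, eigenvoices, observed_idx, new_speaker[observed_idx])

err_adapted = np.linalg.norm(adapted - new_speaker)
err_mean = np.linalg.norm(mean - new_speaker)
print(f"error without adaptation: {err_adapted if False else err_mean:.3f}")
print(f"error after adaptation:   {err_adapted:.3f}")
```

In this toy setting the unobserved parameters are recovered because they are correlated with the observed ones through the learned basis. With real speech, the observations would be acoustic frames rather than direct parameter values, and the weights would typically be estimated by a likelihood-based criterion instead of least squares.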