ISCA Archive Eurospeech 1993

Combining features via LDA in speaker recognition

Z. P. Sun, J. S. Mason

This paper discusses cepstral feature combinations via linear discriminant analysis (LDA) in the context of automatic speaker identification (ASI). Two static cepstral features are considered: standard MFCC and RASTA-PLP, a subband-filtered form derived via linear prediction. The two are compared, along with their first-order dynamic forms, both as single feature sets and in combination. LDA is shown to provide a useful means of combining dissimilar feature sets, and it permits a direct trade-off between the number of coefficients and ASI performance, particularly when testing under noisy conditions. The importance of pre-normalisation is also demonstrated. Among the individual features, the two static forms give the best performance in clean conditions, with a cross-over in the region of SNR = 15 dB, below which the two dynamic forms perform better. The LDA combination of the two static forms gives the best overall results under both clean and noisy conditions.

Keywords: speaker recognition, cepstral feature combinations, linear discriminant analysis, robustness
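
The feature combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "pre-normalisation" means per-coefficient z-scoring, that LDA classes are speakers, and that the two feature sets are frame-aligned. The function name `lda_combine`, the array sizes, and the synthetic data are all hypothetical stand-ins for real MFCC and RASTA-PLP vectors.

```python
import numpy as np

def lda_combine(feats_a, feats_b, labels, n_keep):
    """Concatenate two frame-aligned feature sets and project them with LDA."""
    x = np.hstack([feats_a, feats_b])            # (frames, dim_a + dim_b)
    x = (x - x.mean(axis=0)) / x.std(axis=0)     # assumed pre-normalisation: z-score
    global_mean = x.mean(axis=0)
    dim = x.shape[1]
    s_w = np.zeros((dim, dim))                   # within-speaker scatter
    s_b = np.zeros((dim, dim))                   # between-speaker scatter
    for spk in np.unique(labels):
        xs = x[labels == spk]
        mu = xs.mean(axis=0)
        centred = xs - mu
        s_w += centred.T @ centred
        diff = mu - global_mean
        s_b += len(xs) * np.outer(diff, diff)
    # Whiten with the Cholesky factor of s_w so the generalised eigenproblem
    # s_b v = lambda s_w v becomes an ordinary symmetric one.
    inv_chol = np.linalg.inv(np.linalg.cholesky(s_w))
    evals, evecs = np.linalg.eigh(inv_chol @ s_b @ inv_chol.T)
    order = np.argsort(evals)[::-1][:n_keep]     # most discriminative first
    w = inv_chol.T @ evecs[:, order]             # (dim, n_keep) projection
    return x @ w, w, s_w

# Synthetic stand-in data: 4 speakers, 200 frames each, two 12-dim feature sets
# on deliberately different scales (hypothetical values, not from the paper).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 200)
feats_a = rng.normal(size=(4, 12))[labels] + rng.normal(size=(800, 12))
feats_b = 5.0 * (rng.normal(size=(4, 12))[labels] + rng.normal(size=(800, 12)))

projected, w, s_w = lda_combine(feats_a, feats_b, labels, n_keep=8)
print(projected.shape)
```

Changing `n_keep` is the knob that corresponds to trading the number of coefficients against identification performance. Note that with S speakers the between-speaker scatter has rank at most S - 1, so in this toy setup only the first three directions carry discriminative information.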