This paper discusses cepstral feature combinations via linear discriminant analysis (LDA) in the context of automatic speaker identification (ASI). Two static cepstral features are considered: standard MFCC and RASTA-PLP, a subband-filtered form derived via linear prediction. These two are compared, along with their first-order dynamic forms, as both single and combined feature sets. LDA is shown to provide a useful means of combining (dissimilar) feature sets and of permitting a direct trade-off between the number of coefficients and ASI performance, particularly when testing under noisy conditions. The importance of pre-normalisation is also demonstrated. For individual features, the two static forms give the best performance in clean conditions, with a cross-over, in the region of SNR = 15 dB, to the two dynamic forms performing better. The LDA combination of the two static forms gives the best overall results under both clean and noisy conditions.
Keywords: speaker recognition, cepstral feature combinations, linear discriminant analysis, robustness
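
The combination scheme summarised above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that pre-normalisation means scaling each coefficient to zero mean and unit variance, that the two feature sets are concatenated frame by frame, and that LDA is trained with speaker identities as class labels. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def standardise(x, eps=1e-12):
    """Scale each column to zero mean, unit variance (assumed form of pre-normalisation)."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

def lda_projection(x, labels, n_components, ridge=1e-6):
    """Return a (dim, n_components) LDA projection for frames x with speaker labels."""
    dim = x.shape[1]
    overall_mean = x.mean(axis=0)
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for speaker in np.unique(labels):
        frames = x[labels == speaker]
        class_mean = frames.mean(axis=0)
        centred = frames - class_mean
        s_within += centred.T @ centred
        offset = (class_mean - overall_mean)[:, None]
        s_between += len(frames) * (offset @ offset.T)
    # A small ridge keeps the within-class scatter positive definite.
    s_within += ridge * np.trace(s_within) / dim * np.eye(dim)
    # Whitening by the Cholesky factor turns the generalised eigenproblem
    # S_b v = lambda S_w v into an ordinary symmetric one.
    inv_chol = np.linalg.inv(np.linalg.cholesky(s_within))
    eigvals, eigvecs = np.linalg.eigh(inv_chol @ s_between @ inv_chol.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    return inv_chol.T @ eigvecs[:, order]

def combine_features(feats_a, feats_b, labels, n_components):
    """Normalise two feature sets, concatenate them, and reduce with LDA."""
    stacked = np.hstack([standardise(feats_a), standardise(feats_b)])
    w = lda_projection(stacked, labels, n_components)
    return stacked @ w, w

# Synthetic stand-ins for two 12-dimensional cepstral feature sets, 4 speakers.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 50)
feats_a = rng.normal(size=(200, 12)) + labels[:, None]
feats_b = 5.0 * rng.normal(size=(200, 12)) + labels[:, None]
projected, w = combine_features(feats_a, feats_b, labels, n_components=3)
print(projected.shape)  # (200, 3)
```

Here `n_components` is the knob that trades the number of retained coefficients against performance. With C speakers, at most C - 1 LDA directions carry discriminative information. In a real system the normalisation statistics and the projection would be estimated on training data only and then applied unchanged to the test data.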