ISCA Archive Interspeech 2012

PLDA modeling in i-vector and supervector space for speaker verification

Ye Jiang, Kong Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher, Haizhou Li

In this paper, we advocate the use of the uncompressed form of the i-vector and employ probabilistic linear discriminant analysis (PLDA) to handle speaker and session variability in the speaker verification task. An i-vector is a low-dimensional vector, extracted from a speech segment, that contains both speaker and channel information. When PLDA is applied to i-vectors, dimension reduction is performed twice: first in the i-vector extraction process and second in the PLDA model. Keeping the full dimensionality of the i-vector in the supervector space for PLDA modeling and scoring avoids unnecessary loss of information. The drawback of using PLDA on the uncompressed i-vector is the need to invert large matrices, which we show can be done rather efficiently by partitioning the large matrix into smaller blocks. We also introduce the Gaussianized rank-norm, as an alternative to whitening, for feature normalization prior to PLDA modeling.
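To illustrate the idea of inverting a large matrix through smaller blocks, the following is a minimal sketch using the standard 2x2 block inversion identity with a Schur complement. It is not taken from the paper: the function name `block_inverse`, the split point `k`, and the toy matrix are assumptions made for illustration, and the authors' actual partitioning may differ.

```python
import numpy as np


def block_inverse(M, k):
    """Invert square matrix M via a 2x2 block partition (illustrative only).

    M = [[A, B], [C, D]] with A of size k x k. Only A and the Schur
    complement S = D - C A^{-1} B are inverted directly, so each explicit
    inversion works on a matrix smaller than M.
    """
    A, B = M[:k, :k], M[:k, k:]
    C, D = M[k:, :k], M[k:, k:]

    A_inv = np.linalg.inv(A)
    S = D - C @ A_inv @ B          # Schur complement of A in M
    S_inv = np.linalg.inv(S)

    top_left = A_inv + A_inv @ B @ S_inv @ C @ A_inv
    top_right = -A_inv @ B @ S_inv
    bottom_left = -S_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, S_inv]])


# Toy example (hypothetical data): a symmetric positive definite matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 6))
M = X @ X.T + 6 * np.eye(6)

M_inv = block_inverse(M, k=2)
max_err = np.abs(M_inv @ M - np.eye(6)).max()  # deviation from identity
```

The saving comes from structure: if one block is diagonal or otherwise cheap to invert, the expensive inversion is confined to the much smaller Schur complement.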

Index Terms: speaker verification, i-vector, probabilistic LDA