ISCA Archive Interspeech 2010

A discriminative performance metric for GMM-UBM speaker identification

Omid Dehzangi, Bin Ma, Eng Siong Chng, Haizhou Li

The universal background model based Gaussian mixture modeling (GMM-UBM) approach is a widely used method for speaker identification, in which a GMM characterizes a specific speaker’s voice. Model parameters are generally estimated under the maximum likelihood (ML) or maximum a posteriori (MAP) criterion. However, ML and MAP parameter estimation does not take into account the interspeaker information needed to discriminate between different speakers. To overcome this limitation, we design a discriminative performance metric that captures interspeaker variability and thereby improves the classification performance of the GMM-UBM system. We present a learning algorithm that tunes the Gaussian mixture weights by optimizing the detection performance of the GMM classifiers, and we design an objective function that directly relates the model parameters to the performance metric. We compare the proposed method with the GMM-UBM system on the 2001 NIST SRE corpus. Experimental results demonstrate that the proposed learning algorithm considerably improves the GMM-UBM system on speaker identification.
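To make concrete where the mixture weights enter a GMM-based identification decision, the following is a minimal sketch of standard GMM likelihood scoring, assuming diagonal covariances and NumPy. It is not the proposed discriminative learning algorithm: the abstract does not specify the objective function, so none is reproduced here. The function names `gmm_log_likelihood` and `identify_speaker`, and the layout of `speaker_models`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames:    (T, D) feature vectors
    weights:   (M,)   mixture weights, summing to 1
    means:     (M, D) component means
    variances: (M, D) component variances (diagonal covariances)
    """
    diff = frames[:, None, :] - means[None, :, :]                   # (T, M, D)
    # Log of each Gaussian's normalizing constant.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    # Log-density of every frame under every component.
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2
    )                                                               # (T, M)
    # The mixture weights enter here; these are the parameters the
    # abstract says the proposed algorithm tunes.
    weighted = log_comp + np.log(weights)[None, :]
    # Numerically stable log-sum-exp over components.
    peak = weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(weighted - peak).sum(axis=1))
    return float(frame_ll.mean())


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and all scores.

    speaker_models (hypothetical layout): name -> (weights, means, variances)
    """
    scores = {
        name: gmm_log_likelihood(frames, *params)
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores
```

In a GMM-UBM system each speaker's parameters would typically come from MAP adaptation of the UBM. Under this sketch, changing only a speaker's `weights` changes that speaker's score, and therefore possibly the identification decision, which is the lever a weight-tuning algorithm would act on.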