It is well known that state-of-the-art speaker verification systems based on the i-vector concept perform well when the target speakers' training and test utterances are of fixed, long duration (the long-long condition of the NIST evaluations). However, most real-time applications of speaker verification are constrained to training and test speech segments of differing durations. A state-of-the-art speaker verification system needs to estimate several statistical model parameters. The aim of this paper is to explore how to train these parameters when the speakers' training and test data have mismatched durations. Experimental results are reported on NIST 2008 SRE for various durations of target training and test speech segments: 5 seconds, 10 seconds, and full (5 minutes).
Index Terms: short segment, i-vector, length normalization, PLDA, speaker verification