ISCA Archive Interspeech 2020

STC-Innovation Speaker Recognition Systems for Far-Field Speaker Verification Challenge 2020

Aleksei Gusev, Vladimir Volokhov, Alisa Vinogradova, Tseren Andzhukaev, Andrey Shulipa, Sergey Novoselov, Timur Pekhovsky, Alexander Kozlov

This paper presents the speaker recognition (SR) systems submitted by the Speech Technology Center (STC) team to the Far-Field Speaker Verification Challenge 2020. The SR tasks of the challenge address far-field text-dependent speaker verification from a single microphone array (Track 1), far-field text-independent speaker verification from a single microphone array (Track 2), and far-field text-dependent speaker verification from distributed microphone arrays (Track 3).

In this paper, we present the techniques and ideas underlying our best-performing models. Experiments on x-vector-based and ResNet-like architectures show that ResNet-based networks outperform x-vector-based systems. The submitted systems are fusions of ResNet34-based extractors trained on 80 log Mel-filter bank energies (MFBs) post-processed with a U-net-like voice activity detector (VAD). The best systems for Track 1, Track 2, and Track 3 achieved 5.08% EER and 0.500 Cmindet, 5.39% EER and 0.541 Cmindet, and 5.53% EER and 0.458 Cmindet on the challenge evaluation sets, respectively.
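
For readers unfamiliar with the equal error rate (EER) reported above, the following is a minimal sketch of how it can be estimated from verification scores. It is not the challenge's official scoring tool: the function name `compute_eer`, the threshold sweep over observed scores, and the toy scores are assumptions made for illustration only.

```python
def compute_eer(target_scores, nontarget_scores):
    """Estimate the EER (as a fraction) from target and non-target trial scores.

    Illustrative sketch: sweeps every observed score as a threshold and
    returns the mean of the two error rates where they are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical, not taken from the challenge data).
toy_targets = [0.9, 0.8, 0.7, 0.3]
toy_nontargets = [0.6, 0.4, 0.2, 0.1]
print(compute_eer(toy_targets, toy_nontargets))  # 0.25
```

An EER of 5.08% thus corresponds to an operating point at which roughly 5% of target trials are rejected and roughly 5% of non-target trials are accepted.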