ISCA Archive Interspeech 2023

Improving training datasets for resource-constrained speaker recognition neural networks

Pierre-Michel Bousquet, Mickael Rouvier

To address the growth in complexity, which is now the main driver of improvement in deep learning, this paper proposes a new data selection algorithm for training speaker recognition systems. Starting from an initial training dataset, the algorithm scans new data to determine which speakers are most useful for complementing the initial ones. The resulting training dataset improves the model in terms of accuracy and ability to generalize, while keeping the training complexity reasonable. The algorithm is unsupervised: it requires no metadata on the new utterances and is therefore compatible with self-supervised learning. By selecting only 30% of the speakers from a new database, the proposed algorithm achieves performance very similar to that of a system trained with all speakers added.
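The abstract does not state the selection criterion, so the following is only an illustrative sketch of the general idea of completing an initial speaker set with a subset of new speakers. It assumes, hypothetically, that each initial speaker and each candidate (for example a pseudo-speaker obtained by clustering unlabeled utterances) is summarized by an embedding centroid, and it swaps in a simple novelty score (one minus the highest cosine similarity to any initial speaker) in place of the paper's actual measure. The names `cosine`, `select_novel_speakers`, and `keep_ratio` are made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_novel_speakers(initial_centroids, candidate_centroids, keep_ratio=0.3):
    """Return sorted indices of the candidates least similar to the initial speakers.

    Assumed criterion (not taken from the paper): a candidate's novelty is
    1 - max cosine similarity to any initial speaker centroid.
    """
    if not initial_centroids:
        raise ValueError("at least one initial speaker centroid is required")

    scored = []
    for idx, cand in enumerate(candidate_centroids):
        closest = max(cosine(cand, ref) for ref in initial_centroids)
        scored.append((1.0 - closest, idx))

    # Keep roughly keep_ratio of the candidates, and at least one.
    n_keep = max(1, int(round(keep_ratio * len(candidate_centroids))))

    # Highest novelty first; ties are broken by the lower index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return sorted(idx for _, idx in scored[:n_keep])


if __name__ == "__main__":
    initial = [[1.0, 0.0], [0.0, 1.0]]
    candidates = [[1.0, 0.1], [1.0, 1.0], [-1.0, 0.0], [0.1, 1.0]]
    print(select_novel_speakers(initial, candidates, keep_ratio=0.5))
```

The default `keep_ratio=0.3` mirrors the 30% figure quoted in the abstract. Because the score uses only embeddings, no speaker labels or other metadata on the new utterances are needed, which matches the unsupervised setting described above.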