ISCA Archive Interspeech 2025

Federated Learning with Feature Space Separation for Speaker Recognition

Ying Meng, Zhihua Fang, Liang He

The performance of deep speaker models relies on large-scale, high-quality datasets. To improve the generalization of speaker models, federated learning can be used to share the knowledge that local models learn from private data. However, traditional federated learning methods suffer from mapping conflicts between local models. In this paper, we propose a federated speaker recognition method with feature space separation. Specifically, we introduce a feature anchor for each speaker and transmit these anchors whenever the client and server communicate. Local models use an anchor loss to constrain the feature distribution of private data and thereby avoid mapping conflicts. In addition, we introduce a dynamic weight aggregation method for the server-side aggregation of local models, which amplifies the weights of well-performing local models to achieve better aggregation. Extensive experiments and analyses on VoxCeleb and CN-Celeb demonstrate the effectiveness of the proposed method.
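
The two mechanisms named in the abstract, the anchor loss and the dynamic weight aggregation, can be illustrated with a minimal sketch. The abstract does not give their exact form, so everything below is an assumption for illustration only: the anchor loss is taken to be a mean squared distance between each embedding and its speaker's anchor, the dynamic weights are taken to be a softmax over a per-client performance score, and the function names (`anchor_loss`, `dynamic_weights`, `aggregate`), the `temperature` parameter, and the flat-list parameter representation are all hypothetical.

```python
import math


def anchor_loss(embeddings, labels, anchors):
    """Mean squared distance between each embedding and its speaker's anchor.

    ASSUMPTION: the paper's actual anchor loss may use a different distance
    or normalization; squared Euclidean distance is used here for illustration.
    """
    total = 0.0
    for emb, spk in zip(embeddings, labels):
        anchor = anchors[spk]
        total += sum((e - a) ** 2 for e, a in zip(emb, anchor))
    return total / len(embeddings)


def dynamic_weights(scores, temperature=1.0):
    """Turn per-client performance scores into aggregation weights.

    ASSUMPTION: a softmax is one way to amplify well-performing clients;
    the paper's weighting rule is not stated in the abstract.
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def aggregate(client_params, weights):
    """Weighted average of client parameter vectors (flat lists of floats)."""
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


# Hypothetical usage: two clients, the first scoring higher on validation.
weights = dynamic_weights([0.9, 0.5], temperature=0.1)
global_params = aggregate([[0.0, 2.0], [2.0, 4.0]], weights)
```

A lower `temperature` in this sketch concentrates more weight on the best-scoring client, while a higher value approaches plain uniform averaging.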