ISCA Archive Interspeech 2022
ISCA Archive Interspeech 2022

CS-CTCSCONV1D: Small footprint speaker verification with channel split time-channel-time separable 1-dimensional convolution

Linjun Cai, Yuhong Yang, Xufeng Chen, Weiping Tu, Hongyang Chen

We present an efficient small-footprint network for speaker verification. We start by introducing the bottleneck to the QuartzNet model. Then we proposed a Channel Split Time Channel-Time Separable 1-dimensional Convolution (CS-CTCSConv1d) module, yielding stronger performance over the State-Of-The-Art small footprint speaker verification system. We apply knowledge distillation to further improve performance to learn better speaker embedding from the large model. We evaluate the proposed approach on Voxceleb dataset, obtaining better performances concerning the baseline method. The proposed model takes only 238.9K parameters to outperform the baseline system by 10% relatively in equal error rate (EER).