ISCA Archive Interspeech 2024

Sign Value Constraint Decomposition for Efficient 1-Bit Quantization of Speech Translation Tasks

Nan Chen, Yonghe Wang, Feilong Bao

Speech-to-text translation converts speech input in one language into text output in another. While combining pre-trained speech and machine translation models improves translation quality, it also increases the number of parameters, resulting in substantial hardware costs for model training and deployment. To address this challenge, we propose a 1-bit quantized model based on Sign Value Constraint Decomposition (SVCD) for linear layers. SVCD approximates the weight matrix of a linear layer as a sign matrix and two trainable vectors, preserving higher information capacity at a small additional storage cost. Additionally, we use knowledge distillation to transfer the capability of the original fine-tuned model to the quantized model. Experimental results show that the decoder's attention module is critical to the performance of the quantized speech translation model. Our code is available at https://github.com/myaxxxxx/onebit-st.
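To make the decomposition concrete, the following is a minimal NumPy sketch of approximating a linear layer's weight matrix by a sign matrix and two vectors. The abstract does not state how the two vectors are initialized, so this sketch assumes they come from a rank-1 SVD approximation of the element-wise absolute weights; the function names `svcd_decompose` and `svcd_linear` are hypothetical and are not taken from the released code. In the proposed model the two vectors would be trained further, which is omitted here.

```python
import numpy as np

def svcd_decompose(weight):
    """Split `weight` (out_dim x in_dim) into a sign matrix and two vectors.

    Assumption: the vectors are taken from the leading singular triplet of
    |weight|, so that weight ~= sign * outer(a, b).
    """
    sign = np.where(weight >= 0, 1.0, -1.0)          # 1 bit per entry
    u, s, vt = np.linalg.svd(np.abs(weight), full_matrices=False)
    # The leading singular vectors of a non-negative matrix can be chosen
    # non-negative; abs() removes the arbitrary sign returned by the SVD.
    a = np.abs(u[:, 0]) * np.sqrt(s[0])              # one value per output row
    b = np.abs(vt[0]) * np.sqrt(s[0])                # one value per input column
    return sign, a, b

def svcd_linear(x, sign, a, b):
    """Compute x @ (sign * outer(a, b)).T without forming the dense matrix."""
    return ((x * b) @ sign.T) * a

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))    # toy weight matrix, out_dim=8, in_dim=16
x = rng.standard_normal((4, 16))    # toy batch of 4 inputs

sign, a, b = svcd_decompose(W)
W_hat = sign * np.outer(a, b)       # dense reconstruction, for checking only
y = svcd_linear(x, sign, a, b)

# Baseline for comparison: sign matrix with a single shared scale.
W_scalar = sign * np.abs(W).mean()
err_svcd = np.linalg.norm(W - W_hat)
err_scalar = np.linalg.norm(W - W_scalar)
```

For an `out_dim x in_dim` layer, the two vectors add only `out_dim + in_dim` full-precision values on top of the 1-bit sign matrix. Under the rank-1 assumption above, the reconstruction error is never larger than that of a sign matrix with a single shared scale, because a constant matrix is itself rank-1.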