ISCA Archive Interspeech 2024

SWiBE: A Parameterized Stochastic Diffusion Process for Noise-Robust Bandwidth Expansion

Yin-Tse Lin, Shreya G. Upadhyay, Bo-Hao Su, Chi-Chun Lee

Speech recordings are frequently degraded by a variety of distortions, and removing them is essential yet challenging. In this study, leveraging the recent success of score-based generative modeling (SGM), we propose a novel noise-robust bandwidth expansion (BWE) framework based on a parameterized stochastic diffusion process that expands the bandwidth stepwise in the spectrogram. Our proposed Step-Wised Bandwidth Expansion (SWiBE) method outperforms the baseline approaches, including the current state-of-the-art noise-robust BWE model and various diffusion- and GAN-based models, on the considered metrics. Moreover, we analyze how the hyperparameters interact with performance across different aspects, including perceptual quality and spectral reconstruction. Our findings reveal that the score-based model exhibits distinct characteristics under varying parameterizations.