ISCA Archive HSCR 2025

Impact of background music removal using deep neural models on formant measurements in historical newsreel corpora

Juliusz Cęcelewski, Cédric Gendrot

This study examines how background music removal with deep neural network (DNN) models affects the accuracy of automated formant extraction (F1, F2, F3, and F4). The dataset consists of a 1956 Polish news chronicle with varying audio quality, background noise, and orchestral accompaniment. Two Control Groups comprised the original and the denoised audio, both of which preserve the background music. The Experimental Groups comprised four versions of the same recording, each processed with a different separation model. Results show higher variability and more detection errors in the Control Groups, particularly for the front vowel /i/ and the mid-open nasalized rounded back vowel /ɔ̃/, due to harmonic interference from the background music. In contrast, formant values in the Experimental Groups remained stable across processing methods, showed no significant differences among the DNN models used, and closely matched reference values for Polish vowels.
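As an illustration of the kind of comparison described above, the following minimal sketch summarizes one formant of one vowel per processing condition: its mean, its standard deviation, and its mean absolute deviation from a reference value. It is not the authors' analysis pipeline. The function name `summarize_formant`, the condition labels, the reference value, and all numbers are hypothetical placeholders invented for this example; none are taken from the study.

```python
from statistics import mean, stdev


def summarize_formant(measurements, reference_hz):
    """Summarize formant values (Hz) per condition.

    measurements: dict mapping a condition label to a list of formant values
    reference_hz: reference formant value for the vowel (assumed known)
    """
    summary = {}
    for condition, values in measurements.items():
        summary[condition] = {
            "mean_hz": mean(values),
            # Sample standard deviation as a simple variability measure
            "sd_hz": stdev(values),
            # Average distance from the reference value
            "mean_abs_dev_hz": mean(abs(v - reference_hz) for v in values),
        }
    return summary


# Placeholder F1 values for a single vowel. These numbers are invented
# for illustration only and do not come from the study.
toy_f1 = {
    "original_with_music": [300.0, 450.0, 280.0, 520.0],
    "separated_vocals": [300.0, 310.0, 290.0, 300.0],
}

# The reference value is likewise a hypothetical placeholder.
summary = summarize_formant(toy_f1, reference_hz=300.0)

for condition, stats in summary.items():
    print(condition, stats)
```

In a real analysis the per-condition lists would come from a formant tracker run on each version of the recording, and the summary would be computed separately for each vowel and each formant.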