ISCA Archive Interspeech 2025

The Role of Voiced Consonant Duration in Sung Vowel-Consonant and Consonant-Vowel Recognition

Allan Vurma, Einar Meister, Lya Meister, Jaan Ross, Marju Raju, Veeda Kala, Tuuri Dede

This study explores the impact of consonant duration on the intelligibility of sung consonants (/l/, /m/, /n/, and /v/) across various acoustic settings, pitch levels, and background noise conditions. Forty-two participants (13 male, 29 female; aged 16 to 69) completed recognition tests involving CV and VC segments containing the vowel /a/, sung and spoken by a mezzo-soprano and a baritone. Consonant durations ranged from 0 ms to 200 ms, and artificial reverberation or brown noise was added to some stimuli to simulate performance environments. A generalized linear mixed model (GLMM) revealed that recognition was poorer at high pitches, in reverberant acoustics, and with accompaniment, particularly for VC segments. Extending consonant duration from 20 ms to 200 ms consistently improved recognition, by up to 25 percentage points. At low pitches, recognition exceeded chance even when the stationary part of the consonant was absent; 20 ms was sufficient for 95% recognition of spoken CVs, except in the presence of noise.
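
As an illustration of what "exceeded chance" can mean for a recognition test of this kind, the following minimal sketch computes a one-sided exact binomial tail probability. It is not the authors' analysis, which used a GLMM. It assumes a response set of four alternatives (/l/, /m/, /n/, /v/), so that chance is 1/4; the abstract does not state the response format. The function name `binomial_tail` and the counts are hypothetical and are not data from the study.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: four response alternatives, hence a chance level of 1/4.
CHANCE = 1 / 4

# Hypothetical counts for illustration only (not taken from the study).
correct, trials = 18, 40

p_value = binomial_tail(correct, trials, CHANCE)
print(f"observed rate = {correct / trials:.2f}, one-sided p = {p_value:.4f}")
```

A small tail probability indicates that the observed number of correct responses would be unlikely under pure guessing. A pooled test like this ignores the repeated measures per listener and per singer, which is one reason a mixed model is the more appropriate tool for the full design.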