ISCA Archive Interspeech 2024

The sub-band cepstrum as a tool for locating local spectral regions of phonetic sensitivity: A first attempt with multi-speaker vowel data

Michael Lambropoulos, Frantz Clermont, Shunichi Ishihara

Phonetic information is well known to be unevenly encoded throughout vowel spectra, implying the existence of sub-band regions sensitive to that information. This work exploits band-limited cepstral coefficients (BLCCs) to locate such regions and quantify their sensitivity through vowel classification. BLCCs are acoustic parameters representing sub-band spectra; their extraction involves a linear transformation of full-band cepstral coefficients (CCs) with flexible sub-band selection. Here, 18 sub-bands spanning the full band (0-4 kHz) and their respective BLCCs are used to classify Japanese vowels from 306 native male speakers. Classification accuracy is high in sub-bands where phonetic differences between vowels are most significant. Such sub-bands lie mainly in the low-frequency range, as expected, but do not exclusively align with formant regions. These findings suggest that BLCCs are potentially very useful for gaining detailed phonetic insights with flexible sub-band focus and efficient computation.
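
To make the idea of a linear transformation from full-band CCs to BLCCs concrete, the following is a minimal numerical sketch and not the authors' formulation. It assumes the log spectrum is the cosine series S(w) = c_0 + 2 * sum_k c_k cos(k w) over the 0-4 kHz band, and it defines the sub-band coefficients as the cosine-series coefficients of that spectrum after the chosen sub-band is stretched onto [0, pi]. The function name `blcc_matrix`, the number of coefficients, and the 0-1 kHz band edges are hypothetical choices made for illustration only.

```python
import numpy as np

def blcc_matrix(n_cc, f_lo, f_hi, f_max=4000.0, n_grid=2048):
    """Return A such that b = A @ c, where c holds full-band CCs c_0..c_{n_cc-1}
    and b holds the sub-band coefficients for [f_lo, f_hi] Hz (assumed convention)."""
    # Midpoint grid on [0, pi]: the warped frequency axis of the sub-band.
    theta = (np.arange(n_grid) + 0.5) * np.pi / n_grid
    # Sub-band edges expressed on the full-band axis, where f_max maps to pi.
    w_lo = np.pi * f_lo / f_max
    w_hi = np.pi * f_hi / f_max
    # Full-band frequencies visited while theta sweeps the sub-band.
    omega = w_lo + theta * (w_hi - w_lo) / np.pi
    k = np.arange(n_cc)
    # Cosine-series weights: 1 for c_0, 2 for every higher coefficient.
    weights = np.where(k == 0, 1.0, 2.0)
    spec_basis = weights * np.cos(np.outer(omega, k))   # shape (n_grid, n_cc)
    ceps_basis = np.cos(np.outer(k, theta))             # shape (n_cc, n_grid)
    # Midpoint-rule estimate of (1/pi) * integral of S(omega(theta)) * cos(m theta).
    return ceps_basis @ spec_basis / n_grid

# Hypothetical usage: 14 full-band CCs mapped to a 0-1 kHz sub-band.
A = blcc_matrix(n_cc=14, f_lo=0.0, f_hi=1000.0)
c = np.random.default_rng(0).normal(size=14)  # placeholder CCs, not real data
b = A @ c
```

Under these assumptions the matrix depends only on the band edges, so it can be computed once per sub-band and applied to every frame, and selecting the full band returns the original coefficients unchanged.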