ISCA Archive Interspeech 2004

Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks

Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo

This paper presents a method for calculating the perceived importance of speech segments as a single-valued criterion, using a linear regression model. Unlike commonly used voice activity detection (VAD) algorithms, the method yields a finer priority granularity for speech segments. It can be used in conjunction with frequency-scalable speech coding and IP QoS techniques to achieve efficient, quality-controlled voice transmission. A simple linear regression model estimates the mean opinion score (MOS) for the various cases of missing speech segments.
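The regression idea can be illustrated with a minimal sketch. The abstract does not specify the features or the model form, so everything here is an assumption made for illustration: each speech block is represented by a binary "missing" indicator, MOS is modeled as an intercept plus one weight per block, and the MOS values are synthetic, generated from made-up weights rather than taken from any listening test. The function names `fit_block_importance` and `estimate_mos` are hypothetical.

```python
import itertools

import numpy as np


def fit_block_importance(missing, mos):
    """Least-squares fit of MOS ~ intercept + sum_j weight_j * missing_j.

    missing: (n_cases, n_blocks) array, 1.0 where a block was dropped.
    mos:     (n_cases,) array of opinion scores for those cases.
    """
    design = np.column_stack([np.ones(len(missing)), missing])
    coef, *_ = np.linalg.lstsq(design, mos, rcond=None)
    return coef[0], coef[1:]


def estimate_mos(intercept, weights, missing_pattern):
    """Estimated MOS for one pattern of missing blocks."""
    return intercept + float(np.dot(weights, missing_pattern))


# Synthetic illustration only: 4 hypothetical blocks, every loss pattern.
n_blocks = 4
missing = np.array(list(itertools.product([0, 1], repeat=n_blocks)), dtype=float)

# Made-up "true" MOS penalties per block (assumption, not measured data).
true_weights = np.array([-1.2, -0.6, -0.3, -0.1])
true_intercept = 4.2
mos = true_intercept + missing @ true_weights

intercept, weights = fit_block_importance(missing, mos)

# A block's importance is taken here as the MOS drop caused by losing it.
importance = -weights
# Block indices ordered from most to least important.
priority = [int(i) for i in np.argsort(-importance)]
```

In this sketch the fitted weight of a block serves as its single-valued importance, so blocks can be ranked at a finer granularity than a binary speech/non-speech decision. A sender could, for example, map `priority` onto packet QoS classes; that mapping is likewise an assumption and not something the abstract describes.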