ISCA Archive Interspeech 2015

Advanced time shrinking using a drop classifier based on codec features

Jochen Issing, Nikolaus Färber, Reinhard German

We present an integrated approach to full-band audio time-scale modification for Voice over IP communication. The concept is based on a low-complexity adaptive playout method that uses frame dropping for time shrinking and audio concealment for time stretching. We improve the existing version of this method with a classifier that assists in choosing which audio frames can be dropped with the least subjective impact on audio quality. To maintain low complexity, we exclusively use audio signal features that are available in the audio codec. Classifying audio frames improves the audio quality of the existing method without classification by 0.5 Mean Opinion Score points, while requiring lower computational complexity by a factor of approximately 10^4.
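As a rough illustration of classifier-assisted time shrinking, the minimal Python sketch below drops the frames that are rated least costly to lose. It is not the authors' classifier. The per-frame `drop_costs` stand in for a score that would be derived from codec features, and the names `select_frames_to_drop` and `shrink` are hypothetical.

```python
from typing import List, Sequence


def select_frames_to_drop(drop_costs: Sequence[float], num_to_drop: int) -> List[int]:
    """Return the indices of the num_to_drop frames with the lowest drop cost.

    drop_costs is an assumed per-frame score (lower means a less audible drop);
    how such a score is derived from codec features is not specified here.
    """
    if num_to_drop < 0 or num_to_drop > len(drop_costs):
        raise ValueError("num_to_drop must be between 0 and the number of frames")
    # Rank frame indices by ascending cost, then keep the cheapest ones.
    ranked = sorted(range(len(drop_costs)), key=lambda i: drop_costs[i])
    # Return the chosen indices in playout order.
    return sorted(ranked[:num_to_drop])


def shrink(frames: Sequence[bytes], drop_costs: Sequence[float], num_to_drop: int) -> List[bytes]:
    """Shorten the playout buffer by removing the cheapest-to-drop frames."""
    if len(frames) != len(drop_costs):
        raise ValueError("frames and drop_costs must have the same length")
    dropped = set(select_frames_to_drop(drop_costs, num_to_drop))
    return [frame for i, frame in enumerate(frames) if i not in dropped]


if __name__ == "__main__":
    buffered = [b"f0", b"f1", b"f2", b"f3"]
    costs = [0.9, 0.1, 0.5, 0.2]  # made-up values for illustration only
    print(shrink(buffered, costs, 2))
```

In this sketch the playout buffer shrinks by a fixed number of frames per call. A real adaptive playout scheduler would decide how many frames to drop from the current network delay, which is outside the scope of this example.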