ISCA Archive Interspeech 2015

Combining hierarchical classification with frequency weighting for the recognition of eating conditions

Johannes Wagner, Andreas Seiderer, Florian Lingenfelser, Elisabeth André

Though parents regularly remind their children not to do so, talking while eating is a typical everyday situation that automatic speech analysis systems should be able to deal with. The Paralinguistic Eating Condition (EC) Challenge at Interspeech 2015 sets the task of classifying whether a speaker is eating or not and, if so, which type of food the speaker is currently tasting. The approach we follow in this paper is rather unusual: instead of suppressing the influence of noise to enhance the intelligibility of a spoken message, we try to emphasize the noisy parts of the spectrum to improve the recognition of food classes. To allow for a fine-grained adaptation to the characteristic spectrum of individual food types, we adopt a hierarchical tree structure and decompose the classification task into a sequence of binary decisions. At each node we apply frequency-dependent weighting to tune the spectrum to the involved target classes. With our approach we are able to improve results in a 7-class recognition problem (6 types of food and no food) by more than 7% on the training set (using leave-one-eater-out cross validation) and by 4% on the test set.
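
The following is a minimal sketch of the general idea described in the abstract, not the authors' implementation: a tree of binary decisions in which each node carries its own per-band weights. Everything concrete in it is an assumption made for illustration: the class names (`no_food`, `food_a`, `food_b`), the 8-band synthetic "spectra", the weight formula (absolute mean difference divided by the pooled standard deviation), and the weighted nearest-centroid rule used as the binary classifier at each node.

```python
import numpy as np

class Node:
    """One binary decision between two groups of classes (hypothetical helper)."""
    def __init__(self, left, right):
        self.left, self.right = left, right  # each side is a Node or a class label
        self.weights = None                  # per-band weights learned for this node
        self.centroids = None                # (mean of left group, mean of right group)

def leaves(tree):
    """Return all class labels reachable below a node or leaf."""
    if isinstance(tree, Node):
        return leaves(tree.left) + leaves(tree.right)
    return [tree]

def fit(node, X, y):
    """Learn band weights and centroids for every node, top-down."""
    if not isinstance(node, Node):
        return
    left_mask = np.isin(y, leaves(node.left))
    right_mask = np.isin(y, leaves(node.right))
    mu_l, mu_r = X[left_mask].mean(axis=0), X[right_mask].mean(axis=0)
    spread = X[left_mask | right_mask].std(axis=0) + 1e-9
    # Assumed weighting: bands that separate the two groups get more weight.
    w = np.abs(mu_l - mu_r) / spread
    node.weights = w / w.sum()
    node.centroids = (mu_l, mu_r)
    fit(node.left, X[left_mask], y[left_mask])
    fit(node.right, X[right_mask], y[right_mask])

def predict(node, x):
    """Follow binary decisions from the root down to a class label."""
    while isinstance(node, Node):
        mu_l, mu_r = node.centroids
        d_l = np.sum(node.weights * (x - mu_l) ** 2)
        d_r = np.sum(node.weights * (x - mu_r) ** 2)
        node = node.left if d_l <= d_r else node.right
    return node

# Synthetic 8-band profiles (illustrative only, not challenge data).
profiles = {
    "no_food": np.zeros(8),
    "food_a": np.array([0, 0, 0, 0, 0, 3, 3, 3], dtype=float),
    "food_b": np.array([0, 0, 3, 3, 0, 3, 3, 3], dtype=float),
}
rng = np.random.default_rng(0)
X = np.vstack([p + 0.3 * rng.standard_normal((50, 8)) for p in profiles.values()])
y = np.repeat(list(profiles.keys()), 50)

# First decide food vs. no food, then which food.
tree = Node("no_food", Node("food_a", "food_b"))
fit(tree, X, y)
predictions = {name: predict(tree, p) for name, p in profiles.items()}
```

In this toy setup the root node weights the bands that separate eating from not eating, while the inner node concentrates its weight on the two bands in which the hypothetical foods differ. That illustrates why per-node weighting can be finer-grained than a single weighting shared by all classes.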