When humans communicate with each other, they use not only speech but also several gestures such as facial expression, gaze, head movements, hand movements, and body posture. In this paper, we propose a novel method for recognizing head gestures that accompany speech. The proposed method tracks head movements that accompany speech by localizing the mouth position with a microphone array system. The proposed system is based only on acoustic information and never utilizes visual information. We also propose a recognition method for the mouth-position trajectory, in which Higher- Order Local Cross Correlation is applied to the trajectory. The recognition accuracy of the proposed method was on an average 90.25% for nineteen kinds of head gesture recognition tasks conducted in an open test manner, which outperformed the Hidden Markov Model-based method.
Index Terms: head gesture recognition, higher-order local cross correlation, microphone array