ISCA Archive AVSP 2011

Speech-driven lip motion generation for tele-operated humanoid robots

Carlos T. Ishi, Chaoran Liu, Hiroshi Ishiguro, Norihiro Hagita

In order to tele-operate the lip motion of a humanoid robot (such as an android) from the utterances of the operator, we developed a speech-driven lip motion generation method. The proposed method is based on the rotation of the vowel space, given by the first and second formants, around the center vowel, and a mapping to lip opening degrees. The method requires calibration of only one parameter for speaker normalization, so no further model training is required. In a pilot experiment, the proposed audio-based method was perceived as more natural than vision-based approaches, regardless of the language.
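The core idea of the method can be illustrated with a minimal sketch: treat the first two formants (F1, F2) as a point in the vowel plane, take its angular position around the speaker's center vowel, and map that to a lip opening degree. The center-vowel formant values, the scaling constant, and the mapping function below are illustrative assumptions, not the paper's calibrated values.

```python
import math

def lip_opening_degree(f1, f2, f1_center=500.0, f2_center=1500.0, k=0.6):
    """Hypothetical sketch of a formant-to-lip-opening mapping.

    The (F1, F2) point is expressed relative to an assumed center vowel,
    and its angle around that center is mapped to an opening degree in
    [0, 1]. The center formants (f1_center, f2_center) and the scale k
    are placeholders; the paper calibrates a single speaker-
    normalization parameter instead.
    """
    # Offset from the center vowel in the formant plane
    dx = f2 - f2_center
    dy = f1 - f1_center
    # Angle of the formant point around the center vowel
    theta = math.atan2(dy, dx)
    # Open vowels (high F1) map to larger opening degrees
    opening = 0.5 * (1.0 + math.sin(theta)) * k
    return max(0.0, min(1.0, opening))
```

With typical formant values, an open vowel like /a/ (high F1) yields a larger opening than a close vowel like /i/ (low F1, high F2), which is the qualitative behavior the mapping is meant to capture.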

Index Terms: lip motion, formant, humanoid robot, teleoperation, synchronization