ISCA Archive Interspeech 2024

Production of phrases by mechanical models of the human vocal tract

Takayuki Arai, Ryohei Suzuki, Chandler Earp, Shinya Tsuji, Keiko Ochi

Three types of mechanical models of the human vocal tract were used to produce utterances in English and Japanese. The first model, VTM-S24-1, is a sliding three-tube model in which the inner column simply moves back and forth inside a straight outer pipe. Despite this simple structure, we were able to produce “How are you?” with this model and a single actuator. The second model, VTM-UT-D11, is based on the model made by Umeda and Teranishi. Eleven actuators control the positions of ten plastic bars in the main vocal tract and the velopharyngeal opening to adjust the vocal-tract configuration dynamically. With this model, both the English phrase “How are you?” and the Japanese phrase for “Good morning” were successfully produced. Finally, the third model, VTM-UT-D6, was newly designed. It has only six movable blocks inserted in the main vocal tract and no side branch. With this model, a linear cam mechanism successfully produced five vowels continuously. In addition, a rotating cam mechanism produced “How are you?” with high intelligibility.
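As an illustration of why a single actuator can suffice for the sliding three-tube model, the following minimal sketch treats the vocal tract as three consecutive sections (back cavity, constriction, front cavity) whose lengths depend only on the slider position. All dimensions, names, and default values here are hypothetical placeholders chosen for illustration; they are not taken from VTM-S24-1 or from the authors' control software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    length_mm: float
    area_mm2: float


def three_tube_sections(slider_pos_mm, tube_len_mm=170.0, slider_len_mm=40.0,
                        tube_area_mm2=800.0, gap_area_mm2=50.0):
    """Return [back cavity, constriction, front cavity], glottis to lips.

    slider_pos_mm is the distance from the glottis end to the rear face of
    the inner column. Every default value is an assumed placeholder.
    """
    max_pos = tube_len_mm - slider_len_mm
    if not 0.0 <= slider_pos_mm <= max_pos:
        raise ValueError("slider position outside the outer pipe")
    back = Section(slider_pos_mm, tube_area_mm2)
    constriction = Section(slider_len_mm, gap_area_mm2)
    front = Section(max_pos - slider_pos_mm, tube_area_mm2)
    return [back, constriction, front]


def interpolate_positions(waypoints_mm, steps):
    """Linearly interpolate slider positions between successive waypoints."""
    if not waypoints_mm:
        return []
    out = []
    for a, b in zip(waypoints_mm, waypoints_mm[1:]):
        for i in range(steps):
            out.append(a + (b - a) * i / steps)
    out.append(waypoints_mm[-1])
    return out


if __name__ == "__main__":
    # Hypothetical waypoints; real targets for a phrase would be found empirically.
    for pos in interpolate_positions([20.0, 90.0, 40.0], steps=2):
        print(pos, three_tube_sections(pos))
```

Because the constriction length and the pipe length are fixed, one scalar (the slider position) determines the whole configuration, so an utterance reduces to a one-dimensional trajectory for the actuator to follow.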