ISCA Archive Interspeech 2015

Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study

Florent Bocquelet, Thomas Hueber, Laurent Girin, Christophe Savariaux, Blaise Yvert

This article presents a pilot study on the real-time control of an articulatory synthesizer based on a deep neural network (DNN), in the context of silent speech interfaces. The underlying hypothesis is that a silent speaker could benefit from real-time audio feedback to regulate their own production. We use 3D electromagnetic articulography (EMA) to capture speech articulation, a DNN to convert EMA to spectral trajectories in real time, and a standard vocoder excited by white noise for audio synthesis. As shown by recent literature on silent speech, the articulatory-to-acoustic model must be adapted to account for possible inconsistencies between the initial training phase and practical usage conditions. Here we focus on differences in sensor setup across sessions (for the same speaker). Model adaptation is performed by cascading another neural network with the DNN used for articulatory-to-acoustic mapping. The intelligibility of the synthetic speech signal converted in real time is evaluated using both objective and perceptual measurements.
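
The cascade described above can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the feature dimensions (`N_EMA`, `N_SPEC`), the tanh activations, and the helper names (`make_mlp`, `forward`, `convert_frame`) are all hypothetical, and the weights are random rather than trained. It also assumes the adaptation network sits in front of the mapping DNN, converting EMA frames from a new session into the sensor space of the training session; the abstract does not state on which side the cascade is placed.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EMA = 14   # hypothetical number of EMA coordinates per frame
N_SPEC = 25  # hypothetical number of spectral coefficients per frame


def make_mlp(sizes, rng):
    """Build a list of (weights, bias) pairs with small random weights."""
    return [
        (rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(layers, x):
    """Feed-forward pass: tanh on hidden layers, linear output layer."""
    for i, (weights, bias) in enumerate(layers):
        x = x @ weights + bias
        if i < len(layers) - 1:
            x = np.tanh(x)
    return x


# Articulatory-to-acoustic DNN; in practice trained once on a reference
# session and then kept fixed. Here it is untrained (random weights).
mapping_dnn = make_mlp([N_EMA, 64, 64, N_SPEC], rng)

# Adaptation network (assumed placement: before the mapping DNN); in
# practice it would be trained on a small amount of new-session data.
adaptation_net = make_mlp([N_EMA, 32, N_EMA], rng)


def convert_frame(ema_frame):
    """Map new-session EMA frame(s) to spectral frame(s) via the cascade."""
    adapted = forward(adaptation_net, ema_frame)
    return forward(mapping_dnn, adapted)


frame = rng.normal(size=N_EMA)   # stand-in for one captured EMA frame
spectrum = convert_frame(frame)  # would be passed to the vocoder
```

In a real-time setting, `convert_frame` would be called once per incoming EMA frame, and the resulting spectral frame would drive the noise-excited vocoder. Keeping the mapping DNN fixed and training only the small adaptation network is one way to limit the amount of data needed from each new session.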