ISCA Archive Interspeech 2015

Generalized variable parameter HMMs based acoustic-to-articulatory inversion

Xurong Xie, Xunying Liu, Lan Wang, Rongfeng Su

Acoustic-to-articulatory inversion is useful for a range of related research areas including language learning, speech production, speech coding, speech recognition and speech synthesis. HMM-based generative modelling and DNN-based methods have become the dominant approaches in recent years. In this paper, a novel acoustic-to-articulatory inversion technique based on generalized variable parameter HMMs (GVP-HMMs) is proposed. It leverages the strengths of both the generative and the neural network based modelling frameworks. On a Mandarin speech inversion task, a tandem GVP-HMM system using DNN bottleneck features as auxiliary inputs significantly outperformed the baseline HMM, multiple regression HMM (MR-HMM), DNN and deep mixture density network (MDN) systems, reducing the electromagnetic articulography (EMA) root mean square error (RMSE) by 0.20 mm, 0.16 mm, 0.12 mm and 0.10 mm respectively.
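
For readers unfamiliar with the metric, the RMSE figures quoted above can be illustrated with a minimal sketch. It assumes that reference and predicted EMA trajectories are given as per-frame lists of coil coordinates in millimetres and that the error is pooled over all frames and channels. The function name `ema_rmse` and the toy data are hypothetical, and the paper may average over articulator channels or utterances differently.

```python
import math


def ema_rmse(reference, predicted):
    """Root mean square error pooled over all frames and channels.

    Both arguments are sequences of frames; each frame is a sequence of
    EMA coordinates. The result is in the same unit as the inputs
    (millimetres in this sketch).
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("trajectories must be non-empty and equally long")

    squared_error = 0.0
    count = 0
    for ref_frame, pred_frame in zip(reference, predicted):
        if len(ref_frame) != len(pred_frame):
            raise ValueError("frames must have the same number of channels")
        for ref_value, pred_value in zip(ref_frame, pred_frame):
            squared_error += (ref_value - pred_value) ** 2
            count += 1

    if count == 0:
        raise ValueError("frames must contain at least one channel")
    return math.sqrt(squared_error / count)


# Toy data (not from the paper): two frames, two channels each.
reference = [[0.0, 0.0], [0.0, 0.0]]
predicted = [[3.0, 4.0], [3.0, 4.0]]
rmse = ema_rmse(reference, predicted)
print(f"RMSE: {rmse:.3f} mm")
```

Under this pooled definition, a system that lowers the RMSE by 0.20 mm has moved the predicted coil positions closer to the measured ones by that amount in the root-mean-square sense.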