ISCA Archive Interspeech 2015

From text to formants — indirect model for trajectory prediction based on a multi-speaker parallel speech database

Kálmán Abari, Tamás Gábor Csapó, Bálint Pál Tóth, Gábor Olaszy

An indirect model is presented that estimates formant trajectories from text alone (Text-to-Formants, TTF). The result is a phonetically correct formant trajectory flow for any virtual speech signal, i.e. one that has never been uttered. The focus is on the shape of the trajectories within a given sound, taking into account the sound environment (up to quinphone), and not on individual formant value measurements. The model is based on a multi-speaker parallel speech database with precise manual corrections and an HMM-based formant trajectory predictor. The validation of the TTF model shows that formant trajectories can be predicted with good accuracy from text. The model indirectly gives information about a theoretically possible articulation flow of the sentence. Thus it gives a general 'formantprint' of the language.
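
To illustrate what a sound environment of up to quinphone width means, the following minimal sketch builds, for each sound in a sequence, a window of the sound plus its two left and two right neighbours. It is not the authors' implementation: the function name `quinphone_contexts`, the padding symbol `sil`, and the example phone sequence are assumptions made here for illustration only.

```python
from typing import List, Tuple

# Hypothetical padding symbol for positions beyond the utterance boundary.
PAD = "sil"


def quinphone_contexts(phones: List[str], pad: str = PAD) -> List[Tuple[str, ...]]:
    """Return one 5-element window per phone: two left neighbours,
    the phone itself, and two right neighbours."""
    # Pad both ends so that edge phones also get a full 5-element window.
    padded = [pad, pad] + list(phones) + [pad, pad]
    return [tuple(padded[i:i + 5]) for i in range(len(phones))]


# Illustrative phone sequence (an assumption, not taken from the database).
contexts = quinphone_contexts(["h", "a", "z"])
for ctx in contexts:
    print(ctx)
```

In a context-dependent HMM setup, windows of this kind would typically serve as the labels that select which model predicts the trajectory of the centre sound.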