We present an algorithm for augmenting the shape of the vocal tract
using 3D static and 2D dynamic speech MRI data. While static 3D images
have better resolution and provide spatial information, 2D dynamic
images capture the transitions. The aim of this work is to combine the
strengths of these two types of data in order to improve the image quality
of the 2D dynamic images and to extend them to the 3D domain.
To produce a 3D dynamic consonant-vowel (CV) sequence, our algorithm
takes as input the 2D CV transition and the static 3D targets for C
and V. To obtain the enhanced sequence of images, we first find
a transformation between the 2D images and the mid-sagittal
slice of the acoustically corresponding 3D image stack, and then find
a transformation between neighbouring sagittal slices in the 3D static
image stack. Combining these transformations produces
the final set of images. In the present study, we first examined the
transformation from the 3D mid-sagittal frame to the 2D video in order
to improve image quality, and then examined the extension of the
2D video to the third dimension with the aim of enriching spatial information.
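
The combination of the two transformations can be illustrated with a minimal
sketch. The text does not specify the transformation model, so this example
assumes simple 2D affine maps in homogeneous coordinates acting on a contour
point. The names static_to_dynamic and mid_to_neighbour, their numeric values,
and the order of composition are all hypothetical and chosen only for
illustration.

```python
# Hypothetical sketch: 2D affine maps stand in for the (unspecified)
# transformations between images described in the text.

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    """Homogeneous matrix that shifts a point by (dx, dy)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    """Homogeneous matrix that scales a point by (sx, sy) about the origin."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def apply_transform(t, point):
    """Apply a 3x3 homogeneous affine matrix to a 2D point (x, y)."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])

# Assumed: maps the static mid-sagittal slice onto one 2D dynamic frame.
static_to_dynamic = translation(2.0, -1.0)

# Assumed: maps the mid-sagittal slice onto a neighbouring sagittal slice
# of the static 3D stack.
mid_to_neighbour = scaling(0.5, 0.5)

# One plausible combination: first align with the dynamic frame, then move
# to the neighbouring slice. The rightmost matrix is applied first.
combined = matmul3(mid_to_neighbour, static_to_dynamic)

contour_point = (4.0, 6.0)  # illustrative point on a vocal tract contour
neighbour_point = apply_transform(combined, contour_point)
```

Applying the combined matrix gives the same result as applying the two maps
one after the other, which is the property that lets a per-frame transformation
and a per-slice transformation be chained. A non-rigid model, such as a dense
displacement field, would follow the same pattern with a different composition
rule.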