This paper reports preliminary results of our effort to address the acoustic-to-articulatory inversion problem. We tested an approach that simulates speech production acquisition as a distal learning task, with acoustic signals of natural utterances in the form of MFCC as input, VocalTractLab . a 3D articulatory synthesizer controlled by target approximation models as the learner, and stochastic gradient descent as the training method. The approach was tested on a number of natural utterances, and the results were highly encouraging.