We evaluated 17 variants of 6 dissimilarity measures: PLOMP, Log Likelihood Ratio, Cepstrum, Mel Frequency Cepstral Coefficients, Weighted Slope Metric, and Spectral Peaks Adjustment. The variants are derived from FFT and/or LPC analysis, with two types of integration (KLATT and ZWICKER). As "references" we used synthetic and natural vocalic stimuli for which a phonetic structural representation is available.
The inter-vowel dissimilarities were used as input to a multidimensional scaling analysis (KRUSKAL), yielding an output space that we compared with the acoustic space obtained from the data. Comparing the two spaces (the first corresponding to the F1-F2 plane derived from acoustic analysis, the second rebuilt from the dissimilarities by KRUSKAL multidimensional scaling) lets us compare the dissimilarity measures and make an extrinsic (phonetic) judgment of their behavior. We used 5 criteria based on the ability of these procedures to deliver vocalic dissimilarities that can be interpreted in terms of phonetic description (acoustic representation). This comparative evaluation can guide the choice of dissimilarity measure for automatic recognition and for phonetic analysis, including perceptual simulation.
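
As an illustration of how a set of vocalic dissimilarities can be confronted with an F1-F2 configuration, the sketch below computes Kruskal's stress-1 between a dissimilarity table and the inter-point distances of a fixed two-dimensional configuration, using monotone regression (pool adjacent violators). It is not one of the 5 criteria used in this study, and it does not perform the full scaling (no configuration is optimized). The function names `pava` and `kruskal_stress`, the vowel labels, the formant-like coordinates, and the dissimilarity values are all hypothetical and chosen only for the example. Ties between dissimilarities are not treated specially.

```python
import math
from itertools import combinations


def pava(values):
    """Pool adjacent violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block holds [sum, count] of pooled values
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def kruskal_stress(dissim, points):
    """Kruskal stress-1 of a fixed configuration against a dissimilarity table.

    dissim: dict mapping (label_a, label_b), labels in sorted order, to a value.
    points: dict mapping label to coordinates, e.g. (F1, F2).
    """
    pairs = list(combinations(sorted(points), 2))
    pairs.sort(key=lambda p: dissim[p])  # order pairs by increasing dissimilarity
    dists = [math.dist(points[a], points[b]) for a, b in pairs]
    disparities = pava(dists)  # monotone fit of distances in that order
    num = sum((d - h) ** 2 for d, h in zip(dists, disparities))
    den = sum(d * d for d in dists)
    return math.sqrt(num / den)


# Hypothetical (F1, F2) coordinates in Hz, for illustration only.
points = {"i": (300.0, 2300.0), "a": (700.0, 1300.0), "u": (300.0, 800.0)}

# Hypothetical dissimilarities: one ranks the pairs like the F1-F2 distances,
# the other ranks them in the opposite order.
consistent = {("a", "i"): 2.0, ("a", "u"): 1.0, ("i", "u"): 3.0}
reversed_order = {("a", "i"): 2.0, ("a", "u"): 3.0, ("i", "u"): 1.0}

stress_consistent = kruskal_stress(consistent, points)
stress_reversed = kruskal_stress(reversed_order, points)
```

A stress of zero means that the rank order of the dissimilarities agrees with the rank order of the F1-F2 distances; larger values indicate that the dissimilarity orders the vowel pairs differently from the acoustic space.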