ISCA Archive Interspeech 2024

A data-driven model of acoustic speech intelligibility for optimization-based models of speech production

Benjamin Elie, Juraj Simko, Alice Turk

This paper presents a data-driven model of intelligibility intended for use in an optimization-based model of speech production. The BiLSTM-based model is trained as a phoneme classifier: it takes a sequence of real articulatory trajectories as input and returns phoneme probabilities over time. The optimization minimizes a cost function defined as the weighted sum of two conflicting demands, intelligibility and least articulatory effort, with the intelligibility score computed by the data-driven model presented here. Simulations support Lindblom's hypo- and hyper-articulation theory of speech: the degree of hyper-articulation can be tuned along a continuum by balancing the importance given to the requirements of intelligibility and least articulatory effort.
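
To make the weighted-sum cost concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the individual terms, so everything here is hypothetical: the intelligibility term is taken to be the mean negative log-probability of the target phoneme per frame (as a classifier such as the BiLSTM might output), the effort term is the mean squared frame-to-frame displacement of the articulators, and a single weight `alpha` in [0, 1] balances the two.

```python
import math


def intelligibility_cost(posteriors, targets):
    """Mean negative log-probability of the target phoneme at each frame.

    Assumption: posteriors[i][k] is the classifier's probability of
    phoneme k at frame i; targets[i] is the intended phoneme index.
    """
    eps = 1e-12  # avoids log(0) when a target phoneme gets zero probability
    total = sum(math.log(max(p[t], eps)) for p, t in zip(posteriors, targets))
    return -total / len(targets)


def effort_cost(trajectory):
    """Mean squared frame-to-frame displacement (hypothetical effort proxy).

    Assumption: trajectory[i] lists the articulator positions at frame i,
    and the trajectory has at least two frames.
    """
    steps = [
        sum((b - a) ** 2 for a, b in zip(prev, cur))
        for prev, cur in zip(trajectory, trajectory[1:])
    ]
    return sum(steps) / len(steps)


def total_cost(posteriors, targets, trajectory, alpha):
    """Weighted sum of the two demands; alpha is the weight on intelligibility."""
    return (alpha * intelligibility_cost(posteriors, targets)
            + (1.0 - alpha) * effort_cost(trajectory))


# Toy data: two frames of posteriors over two phonemes, three frames of
# a two-articulator trajectory. The values are illustrative only.
posteriors = [[0.5, 0.5], [0.25, 0.75]]
targets = [0, 1]
trajectory = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, total_cost(posteriors, targets, trajectory, alpha))
```

In this sketch, a value of `alpha` close to 1 favors intelligibility and a value close to 0 favors low effort, which mirrors the hypo- to hyper-articulation continuum described in the abstract. An optimizer would search over candidate trajectories to minimize `total_cost`.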