ISCA Archive Interspeech 2024

Towards Self-Attention Understanding for Automatic Articulatory Processes Analysis in Cleft Lip and Palate Speech

Ilja Baumann, Dominik Wagner, Maria Schuster, Korbinian Riedhammer, Elmar Noeth, Tobias Bocklet

Cleft lip and palate (CLP) speech presents unique challenges for automatic phoneme analysis due to its distinct acoustic characteristics and articulatory anomalies. We perform phoneme analysis in CLP speech using a pre-trained wav2vec 2.0 model with a multi-head self-attention classification module that captures long-range dependencies within the speech signal, enabling better contextual understanding of phoneme sequences. We demonstrate the effectiveness of our approach in classifying various articulatory processes in CLP speech. Furthermore, we investigate the interpretability of self-attention to gain insight into which CLP speech characteristics the model relies on. Our findings highlight the potential of self-attention mechanisms for improving automatic phoneme analysis in CLP speech, paving the way for enhanced diagnostics and more interpretable results for therapists and affected patients.
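
To illustrate the general idea of a multi-head self-attention classification module operating on frame-level speech features, the following is a minimal sketch in plain Python. It is not the architecture used in this work: the feature dimension, number of heads, mean pooling, random weights, and all function names (`self_attention_head`, `classify`) are assumptions made for illustration, and the random input frames merely stand in for wav2vec 2.0 outputs. The sketch returns both the class probabilities and the per-head attention weights, which are the quantities one would inspect when studying interpretability.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention_head(frames, Wq, Wk, Wv):
    """One scaled dot-product attention head over a sequence of frames.

    Returns (outputs, weights), where weights[i][j] is how strongly
    frame i attends to frame j.
    """
    Q = [matvec(Wq, f) for f in frames]
    K = [matvec(Wk, f) for f in frames]
    V = [matvec(Wv, f) for f in frames]
    d_k = len(Q[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * V[j][c] for j in range(len(V)))
                        for c in range(d_k)])
    return outputs, weights


def classify(frames, heads, W_out):
    """Run all heads, concatenate per frame, mean-pool over time, classify."""
    head_outputs, head_weights = [], []
    for Wq, Wk, Wv in heads:
        out, w = self_attention_head(frames, Wq, Wk, Wv)
        head_outputs.append(out)
        head_weights.append(w)
    # Concatenate the outputs of all heads for each frame.
    concat = [sum((h[t] for h in head_outputs), [])
              for t in range(len(frames))]
    # Mean pooling over time (an assumption; other pooling schemes exist).
    dim = len(concat[0])
    pooled = [sum(row[c] for row in concat) / len(concat) for c in range(dim)]
    probs = softmax(matvec(W_out, pooled))
    return probs, head_weights


# Hypothetical sizes, chosen only to keep the example small.
FEAT_DIM, HEAD_DIM, NUM_HEADS, NUM_CLASSES, NUM_FRAMES = 8, 4, 2, 3, 5

rng = random.Random(0)
# Random vectors standing in for frame-level wav2vec 2.0 features.
frames = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(NUM_FRAMES)]
heads = [tuple(rand_matrix(HEAD_DIM, FEAT_DIM, rng) for _ in range(3))
         for _ in range(NUM_HEADS)]
W_out = rand_matrix(NUM_CLASSES, HEAD_DIM * NUM_HEADS, rng)

probs, attn = classify(frames, heads, W_out)
```

Here `attn[h][i][j]` is the weight that head `h` assigns to frame `j` when computing the output for frame `i`. Each row sums to one, so it can be read as a distribution over the frames that contributed to a decision.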