ISCA Archive Interspeech 2024

Automatic Assessment of Dysarthria using Speech and synthetically generated Electroglottograph signal

Fathima Zaheera, Supritha Shetty, Gayadhar Pradhan, Deepak K T

The formants are flat and dispersed in the short-term magnitude spectra (STMS) of dysarthric speech. This paper investigates the possibility of enhancing the performance of an automated dysarthric assessment system by exploiting the complementary perceptual cues present in the STMS of speech and of a synthetically generated electroglottograph (EGG) signal. To capture the complementary information in a single acoustic feature representation, the log Mel filterbank energies (LMFEs) computed from the two signals are averaged. The averaged LMFEs are then used to compute Mel-frequency cepstral coefficients (MFCCs). The analytical and experimental results presented on the UA-Speech corpus validate the efficacy of the proposed approach. For the x-vector-based automated dysarthric assessment system, accuracy improved from 73% to 78% and F1 score from 64% to 71% in a speaker- and text-independent mode when the MFCCs are computed from the averaged LMFEs.
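
The feature fusion described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two LMFE matrices (frames by Mel bands) are already computed and frame-aligned, that "averaged" means an element-wise arithmetic mean, and that the MFCCs are obtained with an unnormalized DCT-II keeping 13 coefficients. The function names `average_lmfe`, `dct_ii`, and `mfcc_from_lmfe` are hypothetical.

```python
import math


def average_lmfe(lmfe_speech, lmfe_egg):
    """Element-wise mean of two LMFE matrices (frames x Mel bands).

    Assumption: both inputs are frame-aligned and share the same filterbank.
    """
    if len(lmfe_speech) != len(lmfe_egg):
        raise ValueError("frame counts differ")
    averaged = []
    for frame_s, frame_e in zip(lmfe_speech, lmfe_egg):
        if len(frame_s) != len(frame_e):
            raise ValueError("band counts differ")
        averaged.append([(s + e) / 2.0 for s, e in zip(frame_s, frame_e)])
    return averaged


def dct_ii(frame, num_coeffs):
    """Unnormalized DCT-II of one frame, keeping the first num_coeffs terms."""
    n = len(frame)
    return [
        sum(
            x * math.cos(math.pi * k * (2 * m + 1) / (2 * n))
            for m, x in enumerate(frame)
        )
        for k in range(num_coeffs)
    ]


def mfcc_from_lmfe(lmfe, num_coeffs=13):
    """MFCCs per frame from (averaged) LMFEs; 13 coefficients is an assumed default."""
    return [dct_ii(frame, min(num_coeffs, len(frame))) for frame in lmfe]
```

For example, `mfcc_from_lmfe(average_lmfe(lmfe_speech, lmfe_egg))` would yield one MFCC vector per frame, which could then be passed to a downstream classifier such as the x-vector system mentioned in the abstract.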