ISCA Archive Interspeech 2023

Model-assisted Lexical Tone Evaluation of three-year-old Chinese-speaking Children by also Considering Segment Production

Shu-Chuan Tseng, Yi-Fen Liu, Xiang-Li Lu

This paper presents a hybrid workflow for lexical tone evaluation of 3-year-old Chinese-speaking children. The speech data of 123 children were phonetically transcribed for phoneme accuracy and perceptually evaluated for tone accuracy by human judgement. A transformer-based tone model with a BERT input architecture was built using the speech data and tested on twelve children with low speech performance. In the overall evaluation, the accuracy rates between the judged tones and the tones predicted by our model were high. More consistent patterns between judged and predicted tones were observed for high-register Tone 1 and Tone 4 than for low-register Tone 2 and Tone 3. We also found that a child's tone production ability is consistently reflected in the production of consonants, vowels, and syllables. Tone accuracy is more closely related to vowel accuracy than to consonant accuracy. In particular, the most diverse differences among tone, consonant, and vowel accuracies were observed for Tone 3.
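As an illustration of what comparing judged and predicted tones per tone category can look like, the following is a minimal sketch, not the authors' evaluation code. It assumes each syllable carries one human-judged tone label and one model-predicted tone label; the function name `per_tone_agreement` and the toy labels are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict


def per_tone_agreement(judged, predicted):
    """Return {tone: share of syllables where predicted == judged},
    grouped by the human-judged tone label."""
    if len(judged) != len(predicted):
        raise ValueError("judged and predicted must have the same length")
    totals = defaultdict(int)   # syllables per judged tone
    matches = defaultdict(int)  # syllables where the model agrees with the judge
    for j, p in zip(judged, predicted):
        totals[j] += 1
        if j == p:
            matches[j] += 1
    return {tone: matches[tone] / totals[tone] for tone in sorted(totals)}


# Toy labels for illustration only (hypothetical, not from the paper).
judged = [1, 1, 2, 3, 3, 4, 4, 4]
predicted = [1, 1, 2, 2, 3, 4, 4, 1]

rates = per_tone_agreement(judged, predicted)
overall = sum(j == p for j, p in zip(judged, predicted)) / len(judged)
print(rates, overall)
```

Grouping by the judged tone treats the human label as the reference; grouping by the predicted tone instead would answer a different question (how trustworthy each model output is), so the choice should be stated when such rates are reported.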