ISCA Archive Interspeech 2022

Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks

Mahir Morshed, Mark Hasegawa-Johnson

We propose a system for the lateral transfer of information from end-to-end neural networks that recognize articulatory feature classes to similarly structured networks that recognize phone tokens. Inspired primarily by the progressive neural network scheme, the system connects recurrent layers of feature detectors pre-trained on a base language to recurrent layers of a phone recognizer for a different target language. Initial experiments used detectors trained on Bengali speech for four articulatory feature classes (consonant place, consonant manner, vowel height, and vowel backness), attached to phone recognizers for four other Asian languages (Javanese, Nepali, Sinhalese, and Sundanese). These experiments do not currently show consistent performance improvements across different low-resource settings for the target languages, irrespective of their genealogical or phonological relatedness to Bengali. They do, however, suggest the need for further trials with different language sets, altered data sources and data configurations, and slightly altered network setups.
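As a rough illustration of the lateral connection idea, the following minimal sketch passes the hidden state of a frozen recurrent column into the second recurrent layer of a target column through a lateral weight matrix, in the spirit of a progressive neural network. Everything specific here is an assumption made for illustration and is not taken from the paper: the single frozen column (the abstract describes four feature detectors), the plain tanh recurrent units, the layer sizes, the random stand-in weights, and the hypothetical names `rnn_step` and `run_columns`.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h_prev, lateral_term=None):
    """One tanh recurrent step; lateral_term is an optional additive input."""
    pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, h_prev))]
    if lateral_term is not None:
        pre = [p + l for p, l in zip(pre, lateral_term)]
    return [math.tanh(p) for p in pre]


def run_columns(frames, n_in=3, n_hid=4, use_lateral=True, seed=0):
    """Run a frozen column and a target column over a sequence of frames.

    Returns the target column's second-layer hidden state at every frame.
    """
    rng = random.Random(seed)
    # Frozen column: stands in for a feature detector pre-trained on the
    # base language (weights here are random placeholders, never updated).
    f_in = make_matrix(n_hid, n_in, rng)
    f_rec = make_matrix(n_hid, n_hid, rng)
    # Target column: stands in for the phone recognizer (two recurrent layers).
    t1_in = make_matrix(n_hid, n_in, rng)
    t1_rec = make_matrix(n_hid, n_hid, rng)
    t2_in = make_matrix(n_hid, n_hid, rng)
    t2_rec = make_matrix(n_hid, n_hid, rng)
    # Lateral weights: map the frozen column's hidden state into layer 2
    # of the target column.
    lateral = make_matrix(n_hid, n_hid, rng)

    f_h = [0.0] * n_hid
    t_h1 = [0.0] * n_hid
    t_h2 = [0.0] * n_hid
    outputs = []
    for x in frames:
        f_h = rnn_step(f_in, f_rec, x, f_h)
        t_h1 = rnn_step(t1_in, t1_rec, x, t_h1)
        lateral_term = matvec(lateral, f_h) if use_lateral else None
        t_h2 = rnn_step(t2_in, t2_rec, t_h1, t_h2, lateral_term)
        outputs.append(t_h2)
    return outputs


frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
with_transfer = run_columns(frames, use_lateral=True)
baseline = run_columns(frames, use_lateral=False)
```

Calling `run_columns` with `use_lateral=False` gives a target column with the same weights but no transferred information, which is the kind of baseline such a system would be compared against. In a trained system, only the target column and the lateral weights would be updated while the base-language column stays fixed.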