ISCA Archive Interspeech 2025

Extending the Fongbe to French Speech Translation Corpus: resources, models and benchmark

D. Fortuné Kponou, Salima Mdhaffar, Fréjus A. A. Laleye, Eugène C. Ezin, Yannick Estève

This paper introduces FFSTC 2, an expanded version of the existing Fongbe-to-French speech translation corpus, addressing the critical need for resources in African dialects for speech recognition and translation tasks. We extended the dataset by adding 36 hours of transcribed audio, bringing the total to 61 hours, thereby enhancing its utility for both automatic speech recognition (ASR) and speech translation (ST) in Fongbe, a low-resource language. Using this corpus, we develop both cascade and end-to-end speech translation systems. Our models employ AfriHuBERT and HuBERT147 as encoders, and NLLB and mBART as decoders. We introduce a diacritic-substitution technique for both ASR and machine translation (MT), which yields a BLEU score of 37.23 compared to 39.60 for the fully diacritized configuration. Among the evaluated end-to-end architectures, AfriHuBERT-NLLB with data augmentation attains the highest BLEU score of 26.32.