ISCA Archive Interspeech 2025

The Faetar Speech Recognition Benchmark

Michael Ong, Sean Robertson, Leo Peckham, Alba Jorquera Jimenez de Aberasturi, Paula Arkhangorodsky, Robin Huo, Aman Sakhardande, Mark Hallap, Naomi Nagy, Ewan Dunbar
We introduce the Faetar Automatic Speech Recognition Benchmark, a benchmark corpus designed to push the limits of current approaches to low-resource speech recognition. Faetar, a Franco-Provençal variety spoken primarily in Italy, has no standard orthography, has virtually no existing textual or speech resources beyond those included in the benchmark, and is quite different from other forms of Franco-Provençal. The corpus consists of field recordings, most of which are noisy. Only 5 hours have matching transcriptions, and those transcriptions are inconsistent. The corpus also contains an additional 20 hours of unlabelled speech. We report baseline results from multilingual speech foundation models, with a best phone error rate of 30.5%, obtained using a pipeline that continues pre-training the foundation model on the unlabelled set.
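For readers unfamiliar with the metric, phone error rate (PER) is conventionally computed as the total edit distance between reference and hypothesis phone sequences, divided by the total number of reference phones. The following is a minimal sketch of that conventional definition, not the benchmark's own scoring code. The function names `edit_distance` and `phone_error_rate` are hypothetical, and it assumes each utterance is already tokenized into a list of phone symbols.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 reference phones
    # and the first j hypothesis phones.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference phones so far
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phone_error_rate(
    refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]
) -> float:
    """Corpus-level PER: summed edit distance over summed reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_phones = sum(len(r) for r in refs)
    if total_phones == 0:
        raise ValueError("reference contains no phones")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_phones
```

Under this definition, a PER of 30.5% would correspond to roughly 30.5 substitution, deletion, or insertion errors per 100 reference phones, aggregated over the evaluation set.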