ISCA Archive Interspeech 2024
ISCA Archive Interspeech 2024

NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

Alon Vinnikov, Amir Ivry, Aviv Hurvitz, Igor Abramovski, Sharon Koubi, Ilya Gurvich, Shai Peer, Xiong Xiao, Benjamin Martinez Elizalde, Naoyuki Kanda, Xiaofei Wang, Shalev Shaer, Stav Yagev, Yossi Asher, Sunit Sivasankaran, Yifan Gong, Min Tang, Huaming Wang, Eyal Krupka

We introduce the first Natural Office Talkers in Settings of Far-field Audio Recordings (NOTSOFAR) Challenge, datasets, and baseline system. The challenge focuses on distant speaker diarization and automatic speech recognition (DASR) in meeting scenarios, with single-channel and known-geometry multi-channel tracks, using a single device. We launch two new datasets: First, a benchmark dataset of 280 English meetings, averaging 6 minutes each, capturing a broad spectrum of acoustic and conversational patterns across 30 rooms with 4-8 attendees. Second, a 1000-hour simulated training dataset, synthesized for real-world generalization, incorporating 15,000 real acoustic transfer functions. The NOTSOFAR-1 Challenge aims to advance research in the field of DASR, providing key resources to unlock the potential of data-driven methods, which we believe are currently constrained by the absence of comprehensive high-quality training and benchmark datasets.