ISCA Archive ICSLP 2002

Japanese broadcast news transcription

Long Nguyen, Xuefeng Guo, Richard Schwartz, John Makhoul

In this paper, we describe the ongoing development of a Japanese Broadcast News Transcription system at BBN Technologies. This work is a collaboration between BBN and NHK to use automatic speech recognition technology to provide live closed captioning for NHK’s TV news programs in Japan. We describe what the NHK Broadcast News Corpus comprises and how we adopted transcription technology developed for the Hub-4 English broadcast news task to achieve an overall word error rate (WER) of less than 5% for Japanese TV news programs. We also report on how we obtained a 30-50% relative WER reduction for weather forecasts and sports news through the use of micro-domain lexicons and language models.
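
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how WER and relative WER reduction are conventionally computed. It is not the scoring tool used in this work. The function names and the toy inputs are hypothetical, and the sketch assumes the reference and hypothesis have already been segmented into word tokens (for Japanese, the choice of segmentation affects the resulting figure).

```python
def word_error_rate(reference, hypothesis):
    """Conventional WER: (substitutions + deletions + insertions) / reference length.

    Both arguments are lists of word tokens (assumed pre-segmented).
    """
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def relative_wer_reduction(baseline_wer, improved_wer):
    """Fraction of the baseline WER that was removed."""
    return (baseline_wer - improved_wer) / baseline_wer


# Toy example (illustrative values only, not results from the paper):
# one substitution and one deletion against a four-word reference.
wer = word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"])
reduction = relative_wer_reduction(0.10, 0.06)
```

Under this definition, a relative reduction of 30-50% means the error rate falls to between 70% and 50% of its baseline value, which differs from a drop of 30-50 absolute percentage points.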