ISCA Archive Interspeech 2025

JIS: A Speech Corpus of Japanese Idol Speakers with Various Speaking Styles

Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko

We construct the Japanese Idol Speech Corpus (JIS) to advance research in speech generation AI, including text-to-speech synthesis (TTS) and voice conversion (VC). JIS will facilitate more rigorous evaluations of speaker similarity in TTS and VC systems because all speakers in JIS belong to a highly specific category, “young female live idols” in Japan, and each speaker is identified by a stage name, enabling researchers to recruit listeners familiar with these idols for listening experiments. With its unique speaker attributes, JIS will foster compelling research, including generating voices tailored to listener preferences, an area not yet widely studied. JIS will be distributed free of charge to promote research in speech generation AI, with usage restricted to non-commercial, basic research. We describe the construction of JIS, provide an overview of Japanese live idol culture to support effective and ethical use of JIS, and offer a basic analysis to guide the application of JIS.

Erratum

6. Acknowledgement

This work was supported by JST CREST Grant Number JPMJCR19A3.