ISCA Archive Odyssey 2014

Speaker-basis Accent Clustering Using Invariant Structure Analysis and the Speech Accent Archive

Nobuaki Minematsu, Shun Kasahara, Takehiko Makino, Daisuke Saito, Keikichi Hirose

English is the only language available for international communication and is used by approximately 1.5 billion speakers. It is also known for its large diversity of pronunciation, called accents, which arises from the influence of speakers' mother tongues. Our project aims to create a global, speaker-basis map of English accents for use in teaching and learning World Englishes (WE) as well as in research on WE. Mathematically, creating the map requires a matrix of accent distances among all the speakers considered; technically, it requires a method for predicting the accent distance between any pair of speakers using only their speech samples. The results of our first trials were presented previously, together with some technical problems found through the experiments. This paper describes recent progress and adds an explanation of the invariant structure, which was omitted from our previous papers for lack of space. The use of the invariant structure and Support Vector Regression (SVR) yields strikingly high performance in predicting accent distances in a speaker-pair-open mode, but the performance is not sufficient in a speaker-open mode.
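As a rough illustration of how a speaker-by-speaker accent distance matrix can be turned into groups of speakers, the sketch below applies average-linkage agglomerative clustering to a toy matrix. The abstract does not state which clustering algorithm the authors use, so the choice of average linkage, the function name `average_linkage`, and the four-speaker distance values are all assumptions made purely for illustration.

```python
from itertools import combinations


def average_linkage(dist, k):
    """Group speakers 0..n-1 into k clusters from a symmetric distance matrix.

    Illustrative only: average linkage is an assumed choice, not necessarily
    the clustering method used in the paper.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest mean inter-speaker distance.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (a < b, so deleting b is safe).
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


# Made-up accent distances among four hypothetical speakers.
toy = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
groups = average_linkage(toy, 2)
print(groups)
```

In this toy case speakers 0 and 1 end up in one group and speakers 2 and 3 in the other. In the setting described above, the entries of the matrix would be accent distances predicted from speech samples rather than hand-written values.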