ISCA Archive Odyssey 2008
ISCA Archive Odyssey 2008

Accent identification in the presence of code-mixing

Thomas Niesler, Febe de Wet

We investigate whether automatic accent identification is more effective for English utterances embedded in a different language as part of a mixed code than for English utterances that are part of a monolingual dialogue. Our focus is on Xhosa and Zulu, two South African languages for which code mixing with English is very common. In order to carry out our investigation, we extract English utterances from mixed-code Xhosa and Zulu speech corpora, as well as comparable utterances from an Englishonly corpus by Xhosa and Zulu mother-tongue speakers. Experiments show that accent identification is substantially more accurate for the utterances originating from the mixed-code speech. We conclude that accent identification is more successful for these utterances because accents are more pronounced for English embedded in mother-tongue speech than for English spoken as part of a monolingual dialogue by non-native speakers.