ISCA Archive Interspeech 2020

Malayalam-English Code-Switched: Grapheme to Phoneme System

Sreeja Manghat, Sreeram Manghat, Tanja Schultz

Grapheme-to-phoneme (G2P) conversion is an integral part of speech processing. Conversational speech in Malayalam, a low-resource Indic language, exhibits inter-sentential and intra-sentential code-switching as well as frequent intra-word code-switching with English. Monolingual G2P systems cannot process such intra-word code-switching. A G2P system that handles code-switching, developed from Malayalam-English code-switched speech and text corpora, is presented. Since neither Malayalam nor English is a phonetic subset of the other, the overlapping phonemes of English and Malayalam are identified and analysed. The additional rules used to handle special cases of Malayalam phonemes and intra-word code-switching in the G2P system are also presented.
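
To make the intra-word case concrete, the following is a minimal sketch of one possible preprocessing step, not the system described in this paper. It assumes that an intra-word switch shows up as a change of script inside a single token (Latin letters attached to Malayalam characters), which the abstract does not state. The function names `script_of` and `split_by_script` are hypothetical. The only external fact relied on is that the Malayalam Unicode block spans U+0D00 to U+0D7F.

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by script, using Unicode code point ranges."""
    if "\u0d00" <= ch <= "\u0d7f":  # Malayalam Unicode block
        return "ml"
    if ch.isascii() and ch.isalpha():  # basic Latin letters, assumed to be English
        return "en"
    return "other"


def split_by_script(word: str) -> list[tuple[str, str]]:
    """Split a token into maximal runs of characters that share a script.

    Each run could then be routed to a language-specific G2P component
    (an assumption made for illustration only).
    """
    return [
        (script, "".join(chars))
        for script, chars in groupby(word, key=script_of)
    ]


if __name__ == "__main__":
    # Hypothetical mixed-script token: Latin "car" followed by Malayalam characters.
    token = "car\u0d3f\u0d7d"
    for script, segment in split_by_script(token):
        print(script, segment)
```

A script-based split only covers tokens written in mixed script. Code-switched words transcribed entirely in one script would need a different detection strategy, such as a lexicon lookup.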