Indian languages are broadly classified as Indo-Aryan or Dravidian.
The basic set of phones is largely shared across these languages, which
differ mostly in their phonotactics. Sounds and words have also been
borrowed across languages over time as cultures intermixed.
Since syllables are fundamental units of speech production and Indian
languages are characterised by syllable-timed rhythm, acoustic analysis
of syllables has been carried out. In this paper, instances of the
common and most frequent syllables in continuous speech have been
studied across six Indian languages, from both the Indo-Aryan and
Dravidian language groups. The distributions of acoustic features have
been compared across these languages. Such analysis is useful for
developing speech technologies in multilingual settings. Owing to similarities
in the languages, text-to-speech (TTS) synthesisers have been developed
by segmenting speech data at the phone level, using hidden Markov models
(HMMs) from other languages as initial models. Degradation mean opinion
scores and word error rates indicate that the quality of the synthesised
speech is comparable to that of TTS systems developed by segmenting the
data using language-specific HMMs.
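
For readers unfamiliar with the second evaluation measure, word error rate is conventionally computed as the word-level edit distance between a reference transcript and a hypothesis transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition. The function name and the whitespace tokenisation are assumptions made for illustration, not the evaluation setup used in this work.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenisation is plain whitespace splitting,
    which is an assumption and may not suit every script or language.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

In a listening test, the reference would be the text given to the synthesiser and the hypothesis would be what a listener transcribed, so a lower value indicates more intelligible speech.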