ISCA Archive Eurospeech 2003

A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering

Ramon Prieto, Jing Jiang, Chi-Ho Choi

A new framework for time-domain voiced phoneme recognition is presented. Each speech frame used for training and recognition is bounded by consecutive glottal closures. A pre-processing stage is designed and implemented to model pitch-synchronous frames with Gaussian mixture models. Component analysis carried out on the data shows that optimal performance is reached with a very small number of components, which keeps the computational cost low. We also designed a new clustering technique that uses the pitch period and gives better results than well-known clustering algorithms such as k-means.
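As an illustration of the pitch-synchronous framing described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the glottal closure instants are already available as sample indices from some separate detector; the function name `pitch_synchronous_frames` and its inputs are hypothetical.

```python
from typing import List, Sequence, Tuple


def pitch_synchronous_frames(
    signal: Sequence[float],
    closures: Sequence[int],
) -> List[Tuple[int, List[float]]]:
    """Cut `signal` into frames bounded by consecutive glottal closures.

    `closures` holds the sample indices of glottal closure instants.
    They are assumed to come from a separate detector (an assumption of
    this sketch; the abstract does not say how they are obtained).
    Returns a list of (pitch_period_in_samples, frame) pairs.
    """
    frames = []
    # Pair each closure with the next one: frame k spans
    # [closures[k], closures[k + 1]).
    for start, end in zip(closures, closures[1:]):
        if not 0 <= start < end <= len(signal):
            raise ValueError(f"invalid closure pair ({start}, {end})")
        # The frame length is the local pitch period in samples.
        frames.append((end - start, list(signal[start:end])))
    return frames


if __name__ == "__main__":
    # Toy signal with made-up closure positions, for illustration only.
    toy_signal = [float(n) for n in range(10)]
    for period, frame in pitch_synchronous_frames(toy_signal, [0, 4, 10]):
        print(period, frame)
```

Each frame carries its own pitch period, so a later stage could use that value as a clustering feature, in the spirit of the pitch-based clustering the abstract mentions.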