ISCA Archive Interspeech 2006

Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation

Ji Ming, Timothy J. Hazen, James R. Glass

This paper considers the recognition of speech given in the form of two mixed sentences, spoken by the same talker or by two different talkers. The database published on the ICSLP'2006 website for the Two-Talker Speech Separation Challenge is used in the study. A system that recognizes and reconstructs both sentences from the given mixture is described. The system combines several techniques: a missing-feature approach for improving crosstalk/noise robustness, Wiener filtering for speech restoration, HMM-based speech reconstruction, and speaker-dependent/-independent modeling for speaker/speech recognition. For clean speech recognition, the system obtained a word accuracy rate of 96.7%. For the two-talker speech separation challenge task, the system obtained 81.4% at 6 dB TMR (target-to-masker ratio) and 34.1% at -9 dB TMR.


doi: 10.21437/Interspeech.2006-24

Cite as: Ming, J., Hazen, T.J., Glass, J.R. (2006) Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation. Proc. Interspeech 2006, paper 1377-Mon1WeS.6, doi: 10.21437/Interspeech.2006-24

@inproceedings{ming06_interspeech,
  author={Ji Ming and Timothy J. Hazen and James R. Glass},
  title={{Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1377-Mon1WeS.6},
  doi={10.21437/Interspeech.2006-24},
  issn={2958-1796}
}