ISCA Archive Interspeech 2008

Multipitch tracking using a factorial hidden Markov model

Michael Wohlmayr, Franz Pernkopf

In this paper, we present an approach to track the pitch of two simultaneous speakers. We extract features with a well-known correlogram-based method and track them using a factorial hidden Markov model (FHMM). In contrast to the recently developed multipitch determination algorithm [1], which is based on an HMM, we can accurately associate estimated pitch points with their corresponding source speakers. We evaluate our approach on the "Mocha-TIMIT" database [2] of speech utterances mixed at 0 dB, and compare the results to the multipitch determination algorithm [1], which serves as a baseline. Experiments show that our FHMM tracker yields good performance for both pitch estimation and correct speaker assignment.
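As a rough illustration of the tracking idea, the sketch below decodes the most likely joint state sequence of a two-chain FHMM with a Viterbi pass over the product state space. It is not the authors' implementation: the function name `fhmm_viterbi`, the discretisation of each speaker's pitch into K states, the per-chain transition matrices, and the precomputed observation log-likelihoods `log_obs` are all assumptions made for this example.

```python
import numpy as np

def fhmm_viterbi(log_pi1, log_pi2, log_A1, log_A2, log_obs):
    """Viterbi decoding for a two-chain factorial HMM (illustrative sketch).

    log_pi1, log_pi2: (K,) log initial probabilities of each chain.
    log_A1, log_A2:   (K, K) log transitions, log_A[p, q] = log P(q | p).
    log_obs:          (T, K, K) log-likelihood of frame t given the state
                      pair (i, j); assumed to be computed elsewhere.
    Returns an integer array of shape (T, 2) with the decoded state pair
    per frame (column 0 for chain 1, column 1 for chain 2).
    """
    T, K, _ = log_obs.shape
    # The chains evolve independently, so the joint transition score is
    # the sum of the per-chain scores: [i_prev, j_prev, i, j].
    log_joint = log_A1[:, None, :, None] + log_A2[None, :, None, :]
    log_joint = log_joint.reshape(K * K, K * K)

    # Flatten the state pair (i, j) to the single index i * K + j.
    delta = (log_pi1[:, None] + log_pi2[None, :] + log_obs[0]).reshape(K * K)
    back = np.zeros((T, K * K), dtype=int)

    for t in range(1, T):
        scores = delta[:, None] + log_joint          # rows: previous, cols: current
        back[t] = np.argmax(scores, axis=0)          # best predecessor per state
        delta = scores[back[t], np.arange(K * K)] + log_obs[t].reshape(K * K)

    # Backtrack from the best final joint state.
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]

    # Unflatten i * K + j back into (i, j).
    return np.stack(np.divmod(path, K), axis=1)
```

The product-space formulation costs O(T K^4) time, which is only practical for small K; it is used here for clarity rather than efficiency.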

[1] Wu, M., Wang, D. and Brown, G. J., "A Multipitch Tracking Algorithm for Noisy Speech", IEEE Transactions on Speech and Audio Processing, 11(3):229-241, 2003.

[2] Wrench, A., "A multichannel/multispeaker articulatory database for continuous speech recognition research", Phonus, 5:3-17, 2000.