ISCA Archive ISCSLP 2006

A New Approach for Speech/Music Discrimination Based on Cepstral Distance

Mu-Yeol Choi, Seul-Han Park, Hwa Jeon Song, Hyung Soon Kim

Discrimination between speech and music is important in many multimedia applications. In this paper, focusing on the spectral change characteristics of speech and music, we propose a new method based on cepstral distances to improve the performance of speech/music classification. Instead of using cepstral distances between frames at a fixed interval, we employ the minimum of the cepstral distances among close frames. In addition, we exclude short pause blocks when computing cepstral distances, to prevent short pause segments in speech from being misclassified as music because of their small cepstral distances. The experimental results show that the proposed parameter outperforms other parameters. Compared with conventional cepstral distances, taking the minimum of the cepstral distances reduces the error rate by 60%. We also achieve an additional 20% error rate reduction by excluding short pause segments from the audio signal.
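The two ideas in the abstract, taking the minimum cepstral distance among close frames and leaving pauses out of the computation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the distance measure, the range of neighbouring frames, or the pause detector, so the Euclidean distance, the `min_lag`/`max_lag` window, the per-frame `energy_floor` threshold, and all function names below are assumptions made for illustration. In particular, the sketch drops individual low-energy frames, whereas the paper excludes short pause blocks.

```python
import math
from typing import List, Sequence


def cepstral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two cepstral vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def drop_pause_frames(
    frames: Sequence[Sequence[float]],
    energies: Sequence[float],
    energy_floor: float,
) -> List[Sequence[float]]:
    """Keep only frames whose energy exceeds energy_floor.

    A simple energy threshold stands in for the pause detection,
    which the abstract does not describe.
    """
    return [f for f, e in zip(frames, energies) if e > energy_floor]


def min_cepstral_distances(
    frames: Sequence[Sequence[float]],
    min_lag: int = 1,
    max_lag: int = 3,
) -> List[float]:
    """For each frame t, return the smallest distance to any frame
    t + min_lag .. t + max_lag (clipped at the end of the signal).

    Setting min_lag == max_lag gives the fixed-interval distance.
    """
    result = []
    last = len(frames) - 1
    for t in range(len(frames) - min_lag):
        upper = min(t + max_lag, last)
        result.append(
            min(
                cepstral_distance(frames[t], frames[k])
                for k in range(t + min_lag, upper + 1)
            )
        )
    return result
```

A caller would first pass the cepstral frames and their energies through `drop_pause_frames`, then summarise the output of `min_cepstral_distances` (for example by averaging over a block) to obtain a feature for a classifier. The summary statistic and the classifier are likewise not given in the abstract.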