ISCA Archive Interspeech 2011

A robust approach to mining repeated sequence in audio stream

Jiansong Chen, Lei Zhu, Bailan Feng, Peng Ding, Bo Xu

In multimedia streams, repeated sequences such as commercials and jingles often carry significant information, so mining them is an important approach to analyzing multimedia content. This paper reports a robust unsupervised technique for discovering repeated sequences in audio streams. Unlike previous research, our approach casts repeated sequence detection as a Hidden Markov Model (HMM) decoding problem on a similarity trellis. To tolerate the false and missing matches that occur in real applications, we introduce a soft definition of a repeated sequence, termed the maximal loosely repeated sequence (MLRS), as the detection objective, and use a Viterbi-like algorithm to mine all MLRSs in the stream. In addition, we propose a novel metric for evaluating repeated sequence detection algorithms. Experiments on both simulated data and real broadcast data demonstrate the effectiveness of our method.
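
To illustrate what a "loosely" repeated sequence means in practice, the sketch below scans a stream of discrete frame labels for the best-scoring repeat that tolerates occasional mismatches. It is not the HMM decoding or the MLRS definition of the paper, whose details are not given in this abstract. It is a simpler stand-in: a fixed-lag dynamic-programming scan over the self-similarity of the stream, in which matches add to a running score and mismatches subtract from it. The function name `best_loose_repeat`, the integer frame labels, and the `match` and `mismatch` weights are all assumptions made for this example.

```python
def best_loose_repeat(seq, match=1.0, mismatch=-1.5, min_lag=1):
    """Find the best-scoring loosely repeated run in `seq`.

    Illustrative stand-in only (not the paper's HMM/MLRS formulation).
    For every lag, frame i is compared with frame i + lag. A running
    score rises on a match and falls on a mismatch, and the run is
    restarted whenever the score is no longer positive.

    Returns (score, start, lag, length): seq[start:start+length] loosely
    repeats at seq[start+lag:start+lag+length]. Returns (0.0, 0, 0, 0)
    when no frame matches at any lag.
    """
    n = len(seq)
    best = (0.0, 0, 0, 0)
    for lag in range(min_lag, n):
        score = 0.0
        run_start = 0
        for i in range(n - lag):
            if score <= 0.0:
                # Previous run is exhausted; start a new one at frame i.
                score = 0.0
                run_start = i
            score += match if seq[i] == seq[i + lag] else mismatch
            if score > best[0]:
                best = (score, run_start, lag, i - run_start + 1)
    return best


# Hypothetical stream: frames 0..5 recur at frames 9..14, with one
# corrupted frame (99) standing in for a false or missing match.
stream = [1, 2, 3, 4, 5, 6, 20, 21, 22, 1, 2, 99, 4, 5, 6, 30, 31]
print(best_loose_repeat(stream))  # (3.5, 0, 9, 6)
```

With the default weights the single corrupted frame is absorbed and the whole six-frame repeat is reported. With a harsh penalty such as `mismatch=-10`, the same stream yields only the three-frame exact repeat after the corruption, which is the kind of fragmentation a soft definition is meant to avoid. The fixed-lag scan does not handle time warping or overlapping repeats, and it returns one run rather than all of them.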