Sound channel video indexing

Claude Montacié, Marie-José Caraty

This paper presents preliminary results on indexing the audio track of video movies with speaker recognition and speech recognition techniques designed at LIP6. We assume that only one person speaks at any given time. In a first approach, we address unsupervised dialogue indexing using speaker recognition techniques. For this purpose, we develop Silence/Noise/Music/Speech detection algorithms that cut the audio data into segments, which we expect to be homogeneous in terms of speaker identity. In a second approach, we develop a supervised audio indexing method that relies on prior knowledge of the movie script.
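As an illustration of the segmentation step only, the following minimal sketch splits a signal into silence and non-silence runs using short-time energy. It is not the detection algorithm developed in this work: the function names, the frame length, and the threshold are hypothetical choices, and separating noise, music, and speech would require additional features that are not shown here.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def segment_by_energy(samples, frame_len=160, threshold=1e-3):
    """Return (label, start_sample, end_sample) runs of 'silence' or 'sound'.

    frame_len and threshold are illustrative assumptions, not values
    taken from the paper.
    """
    segments = []
    for idx, energy in enumerate(frame_energies(samples, frame_len)):
        label = "sound" if energy > threshold else "silence"
        start = idx * frame_len
        end = start + frame_len
        if segments and segments[-1][0] == label:
            # Extend the current run when the label does not change.
            segments[-1] = (label, segments[-1][1], end)
        else:
            segments.append((label, start, end))
    return segments


# Synthetic 8 kHz signal: 0.1 s silence, 0.1 s of a 440 Hz tone, 0.1 s silence.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]
signal = silence + tone + silence

segments = segment_by_energy(signal)
for label, start, end in segments:
    print(label, start / rate, end / rate)
```

Each resulting "sound" segment would then be a candidate unit for speaker recognition, under the single-speaker assumption stated above.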

doi: 10.21437/Eurospeech.1997-620

