Convolutive non-negative matrix factorization (CNMF) and its sparse version, convolutive non-negative sparse coding (CNSC), have been highly successful in speech processing. A particular limitation of current CNMF/CNSC approaches is that all bases are learned with identical convolution ranges, so the resulting patterns all cover the same time span. This is clearly not ideal, as most sequential signals, speech for example, involve patterns with a multitude of time spans. This paper extends the CNMF/CNSC algorithm and presents a heterogeneous learning approach that can learn bases with non-uniform convolution ranges. The validity of this extension is demonstrated on a simple speech separation task.
Index Terms: non-negative matrix factorization, sparse coding, speech processing