ISCA Archive Interspeech 2014

Joint filtering and factorization for recovering latent structure from noisy speech data

Colin Vaz, Vikram Ramanarayanan, Shrikanth S. Narayanan

We propose a joint filtering and factorization algorithm to recover latent structure from noisy speech. We incorporate the minimum variance distortionless response (MVDR) formulation within the non-negative matrix factorization (NMF) framework to derive a single, unified cost function for both filtering and factorization. Minimizing this cost function jointly optimizes three quantities: a filter that removes noise, a basis matrix that captures latent structure in the data, and an activation matrix that captures how the elements in the basis matrix can be linearly combined to reconstruct the input data. Results show that the proposed algorithm recovers the speech basis matrix from noisy speech significantly better than NMF alone or Wiener filtering followed by NMF. Furthermore, PESQ scores show that our algorithm is a viable choice for speech denoising.
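For readers unfamiliar with the factorization half of the method, the following is a minimal sketch of plain NMF, the "NMF alone" baseline mentioned in the abstract. It is not the authors' joint MVDR-NMF algorithm. It assumes a Euclidean (Frobenius) reconstruction cost and the standard multiplicative update rules; the abstract does not state which cost the paper uses. The names `nmf`, `V`, `W`, `H`, and `relative_error` are illustrative choices, not taken from the paper.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (features x frames) as V ~= W @ H.

    W is the basis matrix and H the activation matrix. Uses multiplicative
    updates for the Frobenius-norm cost (an assumption, see lead-in).
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio, so W and H
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(V, W, H):
    """Frobenius-norm reconstruction error, relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram: 20 bins, 50 frames,
    # built from 3 non-negative basis vectors.
    rng = np.random.default_rng(1)
    V = rng.random((20, 3)) @ rng.random((3, 50))
    W, H = nmf(V, rank=3)
    print("relative reconstruction error:", relative_error(V, W, H))
```

In a speech setting, `V` would typically be a magnitude or power spectrogram, so each column of `W` acts as a spectral template and each row of `H` tracks how strongly that template is active over time. The proposed method additionally estimates a noise-removing filter inside the same cost function, which this sketch does not attempt.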


doi: 10.21437/Interspeech.2014-514

Cite as: Vaz, C., Ramanarayanan, V., Narayanan, S.S. (2014) Joint filtering and factorization for recovering latent structure from noisy speech data. Proc. Interspeech 2014, 2365-2369, doi: 10.21437/Interspeech.2014-514

@inproceedings{vaz14b_interspeech,
  author={Colin Vaz and Vikram Ramanarayanan and Shrikanth S. Narayanan},
  title={{Joint filtering and factorization for recovering latent structure from noisy speech data}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2365--2369},
  doi={10.21437/Interspeech.2014-514},
  issn={2308-457X}
}