ISCA Archive Interspeech 2014

Speaker adaptation based on sparse and low-rank eigenphone matrix estimation

Wen-Lin Zhang, Dan Qu, Wei-Qiang Zhang, Bi-Cheng Li

Eigenphone-based speaker adaptation outperforms the conventional MLLR and eigenvoice methods when the adaptation data is sufficient, but it suffers from severe over-fitting when the adaptation data is limited. In this paper, l1 and nuclear norm regularization are applied simultaneously to obtain a more robust eigenphone estimate, resulting in a sparse and low-rank eigenphone matrix. The sparsity constraint reduces the number of free parameters, while the low-rank constraint limits the dimension of the phone variation subspace; both benefit generalization ability. Experimental results show that the proposed method improves adaptation performance substantially, especially when the amount of adaptation data is limited.
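
As a rough illustration of the two regularizers named in the abstract, the sketch below shows their standard proximal operators in NumPy: entrywise soft-thresholding for the l1 norm (which drives entries to zero) and singular value thresholding for the nuclear norm (which drives singular values to zero and so lowers the rank). This is a generic sketch under stated assumptions, not the authors' estimation procedure: the abstract does not say which optimization algorithm is used, and the function names, the random stand-in matrix, and the threshold values are all hypothetical.

```python
import numpy as np


def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1: shrink each entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Proximal operator of tau * ||X||_*: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Hypothetical stand-in for an eigenphone matrix: a rank-2 signal plus small noise.
rng = np.random.default_rng(0)
V_clean = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
V_noisy = V_clean + 0.01 * rng.standard_normal(V_clean.shape)

V_sparse = soft_threshold(V_noisy, 0.5)             # many entries become exactly zero
V_lowrank = singular_value_threshold(V_noisy, 1.0)  # small singular values are removed
```

In a proximal-gradient style solver, operators like these would typically be applied after each gradient step on the data-fit term; how the two penalties are actually combined and weighted in the paper is not stated in the abstract.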


doi: 10.21437/Interspeech.2014-496

Cite as: Zhang, W.-L., Qu, D., Zhang, W.-Q., Li, B.-C. (2014) Speaker adaptation based on sparse and low-rank eigenphone matrix estimation. Proc. Interspeech 2014, 2972-2976, doi: 10.21437/Interspeech.2014-496

@inproceedings{zhang14g_interspeech,
  author={Wen-Lin Zhang and Dan Qu and Wei-Qiang Zhang and Bi-Cheng Li},
  title={{Speaker adaptation based on sparse and low-rank eigenphone matrix estimation}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2972--2976},
  doi={10.21437/Interspeech.2014-496},
  issn={2308-457X}
}