ISCA Archive ISCSLP 1998

Chinese Audiovisual Bimodal Speech Database CAVSR1.0

Yanjun Xu, Limin Du, Guoqiang Li, Peng Wu, Xin Zhang

To achieve fast speaker adaptation when adaptation data are limited, we propose an approach called maximum likelihood smoothing and prediction. It smooths and predicts target mean vectors from their source mean vectors by maximizing the likelihood that the smoothed model generates the adaptation data. It can therefore make the best use of the first few adaptation data to speed up the adaptation process. It increases the model's prediction accuracy by estimating regression matrices off-line and robustly estimating shift matrices on-line. Moreover, it increases the model's predictive power at the mean-vector level, yielding estimates for poorly adapted and unadapted model parameters even with only a small amount of adaptation data.
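
The idea of predicting target means from source means can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single regression matrix fixed in advance (standing in for the off-line step), a single shift vector shared by all Gaussians (standing in for the on-line step), a known frame-to-Gaussian assignment, and one covariance shared by all Gaussians, so that the maximum likelihood shift reduces to the average residual. All function and variable names are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def estimate_shift(regression, source_means, frames, assignments):
    """ML estimate of a shared shift b in mu_target = A * mu_source + b.

    Assumption: all Gaussians share one covariance, so it cancels and
    the estimate is the mean residual over the adaptation frames.
    """
    dim = len(frames[0])
    total = [0.0] * dim
    for frame, g in zip(frames, assignments):
        predicted = mat_vec(regression, source_means[g])
        for d in range(dim):
            total[d] += frame[d] - predicted[d]
    return [t / len(frames) for t in total]


def predict_target_means(regression, shift, source_means):
    """Apply mu_target = A * mu_source + b to every source mean."""
    return [
        [p + s for p, s in zip(mat_vec(regression, mu), shift)]
        for mu in source_means
    ]


# Toy data: three Gaussians, but adaptation frames for only the first two.
regression = [[1.0, 0.0], [0.0, 1.0]]  # assumed to be estimated off-line
source_means = [[0.0, 0.0], [2.0, 2.0], [5.0, 5.0]]
frames = [[1.0, 0.5], [3.0, 2.5]]
assignments = [0, 1]  # index of the Gaussian each frame is aligned to

shift = estimate_shift(regression, source_means, frames, assignments)
target_means = predict_target_means(regression, shift, source_means)
print(shift)
print(target_means)
```

In this toy setup the third Gaussian receives no adaptation frames, yet its mean is still moved by the shared shift, which is the sense in which unadapted parameters can be predicted from a small amount of data.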