Abstract

Maximum likelihood linear regression (MLLR) transforms have proven useful for text-independent speaker recognition systems. These systems use the parameters of MLLR transforms as features for SVM modeling and classification. In this paper, we focus on calculating affine transforms based on a GMM universal background model (UBM). Rather than estimating the transforms with the maximum likelihood criterion, we propose to use maximum a posteriori linear regression (MAPLR) for feature extraction. We extend this work with a multi-class technique, which clusters the Gaussian mixtures into regression classes and estimates a separate transform for each class. The transforms of all classes are concatenated into a supervector for SVM classification. In addition, a further accuracy boost is obtained by combining the supervectors derived from the female and male UBMs into a larger supervector. Experiments on a NIST 2008 SRE corpus show that the MAPLR system outperforms MLLR and that the multi-class approaches also bring significant gains.
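
The supervector construction described above can be illustrated with a minimal sketch. It assumes each regression class yields an affine transform given as a matrix A and a bias vector b, and that a transform is flattened row by row with the bias entry appended to each row. That layout, and the function names below, are illustrative assumptions rather than details taken from the paper. Estimating the transforms themselves (MLLR or MAPLR) is not shown.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Transform = Tuple[Matrix, Sequence[float]]  # (A, b) for one regression class


def flatten_transform(A: Matrix, b: Sequence[float]) -> List[float]:
    """Flatten one affine transform [A | b] row by row (assumed layout)."""
    flat: List[float] = []
    for row, bias in zip(A, b):
        flat.extend(row)   # the d entries of this row of A
        flat.append(bias)  # followed by the matching bias entry
    return flat


def build_supervector(transforms: Sequence[Transform]) -> List[float]:
    """Concatenate the flattened transforms of all regression classes."""
    supervector: List[float] = []
    for A, b in transforms:
        supervector.extend(flatten_transform(A, b))
    return supervector


def combine_gender_supervectors(female_sv: Sequence[float],
                                male_sv: Sequence[float]) -> List[float]:
    """Join the supervectors obtained with the female and male UBMs."""
    return list(female_sv) + list(male_sv)
```

With d-dimensional features and C regression classes, each supervector built this way has C * d * (d + 1) entries, and the combined female and male supervector has twice that many. The result would then be passed to the SVM as a single feature vector per utterance.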