A unit selection approach for voice transformation

Lee Ki Seung<sup>*</sup>

doi:10.1016/j.specom.2014.02.002

Abstract

A voice transformation (VT) method that makes the utterance of a source speaker mimic that of a target speaker is described. Speaker individuality is transformed by altering four feature parameters: the linear prediction coefficients cepstrum (LPCC), ΔLPCC, the LP residual, and the pitch period. The main objective of this study is to construct an optimal sequence of features selected from a target speaker's database, so as to maximize both the correlation probabilities between the transformed and the source features and the likelihood of the transformed features with respect to the target model. A set of two-pass conversion rules is proposed: in the first pass the feature parameters are selected from a database, and in the second pass the optimal sequence of those feature parameters is constructed. The conversion rules were developed using a statistical approach that employed a maximum likelihood criterion. In constructing an optimal sequence of the features, a hidden Markov model (HMM) with global control variables (GCV) was employed to find the most likely combination of the features with respect to the target speaker's model.

The effectiveness of the proposed transformation method was evaluated using objective tests and formal listening tests. We confirmed that the proposed method yields perceptually preferred results compared with conventional methods.
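The two-pass idea in the abstract (select candidate units first, then build the best sequence from them) can be illustrated with a minimal, generic unit-selection sketch. This is not the paper's method: the abstract describes maximum-likelihood rules and an HMM with global control variables, whereas the sketch below substitutes plain squared Euclidean distances for both the source-to-target match cost and the frame-to-frame join cost. The function names `select_candidates` and `best_sequence`, the `join_weight` parameter, and the toy one-dimensional features are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_candidates(source_frames, target_db, k):
    """Pass 1 (assumed form): keep the k nearest target units per source frame."""
    candidates = []
    for frame in source_frames:
        ranked = sorted(range(len(target_db)),
                        key=lambda i: sq_dist(frame, target_db[i]))
        candidates.append(ranked[:k])
    return candidates


def best_sequence(source_frames, target_db, candidates, join_weight=1.0):
    """Pass 2 (assumed form): Viterbi-style search over the candidate lattice.

    Minimizes match cost (source frame vs. unit) plus join_weight times the
    join cost (distance between consecutive selected units).
    """
    prev_cost = [sq_dist(source_frames[0], target_db[c]) for c in candidates[0]]
    back = []  # back[t-1][j]: best predecessor index for candidate j at frame t
    for t in range(1, len(source_frames)):
        cur_cost, cur_back = [], []
        for c in candidates[t]:
            match = sq_dist(source_frames[t], target_db[c])
            # Cumulative cost of reaching unit c from each predecessor.
            totals = [prev_cost[j] + join_weight * sq_dist(target_db[p], target_db[c])
                      for j, p in enumerate(candidates[t - 1])]
            j_best = min(range(len(totals)), key=totals.__getitem__)
            cur_cost.append(totals[j_best] + match)
            cur_back.append(j_best)
        back.append(cur_back)
        prev_cost = cur_cost

    # Backtrack from the cheapest final candidate.
    j = min(range(len(prev_cost)), key=prev_cost.__getitem__)
    path = [j]
    for cur_back in reversed(back):
        j = cur_back[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j] for t, j in enumerate(path)]


# Toy data: three source frames and a four-unit target database.
source = [[0.0], [1.0], [2.0]]
database = [[0.1], [0.9], [2.2], [5.0]]
cands = select_candidates(source, database, k=2)
closest = best_sequence(source, database, cands, join_weight=0.0)
smoothest = best_sequence(source, database, cands, join_weight=100.0)
```

With `join_weight=0.0` the search simply picks the nearest unit for every frame; a large `join_weight` favors a smoother sequence of target units even when individual matches are worse, which is the trade-off a sequence-level criterion is meant to capture.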

Publication date: 2014-5

