An Exemplar-Based Approach to Frequency Warping for Voice Conversion

Authors: Tian Xiaohai*; Lee Siu Wa; Wu Zhizheng; Chng Eng Siong; Li Haizhou
Source: IEEE/ACM Transactions on Audio Speech and Language Processing, 2017, 25(10): 1863-1876.
DOI: 10.1109/TASLP.2017.2723721

Abstract

The task of voice conversion is to modify a source speaker's voice so that it sounds like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework that combines frequency warping with an exemplar-based method. Our method maintains high-resolution details during conversion by applying frequency warping directly to the high-resolution spectrum to represent the target. The warping function is generated by sparse interpolation from a dictionary of exemplar warping functions. Because the generated warping function depends on only a very small set of exemplars, we avoid the statistical averaging effects inherent in Gaussian mixture models. To compensate for the conversion error, we also incorporate residual exemplars into the conversion process. Both objective and subjective evaluations on the VOICES database validated the effectiveness of the proposed voice conversion framework. We observed a significant improvement in speech quality over state-of-the-art parametric methods.
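
The core idea in the abstract, building one warping function as a sparse combination of exemplar warping functions and applying it to a high-resolution spectrum, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy power-law warps, the use of multiplicative non-negative updates with an L1 penalty to obtain sparse weights, and linear interpolation to resample the spectrum. The residual-exemplar compensation step is not covered.

```python
import numpy as np

def sparse_activations(x, A, lam=0.1, n_iter=200, eps=1e-12):
    """Non-negative weights h with A @ h close to x (hypothetical helper).

    Uses multiplicative updates for a Euclidean cost plus an L1 penalty.
    A is (n_bins, n_exemplars); A and x are assumed non-negative.
    """
    h = np.ones(A.shape[1])
    Atx = A.T @ x
    AtA = A.T @ A
    for _ in range(n_iter):
        h *= Atx / (AtA @ h + lam + eps)
    return h

def interpolate_warp(W, h, eps=1e-12):
    """Combine exemplar warping functions (columns of W) with weights h.

    Normalising h to sum to one keeps the result a convex combination,
    so it stays monotone if every exemplar warp is monotone.
    """
    return W @ (h / (h.sum() + eps))

def apply_warp(spectrum, warp):
    """Resample the spectrum so that output bin k takes the value at warp[k]."""
    bins = np.arange(len(spectrum), dtype=float)
    return np.interp(warp, bins, spectrum)

# Toy data (all assumed): three source exemplars and three power-law warps.
n_bins = 64
rng = np.random.default_rng(0)
A = rng.random((n_bins, 3)) + 0.1           # source spectral exemplars
bins = np.arange(n_bins, dtype=float)
alphas = np.array([0.8, 1.0, 1.25])         # one warp shape per exemplar
W = (n_bins - 1) * (bins[:, None] / (n_bins - 1)) ** alphas[None, :]

x = 0.7 * A[:, 1] + 0.3 * A[:, 2]           # a source spectrum to convert
h = sparse_activations(x, A)
warp = interpolate_warp(W, h)
converted = apply_warp(x, warp)
```

Because the warp is applied to the full-resolution spectrum rather than to a smoothed parametric representation, spectral detail in `x` is moved along the frequency axis rather than averaged away, which is the property the abstract emphasises.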

  • Publication date: 2017-10
  • Affiliation: Nanyang Institute of Technology