A two-stage prosodic structure generation strategy for Mandarin text-to-speech systems

Dong Yuan<sup>*</sup>; Zhou Tao; Dong Cheng Yu; Wang Hai La

doi:10.3724/SP.J.1004.2010.01569

Abstract

Prosodic structure generation is a key component in improving the intelligibility and naturalness of synthetic speech in a text-to-speech (TTS) system. This paper investigates the automatic segmentation of prosodic words and prosodic phrases, the two fundamental layers in the hierarchical prosodic structure of Mandarin, and presents a two-stage prosodic structure generation strategy. At the front end, conditional random field (CRF) models with different feature selections are built for prosodic word prediction and for prosodic phrase prediction. At the back end, a transformation-based error-driven learning (TBL) modification module amends the initial prediction. Experimental results show that the approach combining CRF and TBL achieves an F-score of 94.66%.
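The two-stage idea (an initial boundary prediction followed by rule-based correction) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the front end is a trivial placeholder rather than a CRF, the two rules are handwritten rather than learned by TBL, and the tags, labels, and rule format are hypothetical and not taken from the paper.

```python
from typing import List, Optional, Tuple

# Labels: "B" = a prosodic boundary follows this word, "N" = no boundary.
# Hypothetical rule shape: (tag, next_tag or None for "any", from_label, to_label).
Rule = Tuple[str, Optional[str], str, str]


def initial_predict(pos_tags: List[str]) -> List[str]:
    """Placeholder for the front-end model (NOT a CRF): boundary after every noun."""
    return ["B" if tag == "n" else "N" for tag in pos_tags]


def apply_rules(pos_tags: List[str], labels: List[str], rules: List[Rule]) -> List[str]:
    """Back-end correction: apply ordered transformation rules to the initial labels."""
    out = list(labels)
    for tag, next_tag, src, dst in rules:
        for i, current in enumerate(pos_tags):
            following = pos_tags[i + 1] if i + 1 < len(pos_tags) else None
            context_ok = next_tag is None or next_tag == following
            if current == tag and context_ok and out[i] == src:
                out[i] = dst
    return out


def f_score(pred: List[str], gold: List[str], positive: str = "B") -> float:
    """Harmonic mean of precision and recall on the boundary label."""
    tp = sum(p == positive and g == positive for p, g in zip(pred, gold))
    if tp == 0:
        return 0.0
    precision = tp / sum(p == positive for p in pred)
    recall = tp / sum(g == positive for g in gold)
    return 2 * precision * recall / (precision + recall)


# Toy input: POS tags for five words and invented reference boundaries.
pos_tags = ["n", "v", "n", "u", "n"]
gold = ["B", "N", "N", "B", "B"]

# Handwritten rules (assumed, not learned): a particle "u" attaches to the
# preceding noun, so the boundary moves from the noun to the particle.
rules: List[Rule] = [("n", "u", "B", "N"), ("u", None, "N", "B")]

initial = initial_predict(pos_tags)
corrected = apply_rules(pos_tags, initial, rules)
print(f_score(initial, gold), f_score(corrected, gold))
```

In an actual TBL setup the rules would be selected greedily from templates according to how many errors each one removes on training data; the sketch only shows how an ordered rule list rewrites the first-stage output and how the F-score is computed on boundary labels.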

Publication date: 2010
Institution: Peking University
