Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields

Authors: Suzuki Masayuki*; Kuroiwa Ryo; Innami Keisuke; Kobayashi Shumpei; Shimizu Shinya; Minematsu Nobuaki; Hirose Keikichi
Source: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D(4): 655-661.
DOI: 10.1587/transinf.2016AWI0004

Abstract

When synthesizing speech from Japanese text, correct assignment of accent nuclei to input text with arbitrary content is indispensable for obtaining natural-sounding synthetic speech. Japanese utterances exhibit a phenomenon called accent sandhi: when a word is uttered in a sentence, its accent nucleus may change depending on the preceding and succeeding words. This paper describes a statistical method for automatically predicting the accent nucleus changes caused by accent sandhi. First, as the basis of the research, a database of Japanese text was constructed, labeled with the accent phrase boundaries and accent nucleus positions observed when the text is uttered in sentences. A single native speaker of Tokyo dialect Japanese annotated all the labels for 6,344 Japanese sentences. Then, using this database, a conditional-random-field-based method was developed to predict accent phrase boundaries and accent nuclei. The proposed method predicted accent nucleus positions for accent phrases with 94.66% accuracy, clearly surpassing the 87.48% accuracy obtained using our rule-based method. A listening experiment was also conducted comparing synthetic speech obtained using the proposed method with that obtained using the rule-based method. The results show that our method significantly improved the naturalness of synthetic speech.
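
As a rough illustration of how a linear-chain conditional random field assigns one label per word at prediction time, the sketch below runs Viterbi decoding over hand-set scores. It is not the authors' implementation: the abstract does not specify the label scheme or features, so the two labels ("keep" and "delete" the dictionary accent nucleus), the function name `viterbi`, and all score values are assumptions chosen only for the example. In a trained model the scores would come from learned feature weights.

```python
from typing import Dict, List, Tuple

# Hypothetical label set (an assumption, not taken from the paper):
# each word either keeps or loses its dictionary accent nucleus.
LABELS = ["keep", "delete"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring label sequence of a linear-chain model.

    emissions[t][y] is the score of label y at word t;
    transitions[(p, y)] is the score of label p followed by label y.
    Missing entries are treated as 0.0.
    """
    if not emissions:
        return []
    # best[t][y]: best score of any label path ending in y at position t
    best = [{y: emissions[0].get(y, 0.0) for y in LABELS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = (best[-1][prev]
                         + transitions.get((prev, y), 0.0)
                         + emissions[t].get(y, 0.0))
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy scores for a three-word accent phrase (illustrative values only).
toy_emissions = [
    {"keep": 2.0, "delete": 0.0},
    {"keep": 0.5, "delete": 1.0},
    {"keep": 0.0, "delete": 1.5},
]
toy_transitions = {("keep", "keep"): -1.0, ("keep", "delete"): 0.5}

print(viterbi(toy_emissions, toy_transitions))
```

The transition scores are what let the label of one word depend on its neighbor, which is the property that makes a sequence model a natural fit for a context-dependent phenomenon such as accent sandhi.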

  • Publication date: 2017-4
  • Affiliation: IBM