A Revised Method to Detect Erroneous Characters Wrongly Substituted, Deleted, and Inserted at the End Position in Japanese Sentences and 'Bunsetsu's

Araki Chikahiro<sup>*</sup>; Mori Mikio; Taniguchi Shuji

doi:10.1002/tee.20640

Abstract

A method that uses an mth-order Markov chain model to detect characters wrongly substituted, deleted, or inserted at interior positions of Japanese sentences and 'bunsetsu's was proposed earlier and found to be useful for detecting such errors. With that method, however, erroneous characters at the end position of sentences and 'bunsetsu's are difficult to detect, because the Markov chain probabilities for an erroneous character at the end position do not stay below the critical value T the same number of times as they do for an interior error. This paper proposes a method to detect erroneous characters located at the end position of sentences and 'bunsetsu's using the 'skipped Markov chain model' in addition to the 'connected Markov chain model'. Experiments with newspaper articles show that the proposed method is useful for correcting erroneous characters located at the end position of sentences and 'bunsetsu's.
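
The difficulty at the end position can be illustrated with a minimal sketch, assuming a character-level mth-order chain estimated from raw counts. The function names, the toy corpus, and the threshold value are hypothetical and not taken from the paper, and the 'skipped Markov chain model' is not reproduced here because the abstract does not define it. The sketch only shows that an interior character falls inside m+1 probability windows, while the final character falls inside a single one.

```python
from collections import Counter

def train(corpus, m):
    """Count (m+1)-character sequences and the m-character contexts that start them."""
    grams, contexts = Counter(), Counter()
    for sentence in corpus:
        for i in range(len(sentence) - m):
            grams[sentence[i:i + m + 1]] += 1
            contexts[sentence[i:i + m]] += 1
    return grams, contexts

def chain_probs(text, grams, contexts, m):
    """Estimate P(text[i+m] | text[i:i+m]) for every window start i."""
    probs = []
    for i in range(len(text) - m):
        ctx = contexts[text[i:i + m]]
        # An unseen context is treated as probability 0 (an assumption of this sketch).
        probs.append(grams[text[i:i + m + 1]] / ctx if ctx else 0.0)
    return probs

def low_windows(text, grams, contexts, m, T):
    """Return the window start indices whose probability is below the critical value T."""
    probs = chain_probs(text, grams, contexts, m)
    return [i for i, p in enumerate(probs) if p < T]

def windows_covering(pos, length, m):
    """Number of (m+1)-character windows that contain the character at index pos."""
    first = max(0, pos - m)
    last = min(pos, length - m - 1)
    return max(0, last - first + 1)

# Toy data: Latin letters stand in for Japanese characters.
m, T = 2, 0.5
grams, contexts = train(["abcdef"] * 3, m)
interior = low_windows("abXdef", grams, contexts, m, T)  # substitution at index 2
final = low_windows("abcdeX", grams, contexts, m, T)     # substitution at the last index
```

In this toy setting the interior substitution pulls three windows (m+1) below T, while the substitution at the end pulls only one below T, so a detector that waits for a fixed number of low probabilities would miss the final character.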

Publication date: 2011-3
