Abstract

To improve the accuracy of error correction for Chinese placenames, a cleansing framework based on an address-name index table and a bi-directional placename error correction method are proposed. First, existing error correction algorithms in the data cleansing field are reviewed. Second, the cleansing framework based on the address-name index table is designed around the features of Chinese placenames and the low accuracy of existing algorithms. Within this framework, the structure of the address-name index table and the related concepts are defined, and the new bi-directional placename error correction method uses the table to improve correction accuracy. The method introduces the concepts of partial similarity and whole similarity: Chinese placenames are first matched and corrected in the forward direction, then looked up in the address-name index table and corrected in the reverse direction. Finally, extensive simulation experiments are conducted to demonstrate the feasibility and soundness of the method; the results show that the bi-directional placename error correction method outperforms the compared methods in precision ratio and recall ratio for error correction of Chinese placenames.

  • Publication date: 2013
