Abstract

Synonyms are a crucial resource for many semantic applications, and synonym extraction has been studied extensively. However, extraction accuracy still falls short of practical demands, and manually refining extraction results is time-consuming. This article focuses on refining synonym extraction results by cleaning and ranking. A new graph model, the synonym graph, is proposed to transform the synonym extraction result of each word into a directed graph. Two approaches for refining synonym extraction results are then proposed based on the synonym graph. The first approach divides each extraction result into two parts, synonyms and noise, and detects noise by analysing the connectivity of the synonym graph. The second approach ranks the words in each extraction result by computing their semantic distance in the synonym graph, and was found to be more flexible than the first. Experimental results indicate that both proposed approaches are effective. In particular, they perform well on datasets containing large synonym extraction results, which is important for reducing the cost of refining.
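
To make the two refinement ideas concrete, the following is a minimal illustrative sketch and not the method evaluated in this article. It assumes that an edge u -> v means "v appears in the extraction result of u", that a candidate counts as noise when it has no directed path back to the target word, and that plain hop count stands in for semantic distance. The function names and the toy data are hypothetical.

```python
from collections import deque


def build_synonym_graph(target, extraction):
    """Directed graph over the target word and its extracted candidates.

    `extraction` maps a word to the list of candidates extracted for it.
    Assumption: an edge u -> v means v appears in the extraction result of u.
    """
    nodes = {target, *extraction.get(target, [])}
    return {
        u: [v for v in extraction.get(u, []) if v in nodes and v != u]
        for u in nodes
    }


def hop_distances(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def refine(target, extraction):
    """Return (synonyms, noise, ranked) for the extraction result of `target`."""
    graph = build_synonym_graph(target, extraction)
    candidates = extraction.get(target, [])

    # Reverse every edge so that one search from the target yields the
    # distance from each candidate back to the target in the original graph.
    reverse = {u: [] for u in graph}
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)
    back = hop_distances(reverse, target)

    # Cleaning (assumed criterion): no path back to the target means noise.
    synonyms = [w for w in candidates if w in back]
    noise = [w for w in candidates if w not in back]

    # Ranking (assumed criterion): fewer hops back to the target ranks higher.
    ranked = sorted(candidates, key=lambda w: (back.get(w, float("inf")), w))
    return synonyms, noise, ranked


# Hypothetical toy extraction results.
toy_extraction = {
    "big": ["large", "huge", "red"],
    "large": ["big", "huge"],
    "huge": ["large"],
    "red": ["crimson"],
}
synonyms, noise, ranked = refine("big", toy_extraction)
```

In this toy example "red" never links back to "big" and is therefore separated as noise, while the ranking keeps every candidate and only orders them, which illustrates why a ranking can be the more flexible of the two outputs.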