摘要

As a valuable tool for text understanding, semantic similarity measurement enables discriminative semantic-based applications in the fields of natural language processing, information retrieval, computational linguistics and artificial intelligence. Most of the existing studies have used structured taxonomies such as WordNet to explore the lexical semantic relationship, however, the improvement of computation accuracy is still a challenge for them. To address this problem, in this paper, we propose a hybrid WordNet-based approach CSSM-ICSP to measuring concept semantic similarity, which leverage the information content(IC) of concepts to weight the shortest path distance between concepts. To improve the performance of IC computation, we also develop a novel model of the intrinsic IC of concepts, where a variety of semantic properties involved in the structure of WordNet are taken into consideration. In addition, we summarize and classify the technical characteristics of previous WordNet-based approaches, as well as evaluate our approach against these approaches on various benchmarks. The experimental results of the proposed approaches are more correlated with human judgment of similarity in term of the correlation coefficient, which indicates that our IC model and similarity detection approach are comparable or even better for semantic similarity measurement as compared to others.