摘要

This paper proposes a system, named Del-Explorer, which analyzes the type of given Chinese terms, extracts term definitions from the Web, and selects answers from noisy Web pages. DefExplorer tillers out invalid data with a semantic approach. Two types of candidate sets, common and domain specific, are employed to cluster similar candidates into groups. Different approaches are also deployed to evaluate candidates' importance which is the key factor for selecting the best answers from retrieved candidates. Experimental results show that DefExplorer can effectively extract term definitions from the Web, especially for the definitions of out-of-vocabulary terms.

  • 出版日期2010-3