Abstract

This paper proposes a hierarchical word domain assignment algorithm that automatically builds domain dictionaries from a Machine-Readable Dictionary (MRD). Word domain assignment proceeds in three steps: 1) constructing the hierarchical structure; 2) training the classifier; 3) assigning word domains. Compared with traditional methods, the hierarchical word domain assignment algorithm improves the accuracy of word domain assignment while reducing the human effort needed to collect corpora. Experiments on WordNet 2.0 show that 62.53% of the first domain labels match those of WordNet Domains 3.0 when gloss-based word domain assignment is used, and that performance can be further improved by exploiting the hierarchical relationships among the domain sets.
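
As a rough illustration of the three steps named above, the following Python sketch assigns a domain to a word from its gloss by descending a small domain hierarchy. It is a minimal sketch under stated assumptions, not the algorithm evaluated in this paper: the hierarchy, the training glosses, the smoothed bag-of-words scoring, and the names `HIERARCHY`, `TRAINING_GLOSSES`, `train`, and `assign_domain` are all hypothetical and chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict
import math

# Step 1 (assumed): a toy two-level domain hierarchy, not the paper's actual one.
HIERARCHY = {
    "ROOT": ["science", "sport"],
    "science": ["medicine", "physics"],
    "sport": ["football", "tennis"],
}

# Hypothetical labelled glosses: (leaf domain, gloss text).
TRAINING_GLOSSES = [
    ("medicine", "a drug used to treat disease in a patient"),
    ("physics", "a particle carrying energy and electric charge"),
    ("football", "a player who kicks the ball toward the goal"),
    ("tennis", "a stroke that returns the ball over the net with a racket"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def train(glosses):
    """Step 2 (assumed): collect word counts per node.

    Each inner node pools the counts of all its descendants, so it can be
    scored against its siblings in the same way as a leaf.
    """
    counts = defaultdict(Counter)
    for leaf, gloss in glosses:
        counts[leaf].update(tokenize(gloss))

    def fill(node):
        for child in HIERARCHY.get(node, []):
            fill(child)
            counts[node].update(counts[child])

    fill("ROOT")
    return counts


def score(tokens, counter, vocab_size):
    """Log-likelihood of the tokens under add-one-smoothed word frequencies."""
    total = sum(counter.values())
    return sum(math.log((counter[t] + 1) / (total + vocab_size)) for t in tokens)


def assign_domain(gloss, counts):
    """Step 3 (assumed): descend from the root, taking the best child per level."""
    vocab_size = len(counts["ROOT"])
    tokens = tokenize(gloss)
    node, path = "ROOT", []
    while node in HIERARCHY:
        node = max(HIERARCHY[node],
                   key=lambda child: score(tokens, counts[child], vocab_size))
        path.append(node)
    return path


counts = train(TRAINING_GLOSSES)
print(assign_domain("the player hit the ball with a racket", counts))
```

Deciding among a few sibling domains at each level, rather than among all leaf domains at once, is one simple way a hierarchy over the domain set can be exploited; the paper's own classifier and hierarchy may differ from this sketch.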