A collocation-based WSD model: RFR-SUM

Authors: Qu Weiguang*; Sui Zhifang; Ji Genlin; Yu Shiwen; Zhou Junsheng
Source: 20th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, 2007-06-26 to 2007-06-29.

Abstract

This paper presents the Relative Frequency Ratio (RFR) as a measure of collocation strength. Based on RFR, a word sense disambiguation (WSD) model, RFR-SUM, is proposed to disambiguate the senses of polysemous Chinese words. Using 9 frequently used polysemous words as test cases, the model achieves an average precision of 92.50% in an open test. Compared with the Naive Bayesian model and the Maximum Entropy model, the precision of RFR-SUM is 5.95% and 4.48% higher, respectively. The paper also prunes the RFR lists: keeping only the most important 5% of the collocation information retains almost the same precision while making disambiguation 20 times faster.
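
The abstract does not give the RFR formula or the scoring rule, so the following Python sketch is only an illustration built on assumptions. It assumes that the RFR of a context word for a sense is the word's relative frequency in that sense's training contexts divided by its relative frequency across all contexts, that the model sums the RFR values of the words in a test context and picks the highest-scoring sense, and that pruning keeps the top-scoring fraction of each list. The function names (build_rfr_lists, prune, disambiguate) and the toy English data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def build_rfr_lists(tagged_contexts):
    """Build one RFR list per sense from (sense, context_words) pairs.

    Assumed definition (not confirmed by the abstract):
    RFR(w, s) = (count of w in contexts of s / size of contexts of s)
                / (count of w in all contexts / size of all contexts)
    """
    sense_counts = defaultdict(Counter)
    total_counts = Counter()
    for sense, words in tagged_contexts:
        sense_counts[sense].update(words)
        total_counts.update(words)
    total_size = sum(total_counts.values())

    rfr = {}
    for sense, counts in sense_counts.items():
        sense_size = sum(counts.values())
        rfr[sense] = {
            w: (c / sense_size) / (total_counts[w] / total_size)
            for w, c in counts.items()
        }
    return rfr


def prune(rfr, keep_fraction):
    """Keep only the highest-RFR fraction of each sense's list (assumed criterion)."""
    pruned = {}
    for sense, scores in rfr.items():
        k = max(1, int(len(scores) * keep_fraction))
        top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
        pruned[sense] = dict(top)
    return pruned


def disambiguate(rfr, context_words):
    """Pick the sense whose summed RFR over the context words is largest."""
    totals = {
        sense: sum(scores.get(w, 0.0) for w in context_words)
        for sense, scores in rfr.items()
    }
    return max(totals, key=totals.get)


# Toy training data for an ambiguous word such as "bank" (illustrative only).
training = [
    ("finance", ["money", "loan", "account"]),
    ("finance", ["money", "deposit"]),
    ("river", ["water", "shore"]),
    ("river", ["water", "fish", "shore"]),
]
rfr_lists = build_rfr_lists(training)
print(disambiguate(rfr_lists, ["money", "deposit"]))  # finance
```

Under these assumptions, pruning simply shortens each per-sense dictionary, so fewer lookups contribute to each sum; this is one plausible reading of why a pruned list would be faster while the strongest collocations still decide the result.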