Automatic Wordnet Development for Low-Resource Languages using Cross-Lingual WSD

Authors: Taghizadeh Nasrin*; Faili Hesham
Source: Journal of Artificial Intelligence Research, 2016, 56: 61-87.
DOI: 10.1613/jair.4968

Abstract

Wordnet is an effective resource in natural language processing and information retrieval, especially for semantic processing and meaning-related tasks. Wordnets have so far been constructed for many languages. However, automatic wordnet development for low-resource languages has not been studied well. In this paper, an Expectation-Maximization algorithm is used to train a high-quality, large-scale wordnet for resource-poor languages. The proposed method benefits from cross-lingual word sense disambiguation and develops a wordnet using only a bilingual dictionary and a monolingual corpus. The method has been applied to Persian as a resource-poor language, and the resulting wordnet has been evaluated through several experiments. Results show that the induced wordnet has a precision of 90% and a recall of 35%.

  • Publication date: 2016