Building a Language-Independent Discourse Parser using Universal Networking Language

Navaneethakrishnan Subalalitha Chinnaudayar*; Parthasarathi Ranjani

doi:10.1111/coin.12037

Abstract

Discourse parsing has become an essential task in natural language processing. Parsing complex discourse structures beyond the sentence level remains a significant challenge. This article proposes a discourse parser that constructs rhetorical structure (RS) trees to identify such structures. Unlike previous parsers, which construct RS trees using lexical features, syntactic features, and cue phrases, the proposed parser uses high-level semantic features inherited from the Universal Networking Language (UNL). Because the UNL represents texts in a language-independent manner, it also makes the parser language independent. The parser uses a naive Bayes probabilistic classifier to label discourse relations. It has been tested on 500 Tamil-language documents and on the Rhetorical Structure Theory Discourse Treebank, which comprises 21 English-language documents. The naive Bayes classifier has been compared with the support vector machine (SVM) classifier used in earlier approaches to building a discourse parser. The results indicate that the naive Bayes classifier is better suited than the SVM classifier to discourse relation labeling in terms of training time, testing time, and accuracy.
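As a rough illustration of the relation-labeling step, the sketch below trains a categorical naive Bayes classifier with Laplace smoothing and uses it to pick a discourse relation label for a pair of text spans. It is not the authors' implementation: the feature names (`unl_relation`, `tense`), the relation labels, and the toy training examples are assumptions made for this example only, and the abstract does not specify the actual feature set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesRelationLabeler:
    """Categorical naive Bayes with Laplace smoothing (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # smoothing constant
        self.label_counts = Counter()               # label -> number of examples
        self.feature_counts = defaultdict(Counter)  # label -> (name, value) -> count
        self.feature_values = defaultdict(set)      # name -> values seen in training

    def fit(self, examples):
        """examples: iterable of (feature_dict, label) pairs."""
        for features, label in examples:
            self.label_counts[label] += 1
            for name, value in features.items():
                self.feature_counts[label][(name, value)] += 1
                self.feature_values[name].add(value)
        return self

    def log_scores(self, features):
        """Return log P(label) + sum of log P(feature value | label) per label."""
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            for name, value in features.items():
                count = self.feature_counts[label][(name, value)]
                n_values = len(self.feature_values[name]) or 1
                score += math.log((count + self.alpha) / (n + self.alpha * n_values))
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical toy data: each example pairs semantic features of a span pair
# with a discourse relation label. These are NOT taken from the paper.
TOY_EXAMPLES = [
    ({"unl_relation": "rsn", "tense": "past"}, "cause"),
    ({"unl_relation": "rsn", "tense": "past"}, "cause"),
    ({"unl_relation": "con", "tense": "present"}, "condition"),
    ({"unl_relation": "con", "tense": "present"}, "condition"),
    ({"unl_relation": "and", "tense": "past"}, "joint"),
]

labeler = NaiveBayesRelationLabeler().fit(TOY_EXAMPLES)
print(labeler.predict({"unl_relation": "rsn", "tense": "past"}))
```

Training here is a single counting pass over the examples and prediction is a sum of log-probabilities per label, which is one reason naive Bayes is typically cheap to train and test compared with an SVM.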

Publication date: 2015-11
