A latent discriminative variable model for automatic identification of Chinese base phrases

Sun Xiao<sup>*</sup>; Nan Xiaoli

Abstract

In natural language processing applications such as information processing and machine translation, recognizing simple, non-recursive Chinese base phrases is an important task. Instead of a rule-based model, we adopt a statistical machine learning method, the newly proposed Latent semi-CRF model, to solve the Chinese base phrase chunking problem. Chinese base phrase chunking can be treated as a sequence labeling problem, which involves predicting a class label for each frame in an unsegmented sequence. Chinese base phrases have sub-structures that cannot be observed in the training data. Latent semi-CRF incorporates the advantages of Latent Dynamic Conditional Random Fields (LDCRF) and semi-CRF, which model the sub-structure of a class sequence and learn the dynamics between class labels, and we apply it to detecting Chinese base phrases. Our results demonstrate that the latent dynamic discriminative model compares favorably to Support Vector Machines, the Maximum Entropy Model, and Conditional Random Fields (including LDCRF and semi-CRF) on Chinese base phrase chunking.

Publication date: 2010
Affiliation: Dalian Minzu University (大连民族大学)


Last updated: 2018-08-06 16:11
