Multi-page Chinese expert metadata extraction model based on the fuzzy clustering

Pan Xiao; Shen Tao<sup>*</sup>; Yu Zhengtao; Mao Cunli; Yang Xiuzhen

doi:10.12733/jcis8644

Abstract

This paper proposes a model, based on fuzzy clustering, for extracting a Chinese expert's name, native place, organization, job title and research interests from multiple pages, exploiting the relationships among an expert's pages. First, word, part-of-speech and expert-page features are selected, and a Conditional Random Fields model extracts the five categories of expert metadata from each single page returned by retrieval. Then, multi-page relationship features are selected, and a Maximum Entropy model is used to build a page classifier that identifies the group of pages related to an expert. Finally, fuzzy clustering, guided by the related page group, is used to extract more accurate expert metadata across the multiple pages. Experiments on extracting the five metadata categories in the natural language processing and machine learning domains show that the fuzzy-clustering-based model extracts expert metadata more effectively: average accuracy over the five categories is 10% higher than with the single-page extraction method.
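The abstract does not say which fuzzy clustering algorithm is used or how candidate values are represented. As a rough illustration only, the sketch below implements standard fuzzy c-means, one common fuzzy clustering method, on plain numeric vectors. The function name `fuzzy_c_means`, its parameters, and the toy vectors are assumptions made for this example and are not taken from the paper.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric vectors.

    Returns (centers, memberships). memberships[i][k] is the degree to
    which point i belongs to cluster k; each row sums to 1.
    This is a generic sketch, not the paper's extraction model.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centres are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Memberships are recomputed from relative distances to the centres.
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_u.append([
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ])

        # Stop once memberships have effectively stopped changing.
        shift = max(abs(new_u[i][k] - u[i][k])
                    for i in range(n) for k in range(n_clusters))
        u = new_u
        if shift < tol:
            break

    return centers, u


# Hypothetical toy vectors standing in for features of candidate values
# collected from a group of related pages.
toy_points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
```

Unlike hard clustering, every point keeps a graded membership in every cluster, which is what would allow a candidate value seen on several related pages to be weighted rather than simply accepted or rejected.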

Publication date: 2014

Record updated: 2018-08-06 10:34