A Linked Data Generation Method for Academic Conference Websites

Authors: Wang Peng*; Zhou Mingqi; Zhang Xiang; Zhou Fengbo
Source: 1st CCF Conference on Natural Language Processing and Chinese Computing, 2012-10-31 to 2012-11-05.

Abstract

This paper proposes an automatic method that extracts information from academic conference Web pages, organizes this information as ontologies, and matches these ontologies to academic linked data. The main contributions are: (1) A page segmentation algorithm divides conference Web pages into text blocks. (2) Using visual features, keywords, and other text features, all text blocks are classified into 10 categories with a Bayesian network model. Context information of the text blocks is then used to repair the initial classification results, raising them to 96% precision and 98% recall. (3) An ontology is generated for each conference website, and all ontologies are then matched to form academic linked data.

Full text