Abstract

The schemas of Web objects change frequently, so building the schema of a Web entity effectively, and thereby guiding Web data extraction and integration, remains a problem. Most existing approaches build the schema of a Web entity once and for all and cannot enrich it dynamically. This paper proposes an approach for building the schema of a Web entity dynamically. The approach makes full use of the data accumulated during Web data integration to identify new labels that appear in target Web pages. The schema of the Web entity is then enriched dynamically with the newly discovered labels. Experimental results show that the proposed approach can build the schema of a Web entity effectively. © 2005 by Binary Information Press.

  • Publication date: 2011