Discovering frequent subtrees from XML data using neural networks

Authors: Sun Wei; Liu Da Xin; Wang Tong
Source: Wuhan University Journal of Natural Sciences, 2006, 11(1): 117-121.
DOI: 10.1007/BF02831715

Abstract

With the rapid progress of network and storage technologies, a huge amount of electronic data, such as Web pages and XML documents, has become available on the Internet. In this paper, we study the data-mining problem of discovering frequent ordered subtrees in a large collection of XML data, where both the patterns and the data are modeled as labeled ordered trees. We present an efficient algorithm, Ordered Subtree Miner (OSTMiner), based on two-layer neural networks with the Hebb rule, that computes all ordered subtrees whose frequency in a collection of XML trees is above a user-specified threshold, using a special structure called the EM-tree. In this algorithm, the EM-tree serves as an extended merging tree that supplies scheme information for efficient pruning and mining of frequent subtrees. Experimental results show that OSTMiner has good response time and scales well.
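
To make the problem statement concrete, the following is a minimal sketch of what "frequency above a user-specified threshold" could mean for a single ordered subtree pattern. It is a brute-force illustration, not OSTMiner: it uses no neural network, Hebb rule, or EM-tree. The function names (`matches_at`, `occurs_in`, `support`) and the sample documents are hypothetical, and the matching rule is an assumption (parent-child edges are preserved, and a pattern's children must match an in-order subsequence of a node's children); the abstract does not say which subtree definition the paper uses.

```python
import xml.etree.ElementTree as ET

def matches_at(pattern, node):
    """Assumed rule: labels are equal, and the pattern's children match,
    in order, a subsequence of the node's children."""
    if pattern.tag != node.tag:
        return False
    children = list(node)
    pos = 0
    for p_child in pattern:
        # Greedily take the leftmost remaining child that matches.
        while pos < len(children) and not matches_at(p_child, children[pos]):
            pos += 1
        if pos == len(children):
            return False
        pos += 1
    return True

def occurs_in(pattern, tree):
    """True if the pattern matches at any node of the tree."""
    return any(matches_at(pattern, node) for node in tree.iter())

def support(pattern, trees):
    """Fraction of trees that contain at least one occurrence."""
    return sum(occurs_in(pattern, t) for t in trees) / len(trees)

# Hypothetical sample collection of three XML trees.
docs = [
    "<book><title/><author/><year/></book>",
    "<book><author/><title/></book>",
    "<lib><book><title/><year/></book></lib>",
]
trees = [ET.fromstring(d) for d in docs]
pattern = ET.fromstring("<book><title/><year/></book>")

min_support = 0.5  # user-specified threshold (illustrative value)
is_frequent = support(pattern, trees) >= min_support
```

Under these assumptions the pattern occurs in the first and third documents but not the second, where sibling order and the missing `year` element both prevent a match. A real miner would also have to enumerate candidate patterns and prune them, which is the part the abstract attributes to the EM-tree.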

  • Publication date: 2006
