Abstract

Because traditional data mining techniques cannot be applied directly to semi-structured XML data, this paper proposes a novel XML mining algorithm based on a domain ontology and association rules. The algorithm first introduces the domain ontology and hashing to speed up the generation of frequent itemsets and association rules, storing the domain ontology in a hash table. It then replaces operations on the database with operations on an in-memory tree built from the XML documents. Simulation results show that the algorithm effectively reduces the size of the XML documents and yields association rules that are easier to understand.
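The pipeline summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: the ontology contents, the XML layout (`transactions`/`transaction`/`item`), the function names, and the brute-force itemset enumeration are all assumptions made for illustration. It shows only the three ideas named in the abstract: an ontology held in a hash table, an in-memory XML tree, and rules mined over generalized items.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical domain ontology stored in a hash table (dict):
# concrete item -> more general concept.
ONTOLOGY = {"apple": "fruit", "pear": "fruit", "milk": "dairy", "cheese": "dairy"}

# Hypothetical XML layout; the paper's actual schema is not given in the abstract.
XML_DATA = """
<transactions>
  <transaction><item>apple</item><item>milk</item></transaction>
  <transaction><item>pear</item><item>cheese</item></transaction>
  <transaction><item>apple</item><item>bread</item></transaction>
</transactions>
"""

def load_transactions(xml_text):
    """Parse XML into an in-memory tree and generalize items via the ontology."""
    root = ET.fromstring(xml_text.strip())
    return [
        # Items missing from the ontology are kept unchanged.
        frozenset(ONTOLOGY.get(item.text, item.text) for item in t.findall("item"))
        for t in root.findall("transaction")
    ]

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets up to max_size by brute force; keep those meeting min_support."""
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

def rules(freq, min_conf):
    """Derive rules a -> b from frequent pairs, with confidence = supp(a,b) / supp(a)."""
    out = []
    for itemset, count in freq.items():
        if len(itemset) != 2:
            continue
        for a in itemset:
            (b,) = itemset - {a}
            conf = count / freq[frozenset([a])]
            if conf >= min_conf:
                out.append((a, b, conf))
    return out

transactions = load_transactions(XML_DATA)
freq = frequent_itemsets(transactions, min_support=2)
for a, b, conf in sorted(rules(freq, min_conf=0.6)):
    print(f"{a} -> {b} (confidence {conf:.2f})")
```

In this toy data, mapping items to ontology concepts merges `apple` and `pear` into `fruit`, so rules are stated over fewer, more general items, which is one plausible reading of why the abstract reports rules that are easier to understand.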

Full text