Abstract

Sequential pattern mining from sequence databases has been recognized as an important data mining problem with various applications. Items in a sequence database can be organized into a concept hierarchy according to a taxonomy. Based on the hierarchy, sequential patterns can be found not only at the leaf nodes (individual items) of the hierarchy but also at its higher levels; this is called multiple-level sequential pattern mining. In previous research, taxonomies based on crisp relationships between any two disjoint levels, however, cannot handle the uncertainty and fuzziness of real life. For example, tomatoes could be classified into the Fruit category but could also be regarded as part of the Vegetable category. To deal with the fuzzy nature of taxonomy, Chen and Huang developed a novel knowledge discovery model to mine fuzzy multi-level sequential patterns, where the relationships from one level to another can be represented by a value between 0 and 1. In their work, a generalized sequential patterns (GSP)-like algorithm was developed to find fuzzy multi-level sequential patterns. This algorithm, however, faces a difficult problem, since the mining process may have to generate and examine a huge set of combinatorial subsequences and requires multiple scans of the database. In this paper, we propose a new efficient algorithm to mine this type of pattern based on the divide-and-conquer strategy. In addition, another efficient algorithm is developed to discover fuzzy cross-level sequential patterns. Since the proposed algorithm greatly reduces the candidate subsequence generation effort, the performance is improved significantly. Experiments show that the proposed algorithm is much more efficient and scalable than the previous one.
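To make the idea of a fuzzy taxonomy concrete, the following minimal Python sketch shows how an item such as a tomato might belong to two higher-level categories with degrees between 0 and 1, and how a category-level pattern could then be scored against a small sequence database. It is an illustration only: the membership values, the names `FUZZY_TAXONOMY`, `category_degree`, `pattern_degree`, and `fuzzy_support`, and the min/max aggregation with averaging over sequences are all assumptions made for this example, not the definitions or the algorithm used in the paper.

```python
from typing import Dict, List, Set

# Hypothetical fuzzy taxonomy: item -> {higher-level category: membership degree}.
# The degrees are made up for illustration; they do not come from the paper.
FUZZY_TAXONOMY: Dict[str, Dict[str, float]] = {
    "tomato": {"Fruit": 0.7, "Vegetable": 0.6},
    "apple": {"Fruit": 1.0},
    "cabbage": {"Vegetable": 1.0},
}


def category_degree(itemset: Set[str], category: str) -> float:
    """Degree to which an itemset contains `category` (assumed: max over its items)."""
    return max(
        (FUZZY_TAXONOMY.get(item, {}).get(category, 0.0) for item in itemset),
        default=0.0,
    )


def pattern_degree(sequence: List[Set[str]], pattern: List[str]) -> float:
    """Degree to which `sequence` contains the ordered category `pattern`.

    Assumed aggregation: min over the matched positions of one embedding,
    max over all order-preserving embeddings (computed by dynamic programming).
    """
    # best[j] = best degree for matching the first j pattern elements so far.
    best = [1.0] + [0.0] * len(pattern)
    for itemset in sequence:
        # Go right to left so each itemset matches at most one pattern element.
        for j in range(len(pattern), 0, -1):
            candidate = min(best[j - 1], category_degree(itemset, pattern[j - 1]))
            best[j] = max(best[j], candidate)
    return best[-1]


def fuzzy_support(database: List[List[Set[str]]], pattern: List[str]) -> float:
    """Average pattern degree over all sequences (assumed support measure)."""
    if not database:
        return 0.0
    return sum(pattern_degree(seq, pattern) for seq in database) / len(database)


if __name__ == "__main__":
    db = [
        [{"tomato"}, {"cabbage"}],
        [{"apple"}, {"tomato"}],
        [{"cabbage"}, {"apple"}],
    ]
    print(fuzzy_support(db, ["Fruit", "Vegetable"]))
```

Under these assumptions, a sequence that buys a tomato and then a cabbage supports the pattern Fruit followed by Vegetable only partially, because the tomato is a Fruit to degree 0.7 rather than 1. A crisp taxonomy would force that contribution to be either 0 or 1.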

  • Publication date: 2009-12-1