Abstract

Bitmap-based sequential pattern mining can count support quickly, but it generates and tests a large number of candidate sequences, which makes the overall computation expensive. This paper proposes CB-PMFS, a new algorithm for efficiently mining frequent sequential patterns. It scans the database once to build a compressed bitmap that records the positions of every item in each sequence. Frequent items are mined first, and candidate k-sequences are then generated by pairing frequent (k-1)-sequences while the bitmap is updated. The frequent sequences in each level's list are sorted to avoid unnecessary candidate joins, and mining is reduced from sequence matching to comparisons between position values. Experiments on both real and synthetic datasets show that CB-PMFS is efficient and scales well.
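The level-wise idea described above can be illustrated with a minimal Python sketch. It is not the CB-PMFS implementation: the abstract does not specify the compressed bitmap layout, so plain per-sequence position lists stand in for it, each sequence element is assumed to be a single item, and all names (`build_position_index`, `extend`, `mine`) are hypothetical. The sketch only shows how candidates can be joined from pairs of frequent (k-1)-sequences and how support can be checked by comparing position values rather than matching sequences.

```python
from bisect import bisect_right
from collections import defaultdict


def build_position_index(db):
    """Single pass over the database: item -> {sequence id -> sorted positions}.

    Assumption: this dictionary stands in for the paper's compressed bitmap.
    """
    index = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            index[item].setdefault(sid, []).append(pos)
    return index


def extend(ends, occurrences):
    """Append one item to a pattern using position comparisons only.

    `ends` maps sequence id -> earliest end position of the pattern;
    `occurrences` maps sequence id -> sorted positions of the new item.
    """
    result = {}
    for sid, end in ends.items():
        positions = occurrences.get(sid)
        if positions is None:
            continue
        i = bisect_right(positions, end)  # first occurrence strictly after `end`
        if i < len(positions):
            result[sid] = positions[i]
    return result


def mine(db, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns."""
    index = build_position_index(db)
    # Level 1: frequent items with their earliest position in each sequence.
    level = {
        (item,): {sid: ps[0] for sid, ps in occ.items()}
        for item, occ in index.items()
        if len(occ) >= min_support
    }
    frequent = {}
    while level:
        frequent.update({p: len(ends) for p, ends in level.items()})
        # Sort and group patterns by shared prefix so that only siblings are joined.
        groups = defaultdict(list)
        for pattern in sorted(level):
            groups[pattern[:-1]].append(pattern)
        next_level = {}
        for siblings in groups.values():
            for p in siblings:
                for q in siblings:
                    ends = extend(level[p], index[q[-1]])
                    if len(ends) >= min_support:
                        next_level[p + (q[-1],)] = ends
        level = next_level
    return frequent


if __name__ == "__main__":
    toy_db = ["abcb", "acb", "ab", "bc"]  # hypothetical toy data
    for pattern, support in sorted(mine(toy_db, 2).items()):
        print("".join(pattern), support)
```

Keeping only the earliest end position of a pattern in each sequence is sufficient here because any later occurrence of the appended item that follows some embedding also follows the earliest one.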

  • Publication date: 2014
