Abstract

Periodic and cyclic patterns have been used to capture recurring behavior in sequence databases. Toroslu (2003) proposed cyclically repeated pattern (CRP) mining, which introduces a new parameter, repetition support, into the mining process. In that approach, the occurrences of a subsequence within a data sequence must satisfy a single user-specified minimum repetition support. In real-life applications, however, items occur at widely varying frequencies in a database, and applying a single minimum repetition support to all items gives rise to the rare item problem. To solve this problem, we incorporate the concept of multiple minimum supports, which enables users to specify a minimum item repetition support (MIR) for each item according to its nature. In this paper, we first redefine CRPs based on the MIR and the original form of the sequence minimum support. We then develop a new algorithm, rep-PrefixSpan, for discovering the complete set of CRPs in sequence databases. The experimental results indicate that the proposed approach outperforms conventional CRP mining. The proposed method can be applied in many domains, including customer purchase behavior analysis, web log analysis, and stock analysis.