Active Trace Clustering for Improved Process Discovery

Authors: De Weerdt Jochen*; Broucke Seppe vanden; Vanthienen Jan; Baesens Bart
Source: IEEE Transactions on Knowledge and Data Engineering, 2013, 25(12): 2708-2720.
DOI: 10.1109/TKDE.2013.64

Abstract

Process discovery is the learning task of constructing process models from the event logs of information systems. Typically, these event logs are large data sets that record process executions by registering which activity took place at a given moment in time. By far the most arduous challenge for process discovery algorithms is accurate and comprehensible knowledge discovery in highly flexible environments. Event logs from such flexible systems often contain a large variety of process executions, which makes the application of process mining most interesting. However, simply applying existing process discovery techniques will often yield highly incomprehensible process models because of their inaccuracy and complexity. Trace clustering is one very interesting approach to resolving this problem, since it allows an existing event log to be split up so as to facilitate the knowledge discovery process. In this paper, we propose a novel trace clustering technique that differs significantly from previous approaches. Above all, it starts from the observation that currently available techniques suffer from a large divergence between the clustering bias and the evaluation bias. An active-learning-inspired approach resolves this bias divergence. An assessment using four complex, real-life event logs shows that our technique significantly outperforms currently available trace clustering techniques.

  • Publication date: 2013-12