A novel outlier cluster detection algorithm without top-n parameter

Authors: Huang, Jinlong; Zhu, Qingsheng*; Yang, Lijun; Cheng, DongDong; Wu, Quanwang
Source: Knowledge-Based Systems, 2017, 121: 32-40.
DOI: 10.1016/j.knosys.2017.01.013

Abstract

Outlier detection is an important task in data mining, with applications including credit card fraud detection and video surveillance, and it has been widely studied in recent years. Although many outlier detection algorithms have been proposed, most of them face the top-n problem: it is difficult to know in advance how many points in a database are outliers. In this paper we propose a novel outlier cluster detection algorithm called ROCF, based on the concept of the mutual neighbor graph and on the idea that outlier clusters are usually much smaller than normal clusters. The concept of the outlier factor of an object is extended to clusters. ROCF automatically determines the outlier rate of a database and effectively detects outliers and outlier clusters without a top-n parameter. Formal analysis and experiments show that this method achieves good performance in outlier detection.
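
The abstract names two ingredients: a mutual neighbor graph and the observation that outlier clusters tend to be much smaller than normal ones. The sketch below illustrates only that general idea and is not the ROCF algorithm, whose details the abstract does not give. It links two points when each is among the other's k nearest neighbors, takes connected components as clusters, and flags components much smaller than the largest one. The function names, the neighborhood size `k`, and the fixed `ratio` threshold are all assumptions made for illustration; in particular, the fixed threshold is a stand-in for ROCF's automatic estimation of the outlier rate.

```python
from math import dist


def mutual_knn_components(points, k):
    """Connected components of a mutual k-nearest-neighbor graph.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(points)
    # k nearest neighbors of each point (excluding itself), Euclidean distance.
    knn = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        knn.append(set(others[:k]))

    # Keep edge i-j only when each point is among the other's k neighbors.
    adj = [[j for j in knn[i] if i in knn[j]] for i in range(n)]

    # Depth-first search to collect connected components.
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        components.append(sorted(comp))
    return components


def small_clusters(components, ratio=0.5):
    """Flag components much smaller than the largest one.

    The fixed `ratio` is an assumed stand-in threshold, not ROCF's rule.
    """
    largest = max(len(c) for c in components)
    return [c for c in components if len(c) < ratio * largest]


# Six points forming one group, plus a distant pair.
pts = [(0, 0), (1, 0), (2.1, 0), (3.3, 0), (4.6, 0), (6.0, 0), (50, 0), (51, 0)]
comps = mutual_knn_components(pts, k=2)
flagged = small_clusters(comps)
```

In this toy data the distant pair forms its own small component, because neither of its points is among the k nearest neighbors of any point in the main group, so no mutual edge connects them.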