A concept lattice based outlier mining method in low-dimensional subspaces

Authors: Zhang Jifu*; Jiang Yiyong; Chang Kai H; Zhang Sulan; Cai Jianghui; Hu Lihua
Source: Pattern Recognition Letters, 2009, 30(15): 1434-1439.
DOI: 10.1016/j.patrec.2009.07.016

Abstract

Traditional outlier mining methods identify outliers from a global point of view, which usually makes it difficult to find data points that deviate only within low-dimensional subspaces. The concept lattice, owing to its straightforwardness, conciseness and completeness in knowledge expression, has become an effective tool for data analysis and knowledge discovery. In this paper, a concept lattice based outlier mining algorithm (CLOM) for low-dimensional subspaces is proposed, which treats the intent of every concept lattice node as a subspace. First, sparsity and density coefficients, which measure outliers in low-dimensional subspaces, are defined and discussed. Second, the intent of a concept lattice node is regarded as a subspace, and sparsity subspaces are identified based on a predefined sparsity coefficient threshold. Then, for each sparsity subspace, the intents of its ancestor nodes are checked against a predefined density coefficient threshold to determine whether any of them is a density subspace. If one is, the objects in the extent of the node whose intent is the sparsity subspace are defined as outliers. Experimental results on a star spectral database show that CLOM is effective in mining outliers in low-dimensional subspaces. The accuracy of the results is also greatly improved.
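
The two-stage test described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas for the sparsity and density coefficients, so this sketch substitutes a plain support ratio (extent size divided by the number of objects) for both; that substitution, the brute-force concept enumeration, the exclusion of the empty intent from the ancestor check, and the names `concepts`, `mine_outliers`, `sparse_max`, `dense_min` and `toy_context` are all assumptions made for illustration, not the paper's definitions.

```python
from itertools import combinations


def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small binary
    context by brute force over attribute subsets (exponential; toy use only)."""
    objects = list(context)
    attributes = sorted(set().union(*context.values()))
    found = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(attributes, r):
            # Extent: objects that have every attribute in attrs.
            extent = frozenset(o for o in objects if set(attrs) <= context[o])
            # Intent: attributes shared by every object in the extent.
            if extent:
                intent = frozenset.intersection(
                    *(frozenset(context[o]) for o in extent)
                )
            else:
                intent = frozenset(attributes)
            found.add((extent, intent))
    return found


def mine_outliers(context, sparse_max, dense_min):
    """Flag objects in a sparse concept that has at least one dense ancestor.

    ASSUMPTION: support = |extent| / |objects| stands in for both the
    sparsity and the density coefficient; the paper's formulas may differ.
    """
    n = len(context)
    nodes = concepts(context)
    outliers = set()
    for extent, intent in nodes:
        if not extent or not intent:
            continue  # skip the bottom node and the attribute-free top node
        if len(extent) / n > sparse_max:
            continue  # this subspace is not sparse
        # Ancestors are more general concepts: strictly smaller intent.
        for anc_extent, anc_intent in nodes:
            if (anc_intent and anc_intent < intent
                    and len(anc_extent) / n >= dense_min):
                outliers |= extent
                break
    return outliers


# Hypothetical toy context: object -> set of attributes it has.
toy_context = {
    "s1": {"a", "b"},
    "s2": {"a", "b"},
    "s3": {"a", "b"},
    "s4": {"a", "b"},
    "s5": {"a", "c"},
    "s6": {"d"},
}

print(mine_outliers(toy_context, sparse_max=0.2, dense_min=0.8))
```

In this toy context, object `s5` lies in the sparse subspace `{a, c}`, whose ancestor `{a}` covers five of the six objects, so it is flagged. Object `s6` is also in a sparse subspace, but that subspace has no dense ancestor with a non-empty intent, so under these assumptions it is not reported.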