Abstract

The well-known clustering algorithm DBSCAN is founded on a density-based notion of clusters. However, its reliance on a single global density parameter, the epsilon-distance, makes DBSCAN unsuitable for datasets of varying density, and choosing a value for this parameter is not straightforward. In this paper, we generalise the algorithm in two ways. First, we determine the key input parameter epsilon-distance adaptively, which makes DBSCAN independent of domain knowledge and thus consistent with the unsupervised nature of clustering. Second, because epsilon-distance is derived by examining the data distribution of each dimension, the approach is suitable for subspace clustering, which detects clusters enclosed in various subspaces of high-dimensional data. Experimental results illustrate that our approach can efficiently find clusters of varying sizes, shapes, and densities.
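
The abstract does not say how the epsilon-distance is derived, so the following is only a minimal sketch of why a single global value is problematic. It uses the common k-nearest-neighbour distance heuristic, which is an assumption made here for illustration and not necessarily the authors' method. The toy data and the names `k_distances`, `dense`, `sparse`, `eps_dense`, and `eps_sparse` are hypothetical.

```python
import math

def k_distances(points, k):
    """Return, for each point, the distance to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

# Hypothetical toy data: one tight group and one loose group.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]

kd = k_distances(dense + sparse, k=2)

# Neighbourhood radius each group would need so that every point
# has at least 2 neighbours (illustrative heuristic, not the paper's).
eps_dense = max(kd[:len(dense)])
eps_sparse = max(kd[len(dense):])

print(eps_dense, eps_sparse)
```

In this toy example the two groups call for neighbourhood radii that differ by roughly a factor of ten, so a value tuned for one group fits the other poorly. This is the kind of mismatch that motivates estimating the parameter from the data instead of fixing it globally.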

  • Publication date: 2014-5