Abstract

An important issue in clustering is automatically determining a number of clusters close to the true one. This paper revisits a method called density of points clustering (DPC), which tackles this problem by comparing the density inside a cluster with the density between two potential sub-clusters. The paper sheds light on the geometric-probability aspect of the method by giving a closed-form formula for the probability distribution of the points generated by picking two points inside a p-dimensional ball (ball segment picking) and taking their midpoint. This sampling procedure is at the heart of DPC. The result shows that points sampled in this way tend to be more concentrated towards the ball center than uniformly sampled points. The contribution of this study is to explain why DPC can produce good results.
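
The concentration effect described in the abstract can be checked numerically. The following is a minimal Monte Carlo sketch, not the paper's closed-form result or its implementation: it assumes a unit-radius ball, independent uniform draws for the two points, and the helper names (`sample_uniform_ball`, `sample_midpoint`, `mean_radii`) are hypothetical, introduced only for illustration.

```python
import math
import random


def sample_uniform_ball(p, rng):
    """Draw one point uniformly from the unit ball in R^p (assumed radius 1)."""
    # A normalized Gaussian vector gives a uniform direction on the sphere;
    # scaling by U**(1/p) makes the point uniform in volume.
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in g))
    r = rng.random() ** (1.0 / p)
    return [r * x / norm for x in g]


def sample_midpoint(p, rng):
    """Pick two independent uniform points in the ball and return their midpoint."""
    a = sample_uniform_ball(p, rng)
    b = sample_uniform_ball(p, rng)
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def radius(point):
    """Euclidean distance of a point from the ball center (the origin)."""
    return math.sqrt(sum(x * x for x in point))


def mean_radii(p=3, n=20000, seed=0):
    """Estimate the mean distance to the center for both sampling schemes."""
    rng = random.Random(seed)
    uniform = sum(radius(sample_uniform_ball(p, rng)) for _ in range(n)) / n
    midpoint = sum(radius(sample_midpoint(p, rng)) for _ in range(n)) / n
    return uniform, midpoint


if __name__ == "__main__":
    u, m = mean_radii()
    print(f"mean radius, uniform points:  {u:.3f}")
    print(f"mean radius, midpoint points: {m:.3f}")
```

For uniform points in the unit ball the expected distance to the center is p/(p+1), so the first estimate should land near 0.75 for p = 3. The midpoint estimate should come out smaller, in line with the abstract's claim that midpoints concentrate towards the center.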

  • Publication date: 2011-4-1