Abstract

High-dimensional data streams are large in volume and arrive rapidly, so many clustering algorithms fail to achieve good clustering quality on them. We propose HDFG (High Dimensional stream clustering based on Fractal and Grid), a clustering algorithm for high-dimensional data streams. First, a fractal-based method for selecting feature attributes is described: attributes are added to the feature attribute set one at a time, and an attribute is regarded as a feature attribute if adding it increases the partial fractal dimension by one or more. Then the clustering algorithm is given. The high-dimensional data space is partitioned into grid cells, and the information stored in the cells is updated as time passes. When a clustering request arrives, the high-dimensional grid cells are projected onto the subspace formed by the feature attributes, and a grid-based method produces the initial clusters. The remaining cells, which are not clustered during initialization, are handled by a fractal-based clustering method. Experimental results show that HDFG achieves high precision.
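The grid steps summarized above (cell updates, projection onto the feature subspace, and grid-based initial clusters) can be illustrated with a minimal Python sketch. This is not the authors' HDFG implementation: the function names, the fixed cell width, the `min_count` density threshold, and the face-adjacency rule for merging cells are all assumptions made for illustration, and both the time-based update of cell information and the fractal-based handling of leftover cells are omitted.

```python
from collections import Counter


def cell_of(point, width):
    """Map a point to the integer index of its grid cell (cell side = width)."""
    return tuple(int(x // width) for x in point)


def update_grid(grid, point, width):
    """Stream update: increment the count of the cell containing the point."""
    grid[cell_of(point, width)] += 1


def project(grid, feature_dims):
    """Project full-dimensional cells onto the chosen feature attributes."""
    projected = Counter()
    for cell, count in grid.items():
        projected[tuple(cell[d] for d in feature_dims)] += count
    return projected


def dense_clusters(projected, min_count):
    """Group face-adjacent dense cells into initial clusters (assumed rule)."""
    dense = {c for c, n in projected.items() if n >= min_count}
    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], set()
        while stack:
            cell = stack.pop()
            members.add(cell)
            # Visit the two neighbours along each axis of the subspace.
            for d in range(len(cell)):
                for step in (-1, 1):
                    nb = cell[:d] + (cell[d] + step,) + cell[d + 1:]
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        clusters.append(members)
    return clusters
```

In this sketch, a caller would feed each arriving point to `update_grid` with a `Counter` as the grid, and on a clustering request call `project` with the indices of the selected feature attributes followed by `dense_clusters`. Cells below the threshold are simply left unclustered here, which is where the abstract's fractal-based step would take over.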

  • Publication date: 2012
