A comparative study of clustering methods for molecular data

Authors: Wang Lin*; Jiang Minghu; Lu Yinghua; Sun Minfu; Noe Frank
Source: International Journal of Neural Systems, 2007, 17(6): 447-458.
DOI:10.1142/S0129065707001287

Abstract

This study applies three clustering techniques to model large molecular data sets, comparing low energy samples (LES) with local molecular samples (LMS). Three methods are used to analyze 6242 LES and 5000 LMS: hierarchical clustering, which builds a multi-level tree of distance relations; a competitive learning network, in which similar inputs fall into the same cluster; and a topological self-organizing map (SOM). Our experiments show that in the SOM, the Davies-Bouldin clustering index and the color map indicate 24 to 25 cluster units for the LES, more than the 10 to 12 found for the LMS, which is roughly consistent with the results of hierarchical clustering and the competitive learning network. In hierarchical clustering, the largest inter-cluster distance for the LES (about 30) is far larger than that for the LMS (about 10), and the intra-cluster distance of the LES (about 15) is also far larger than that of the LMS (about 3). In the SOM, the D-matrix and U-matrix of the LES show more high-valued (black) cluster borders, which reflect large distances, and more clusters than those of the LMS, because the standard deviation range of the sample features is wider for the LES (-8 to 10) than for the LMS (-2.5 to 2.5).
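For readers unfamiliar with the Davies-Bouldin index mentioned in the abstract, the following is a minimal sketch of how such an index can be computed for a given partition. It is not the authors' code: the function names (`centroid`, `davies_bouldin`) and the toy 2-D points are illustrative assumptions, and the paper's molecular data are not reproduced here. The index averages, over all clusters, the worst-case ratio of within-cluster scatter to between-centroid distance, so lower values indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    """Return the coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin(clusters):
    """Davies-Bouldin index for a partition given as a list of clusters.

    Each cluster is a non-empty list of points; each point is a list of floats.
    At least two clusters with distinct centroids are assumed.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean Euclidean distance of a cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster j.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy data (hypothetical): two unit squares, far apart versus close together.
square = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
separated = [square, [[x + 10.0, y + 10.0] for x, y in square]]
crowded = [square, [[x + 2.0, y + 2.0] for x, y in square]]

db_separated = davies_bouldin(separated)
db_crowded = davies_bouldin(crowded)
print(round(db_separated, 3), round(db_crowded, 3))
```

On these toy points the well-separated partition scores lower than the crowded one. A common way to use the index, assumed here rather than taken from the abstract, is to evaluate it for several candidate numbers of clusters and prefer the count that gives the lowest value.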