Abstract

Clustering of protein-protein interaction networks is one of the most prevalent methods for identifying protein complexes, detecting functional modules and predicting protein functions. Many clustering methods have been proposed in the past few years, but evaluating how well the protein clusters are identified remains a challenging task. Even two of the most popular measurements, F-measure and p-value, are biased when evaluating the identified clusters. In this paper, we propose two new measurements that evaluate clusters more finely and distinctly. One is hF-measureTf, a topology-free measurement, and the other is hF-measureTb, a topology-based measurement. Unlike F-measure, hF-measureTf and hF-measureTb can discriminate between different types of errors. Both artificial and practical test data were used to evaluate the effectiveness of hF-measureTf and hF-measureTb. For the artificial test data, errors were generated by replacing some cluster members with functionally similar or non-similar members. The practical test data were produced by seven clustering algorithms: Markov Clustering, Molecular Complex Detection, HC-PIN, SPICI, CPM, Core-Attachment and RRW. The experimental results on both artificial and practical test data show that hF-measureTf and hF-measureTb evaluate clusters more accurately than F-measure. In particular, hF-measureTb can capture topology changes in clusters, so it can also be applied to the analysis of dynamic networks.
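The bias attributed to F-measure above can be illustrated with a minimal sketch. It assumes the common set-overlap definition of F-measure between one predicted cluster and one reference complex; the abstract does not state the exact formulation used in the paper, and the protein identifiers below are hypothetical. The sketch does not implement hF-measureTf or hF-measureTb, whose definitions are not given here.

```python
def f_measure(cluster, reference):
    """Set-overlap F-measure between a predicted cluster and a reference complex.

    Assumed definition (not taken from the paper):
    precision = |C ∩ R| / |C|, recall = |C ∩ R| / |R|, F = 2PR / (P + R).
    """
    cluster, reference = set(cluster), set(reference)
    overlap = len(cluster & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference complex with four members.
reference = {"P1", "P2", "P3", "P4"}

# Two artificial errors: P4 is replaced either by a functionally similar
# protein (S1) or by a functionally unrelated one (N1).
similar_error = {"P1", "P2", "P3", "S1"}
dissimilar_error = {"P1", "P2", "P3", "N1"}

score_similar = f_measure(similar_error, reference)
score_dissimilar = f_measure(dissimilar_error, reference)
```

Under this definition both erroneous clusters receive the same score, because only the size of the overlap enters the formula; the functional similarity of the wrong member is ignored. This is the kind of error type that a plain F-measure cannot distinguish.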

  • Publication date: 2013-1