Abstract

Mining coexpression clusters across multiple datasets is a major approach to identifying transcription modules in systems biology. The main difficulty lies in the fact that the relevant subgraphs are buried among a huge number of irrelevant connections. In this paper, we address the problem with a noise reduction strategy consisting of three steps: (1) coarse filtering; (2) clustering potential subsets of graphs; and (3) refined filtering on those subsets. Using yeast as a model system, we demonstrate that most of the gene clusters derived from our method are enrichment clusters; that is, they are likely to be functionally homogeneous entities or potential transcription modules.
