Abstract

Research on the etiology and pathogenesis of prostate cancer aids the diagnosis and treatment of the disease. However, current biochemical experimental methods for studying prostate cancer are costly and time-consuming, and network-based methods for analyzing the disease are limited by the nature of gene expression profiles, which are incomplete, noisy, and small in sample size. We therefore propose NMCOM, a dual-constraint algorithm based on the confidence that a vertex belongs to a community and on local modularity, to mine candidate disease modules of prostate cancer. NMCOM does not depend on gene expression data. It first prioritizes the candidate genes by integrating two scores: the concordance score between the candidate genes and the causative phenotypes, and the semantic similarity score between the candidate genes and the causative genes. A starting node is then selected from the resulting ranking. Finally, the candidate modules of prostate cancer are mined with a dual-constraint procedure built on the confidence between a node and a module and on local modularity. Enrichment analysis of the obtained modules identified 18 significant candidate disease gene modules. Compared with the single-score ranking methods and random walk with restart, the NMCOM fusion prioritization strategy achieved a smaller mean rank ratio (MRR) and a larger AUC. The results are significantly better than those of other module-mining algorithms, and the biological interpretations of the mined modules are more significant. More importantly, NMCOM can easily be extended to mine candidate modules for other diseases.
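
The dual-constraint expansion step described above can be illustrated with a minimal sketch. The abstract does not give NMCOM's formulas, so everything below is an assumption made for illustration: the confidence of a node is taken as the fraction of its neighbors already inside the module, local modularity is taken as the ratio of internal to boundary edges, and the names `confidence`, `local_modularity`, `expand_module`, `min_confidence`, and `toy_graph` are hypothetical. It is not the authors' implementation.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def local_modularity(graph: Graph, module: Set[str]) -> float:
    """Assumed definition: internal edges divided by boundary edges."""
    internal = 0
    external = 0
    for u in module:
        for v in graph[u]:
            if v in module:
                internal += 1  # each internal edge is seen from both ends
            else:
                external += 1
    internal //= 2
    if external == 0:
        return float("inf") if internal > 0 else 0.0
    return internal / external


def confidence(graph: Graph, node: str, module: Set[str]) -> float:
    """Assumed definition: share of the node's neighbors inside the module."""
    neighbors = graph[node]
    if not neighbors:
        return 0.0
    return len(neighbors & module) / len(neighbors)


def expand_module(graph: Graph, seed: str, min_confidence: float = 0.5) -> Set[str]:
    """Greedily grow a module from a seed under both constraints.

    A neighbor is added only if (1) its confidence reaches min_confidence
    and (2) adding it strictly increases local modularity.
    """
    module = {seed}
    while True:
        current = local_modularity(graph, module)
        frontier = set().union(*(graph[u] for u in module)) - module
        best, best_score = None, current
        for candidate in sorted(frontier):  # sorted for deterministic ties
            if confidence(graph, candidate, module) < min_confidence:
                continue
            score = local_modularity(graph, module | {candidate})
            if score > best_score:
                best, best_score = candidate, score
        if best is None:
            return module
        module.add(best)


# Toy network (hypothetical): two triangles joined by the bridge c-d.
toy_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
module = expand_module(toy_graph, "a")
```

On this toy network, expansion from seed `a` stops at the first triangle, because node `d` has only one of its three neighbors inside the module and so fails the assumed confidence threshold. In the setting described above, the seed would come from the fused gene ranking instead of being chosen by hand.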