A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks

Mei, Suyu<sup>*</sup>; Zhu, Hao

doi:10.1038/srep08034

Abstract

Protein-protein interaction (PPI) prediction is generally treated as a binary classification problem, in which negative data sampling remains an open problem. The commonly used random sampling tends to yield unrepresentative negative data containing a considerable number of false negatives. Moreover, most existing computational methods seldom impose rational constraints on model selection to reduce the risk of false positive predictions. In this work, we propose a novel negative data sampling method based on the one-class support vector machine (SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens: a one-class SVM is used to choose reliable and representative negative data, and a two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that the one-class SVM is better suited as a negative data sampling method than a two-class PPI predictor, and that model selection constrained by predictive feedback helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.
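The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn and NumPy, uses synthetic feature vectors in place of real protein-pair features, and makes two illustrative choices that the abstract does not specify, namely picking as negatives the candidates with the lowest one-class decision scores, and reporting the fraction of candidates predicted positive as the "predictive feedback" for each setting of `C`.

```python
import numpy as np
from sklearn.svm import OneClassSVM, SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for protein-pair feature vectors (hypothetical data).
positives = rng.normal(loc=2.0, scale=0.5, size=(60, 4))   # known interacting pairs
unlabeled = rng.normal(loc=0.0, scale=1.5, size=(300, 4))  # pairs with no known interaction

# Stage 1: fit a one-class SVM on the positive class only.
occ = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(positives)

# Lower decision values mean "less like the known positives".
scores = occ.decision_function(unlabeled)
order = np.argsort(scores)

# Assumed selection rule: take as many negatives as positives,
# choosing the candidates that look least like the positive class.
negatives = unlabeled[order[: len(positives)]]

# Stage 2: train two-class SVMs on positives plus sampled negatives.
X_train = np.vstack([positives, negatives])
y_train = np.concatenate([np.ones(len(positives)), np.zeros(len(negatives))])

# Assumed form of predictive feedback: the share of all candidate pairs
# that each model predicts as interacting.
rates = {}
for C in (0.1, 1.0, 10.0):
    clf = SVC(kernel="rbf", gamma="scale", C=C).fit(X_train, y_train)
    rates[C] = float(clf.predict(unlabeled).mean())

print(rates)
```

Under these assumptions, a model that labels an implausibly large share of all candidate pairs as interacting would be rejected in favour of a more conservative one, which is one way to read the abstract's "predictive feedback constrained model selection".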

Publication date: 2015-01-26
Affiliations: Shenyang Normal University; Southern Medical University
