Using Multi-Instance Hierarchical Clustering Learning System to Predict Yeast Gene Function

Authors: Liao, Bo*; Li, Yun; Jiang, Yan; Cai, Lijun
Source: PLoS ONE, 2014, 9(3): e90962.
DOI: 10.1371/journal.pone.0090962

Abstract

Time-course gene expression datasets, which record continuous biological processes of genes, have recently been used to predict gene function. However, only a few positive genes can be obtained from annotation databases such as the Gene Ontology (GO). To obtain more useful information and predict gene function effectively, gene annotations are clustered together to form a learnable and effective learning system. In this paper, we propose a novel multi-instance hierarchical clustering (MIHC) method that establishes a learning system by clustering GO, and we compare it with other methods for establishing learning systems. A multi-label support vector machine classifier and a multi-label K-nearest neighbor classifier are used to evaluate these methods on four yeast time-course gene expression datasets. The MIHC method shows good performance and can serve as a guide to annotators or help refine annotations in detail.
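
The idea of clustering annotations in a multi-instance setting can be illustrated with a minimal sketch. It is not the MIHC algorithm from the paper, whose details the abstract does not give. It assumes that each annotation term is treated as a "bag" of gene expression profiles, that the distance between two bags is the average pairwise Euclidean distance, and that the two closest bags are merged repeatedly. The function names `bag_distance` and `cluster_bags`, the term names, and the toy profiles are all hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def bag_distance(bag_a, bag_b):
    """Average pairwise Euclidean distance between two bags of profiles.

    Assumption: this distance is chosen for illustration only.
    """
    total = sum(dist(x, y) for x in bag_a for y in bag_b)
    return total / (len(bag_a) * len(bag_b))


def cluster_bags(bags, n_clusters):
    """Agglomeratively merge the closest bags until n_clusters remain.

    bags: dict mapping a term name to a list of expression profiles.
    Returns a list of (sorted term names, merged profiles) pairs.
    """
    clusters = [([name], list(profiles)) for name, profiles in bags.items()]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest bag distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: bag_distance(clusters[p[0]][1], clusters[p[1]][1]),
        )
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [(sorted(names), profiles) for names, profiles in clusters]


# Toy data: hypothetical term names and made-up 3-point time courses.
toy_bags = {
    "term_A": [(0.1, 0.2, 0.9), (0.0, 0.3, 1.0)],
    "term_B": [(0.2, 0.2, 0.8)],
    "term_C": [(1.0, 0.1, 0.0), (0.9, 0.0, 0.1)],
}
result = cluster_bags(toy_bags, 2)
```

In this toy run, `term_A` and `term_B` have similar profiles and end up in one cluster, which gives a classifier more positive genes per class than either term would provide alone.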