Abstract

DNA microarrays measure the expression levels of thousands of genes simultaneously. A major challenge is extracting biologically relevant information and knowledge from the massive amounts of microarray data they produce. In this paper, we explore learning a compact representation of gene expression profiles with a multi-task neural network model, so that further analyses of the data can be carried out more efficiently. The proposed network is trained on the tasks of predicting Protein-Protein Interactions (PPIs) and Gene Ontology (GO) similarities, together with geometric constraints, while simultaneously learning a high-level representation of the gene expression data. We argue that deep networks can extract more information from expression data than standard statistical models. We tested the utility of our method by comparing its performance with well-known feature extraction and dimensionality reduction methods on the task of PPI prediction, and found the results to be promising.

Full text