Abstract

Reconstruction of large-scale gene regulatory networks (GRNs) is an important step toward understanding the complex regulatory mechanisms within the cell. Many modeling approaches have been introduced to find causal relationships between genes using expression data. However, they suffer from high dimensionality (a large number of genes but a small number of samples), overfitting, heavy computation time, and low interpretability. We have previously proposed an original data mining algorithm, LICORN, that infers cooperative regulation networks from expression datasets. In this work, we present an extension of LICORN to a hybrid inference method, H-LICORN, that searches in both discrete and real-valued spaces. LICORN's algorithm, which uses the discrete space to find cooperative regulation relationships fitting the target gene expression, has been shown to be powerful in identifying cooperative regulation relationships that are beyond the scope of most GRN inference methods. Still, like many related GRN inference techniques, LICORN suffers from a large number of false positives. We propose here an extension of LICORN with a numerical selection step, expressed as a linear regression problem, that effectively complements the discrete search of LICORN. We evaluate a bootstrapped version of H-LICORN on the in silico DREAM5 dataset and show that H-LICORN performs significantly better than LICORN and is competitive with or outperforms state-of-the-art GRN inference algorithms, especially when operating on small datasets. We also applied H-LICORN to a real dataset of human bladder cancer and show that it performs better than other methods in finding candidate regulatory interactions. In particular, based solely on gene expression data, H-LICORN is able to identify experimentally validated cooperative regulator relationships involved in cancer.

  • Publication date: 2014-06