Abstract

A major task in understanding biological processes is to elucidate the relationships between genes involved in the underlying biological pathways. Microarray data from an increasing number of biologically interrelated experiments now allows for more complete portrayals of functional gene relationships in the pathways. Current studies of gene relationships, however, have widely ignored the expression dependencies attributable to the biologically interrelated experiments. When unaccounted for, these experimental dependencies can result in inaccurate inferences of functional gene relationships, and hence incorrect biological conclusions. This article contributes a framework, consisting of a model and an estimation procedure, to infer gene relationships when there are two-way dependencies in the gene expression matrix (the gene-wise and experiment-wise dependencies). The main aspect of the framework is the use of a Kronecker product covariance matrix to model the gene-experiment interactions. The resulting novel gene coexpression measure, termed the Knorm correlation, can be understood as a natural extension of the widely used Pearson coefficient when the experiment correlation matrix is known. Compared with the Pearson coefficient, the Knorm correlation has a smaller estimation variance. The Knorm correlation is also asymptotically consistent with the Pearson coefficient. When the experiment correlation matrix is unknown, the Knorm correlation is computed from an experiment correlation matrix estimated by an iterative procedure. We demonstrate the advantages of the Knorm correlation in both simulation studies and real datasets. The Knorm correlation estimation procedure is implemented in an R package (Knorm) that is freely available from the Bioconductor website.
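To make the known-experiment-correlation case concrete, the following is a minimal Python sketch, not the Knorm package's implementation or API. It assumes a genes-by-experiments matrix `X` and a known, positive definite experiment correlation matrix `R`; the function name `whitened_gene_correlation` and the choice of a generalized least squares row mean are illustrative assumptions. Each gene is centred, the experiments are decorrelated with a Cholesky factor of `R`, and the resulting inner products are normalised. When `R` is the identity, the result coincides with the ordinary Pearson correlation matrix.

```python
import numpy as np


def whitened_gene_correlation(X, R):
    """Gene-gene correlation after decorrelating the experiments.

    Illustrative sketch only (hypothetical helper, not the Knorm API).
    X: (genes x experiments) expression matrix.
    R: (experiments x experiments) experiment correlation matrix,
       assumed known and positive definite.
    """
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    ones = np.ones(X.shape[1])

    # Generalized least squares estimate of each gene's mean:
    # mu_i = (x_i R^{-1} 1) / (1^T R^{-1} 1). Equals the row mean when R = I.
    Rinv_ones = np.linalg.solve(R, ones)
    mu = X @ Rinv_ones / (ones @ Rinv_ones)
    Xc = X - mu[:, None]

    # Whiten the experiment dimension: with R = L L^T, Z = Xc L^{-T},
    # so that Z Z^T = Xc R^{-1} Xc^T.
    L = np.linalg.cholesky(R)
    Z = np.linalg.solve(L, Xc.T).T

    # Normalise the inner products to obtain a correlation matrix.
    S = Z @ Z.T
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genes, experiments = 5, 8

    # Assumed toy experiment correlation: equicorrelated experiments.
    R = 0.6 * np.eye(experiments) + 0.4 * np.ones((experiments, experiments))
    G = np.eye(genes)  # toy gene covariance

    # Kronecker product covariance of the column-stacked expression matrix.
    full_cov = np.kron(R, G)
    print(full_cov.shape)

    # Simulate X with experiment-wise dependence: X = E B^T, where R = B B^T.
    B = np.linalg.cholesky(R)
    X = rng.normal(size=(genes, experiments)) @ B.T

    print(np.round(whitened_gene_correlation(X, R), 3))
```

The sketch covers only the case in which `R` is given; the iterative estimation of an unknown experiment correlation matrix described in the abstract is not reproduced here.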

  • Publication date: 2009-6