Abstract

Gene expression data and genotype variation data can now provide genome-wide patterns across many different clinical conditions. However, analyzing these data separately has limitations in elucidating the complex network of gene interactions underlying complex traits, such as common human diseases. Integrating these two heterogeneous types of data provides more information about the identity of key driver genes of common diseases. We developed a two-step procedure to characterize complex diseases by integrating genotype variation data and gene expression data. The first step elucidates the causal relationship among genetic variation, gene expression level, and disease. Based on the causal relationship determined in the first step, the second step identifies significant gene expression traits whose effects on disease status, or whose responses to disease status, are modified by a specific genotype variation. For the selected significant genes, a pathway enrichment analysis can be performed to identify the genetic mechanism of a complex disease. The proposed two-step procedure was shown to be an effective method for integrating three different levels of data, i.e., genotype variation, gene expression, and disease status. By applying the proposed procedure to a chronic fatigue syndrome (CFS) dataset, we identified a list of potential causal genes for CFS and found evidence of a difference in the genetic mechanisms of etiology between CFS without 'a major depressive disorder with melancholic features' (CFS) and CFS with 'a major depressive disorder with melancholic features' (CFS-MDD/m). In particular, the SNPs within the NR3C1 gene were shown to differently influence susceptibility to CFS and CFS-MDD/m through integrative action with gene expression levels.
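
The abstract does not state the statistical model behind the second step. As a rough illustration only, the idea of a genotype "modifying" an expression effect could be formalized with interaction terms, as sketched below. The symbols are assumptions introduced here and are not taken from the paper: D is disease status (0 or 1), E is the expression level of one gene, G is a SNP genotype coded numerically (for example, as a minor-allele count), and the beta and gamma coefficients are hypothetical regression parameters.

```latex
% Hypothetical formalization, not necessarily the authors' model.
% Case A: expression is upstream of disease (E affects D),
%         and genotype G modifies that effect.
% Case B: expression is downstream of disease (E responds to D),
%         and genotype G modifies that response.
\begin{align*}
\text{(A)}\quad & \operatorname{logit} \Pr(D = 1 \mid E, G)
   = \beta_0 + \beta_1 E + \beta_2 G + \beta_3 \,(E \times G),
   && H_0 : \beta_3 = 0, \\
\text{(B)}\quad & E
   = \gamma_0 + \gamma_1 D + \gamma_2 G + \gamma_3 \,(D \times G) + \varepsilon,
   && H_0 : \gamma_3 = 0.
\end{align*}
```

Under this reading, the first step would decide whether case A or case B applies to a given gene, and the second step would flag the gene as significant when the corresponding interaction coefficient differs from zero after correction for multiple testing.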

  • Publication date: 2009-10