Abstract

Recent large-scale genome-wide association (GWA) studies of SNP variations captured many thousands of individual genetic profiles of H. sapiens and facilitated the identification of significant genetic traits that are highly likely to influence the pathogenesis of several major human diseases. Here we apply integrative genomics principles to interrogate relationships between structural features and gene expression patterns of disease-linked SNPs, microRNAs and mRNAs of protein-coding genes in association with phenotypes of 15 major human disorders, namely bipolar disease (BD); rheumatoid arthritis (RA); coronary artery disease (CAD); Crohn's disease (CD); type 1 diabetes (T1D); type 2 diabetes (T2D); hypertension (HT); ankylosing spondylitis (AS); Graves' disease (autoimmune thyroid disease; AITD); multiple sclerosis (MS); breast cancer (BC); prostate cancer (PC); systemic lupus erythematosus (SLE); vitiligo-associated multiple autoimmune disease (VIT); and ulcerative colitis (UC). We selected for sequence homology profiling a set of ~250 SNPs that were unequivocally associated with common human disorders based on multiple independent studies of 220,124 individual samples comprising 85,077 disease cases and 129,506 controls. Our analysis reveals a systematic primary sequence homology/complementarity-driven pattern of associations between disease-linked SNPs, microRNAs and protein-coding mRNAs, defined here as a human disease phenocode. We utilize this approach to draw SNP-guided microRNA maps of major human diseases and define a consensus disease phenocode for fifteen major human disorders. The consensus disease phenocode comprises 72 SNPs and 18 microRNAs with an apparent propensity to target mRNA sequences derived from a single protein-coding gene, KPNA1. Each of the microRNAs in this elite set appears linked to at least three common human diseases and has potential protein-coding mRNA targets among the principal components of the nuclear import pathway.
We confirmed the validity of our findings by analyzing independent sets of the most significant disease-linked SNPs and demonstrating statistically significant KPNA1 gene expression phenotypes associated with human genotypes of CD, BD, T2D and RA populations. Our analysis supports the idea that variations in DNA sequences associated with multiple human diseases may affect phenotypes in trans via non-protein-coding RNA intermediaries interfering with functions of microRNAs, and defines the nuclear import pathway as a potential major target in 15 common human disorders.

  • Publication date: 2008-8-15