Abstract

Microarray datasets are an enormous resource for studying expression changes in strongly correlated genes and proteins in normal and tumor cells, yielding gene-level biological knowledge that can help explain the pathogenesis of a disease. In general, only a small number of genes in the entire gene set are perturbed under different experimental conditions in tumor samples, and only a subset of those actually exhibit correlated cDNA expression. We propose a formalism and methodology for identifying these correlated expression groups in both normal and tumor samples and for establishing a one-to-one correspondence between each normal group and its tumor counterpart, in order to obtain decisive knowledge of genetic perturbation. The concept is experimentally validated on (1) cDNA expressions from large-airway epithelial cells of smokers with suspected lung cancer and (2) cDNA expressions from lung tissue of mice with allergic asthma. The method reports 98 cDNA expressions in the smokers' dataset that are severely mutated and have thus abandoned their own colony of genes, possibly affecting normal biological functioning. Sixty genes from this set are substantiated by published literature with clinical support. The same study on the asthma data reveals 369 significantly mutated expressions, of which 48 have been corroborated by existing literature. The remaining genes, although not yet hypothesized to be involved, may play a crucial role in mediating the disease.

  • Publication date: 2013-12-1
  • Affiliation: Microsoft