Abstract

An important issue in the design of gene selection algorithms for microarray data analysis is the formulation of a suitable criterion function for measuring the relevance between different gene expressions. Mutual information (MI) is a widely used criterion function, but it calculates the relevance over the entire sample set only once, which cannot exactly identify the informative genes. This paper proposes a novel idea of computing MI in stages. The proposed multistage mutual information (MSMI) first computes MI using all the samples. Then, based on the classification performance produced by an artificial neural network (ANN), MI is repeatedly recalculated using only the unclassified samples until there is no further improvement in classification accuracy. The performance of the proposed approach is evaluated on ten gene expression data sets. Simulation results show that the proposed approach helps to improve the discriminative power of the genes with regard to the target disease of a microarray sample. Statistical analysis of the test results shows that the proposed method selects highly informative genes and produces classification accuracy comparable to that of other approaches reported in the literature.
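The staged procedure described above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that the abstract does not state: expression values are already discretized, "unclassified samples" is read as the samples the classifier currently gets wrong, accuracy is measured on the training samples themselves, and one gene is added per stage. The names `mutual_information`, `rank_genes`, `multistage_mi`, and `lookup_classifier` are hypothetical, and `lookup_classifier` is a simple majority-vote table standing in for the ANN, not the paper's network.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """MI in bits between two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y)))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def rank_genes(X, y, rows):
    """Gene indices sorted by decreasing MI with the label, using only `rows`."""
    labels = [y[r] for r in rows]
    scores = [mutual_information([X[r][g] for r in rows], labels)
              for g in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def lookup_classifier(X, y, genes):
    """Assumed stand-in for the ANN: majority label per pattern of gene values."""
    table = {}
    for row, label in zip(X, y):
        key = tuple(row[g] for g in genes)
        table.setdefault(key, Counter())[label] += 1
    return [table[tuple(row[g] for g in genes)].most_common(1)[0][0]
            for row in X]


def multistage_mi(X, y, classify, genes_per_stage=1, max_stages=10):
    """Add genes stage by stage; each stage ranks genes on the residual samples."""
    rows = list(range(len(X)))      # stage 1 uses all samples
    selected, best_acc = [], -1.0
    for _ in range(max_stages):
        if not rows:                # every sample is classified correctly
            break
        ranked = rank_genes(X, y, rows)
        new = [g for g in ranked if g not in selected][:genes_per_stage]
        if not new:
            break
        trial = selected + new
        preds = classify(X, y, trial)
        acc = sum(p == t for p, t in zip(preds, y)) / len(y)
        if acc <= best_acc:         # stop when accuracy no longer improves
            break
        selected, best_acc = trial, acc
        # assumption: the next stage uses only the misclassified samples
        rows = [r for r in range(len(X)) if preds[r] != y[r]]
    return selected, best_acc
```

Calling `multistage_mi(X, y, lookup_classifier)` with `X` as a list of per-sample gene-value lists and `y` as the class labels returns the selected gene indices and the final accuracy. In a more faithful setup the classifier would be a trained neural network evaluated on held-out data.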

  • Publication date: 2011-12
