Abstract

Microarray data are expected to be useful for cancer classification. The main problem to be addressed is selecting, from the thousands of genes in the data, a small subset of genes that contribute to cancer. This selection is difficult because of the many irrelevant genes, noisy data, and the small number of samples relative to the huge number of genes (high-dimensional data). Hence, this paper aims to select a small subset of informative genes that is most relevant for cancer classification. To achieve this aim, a cyclic hybrid method is proposed. Five real microarray data sets are used to test the effectiveness of the method. Experimental results show that the proposed method outperforms the other methods in the experiments and related previous works in terms of classification accuracy and the number of selected genes. In addition, a scatter gene graph and a list of informative genes in the best gene subsets are presented for biological usage.

  • Publication date: 2009-8