Abstract

In this paper, we aim to identify the most significant genes, that is, those with the greatest ability to discriminate between classes of samples. Our method is based on Prim's algorithm for constructing a minimum spanning tree. We first use an improved signal-to-noise ratio (SNR) measure, the "information exponential", to remove genes that are irrelevant to the classification task. Minimum spanning trees, a graph-theoretic approach, are then used to cluster the gene expression data. From each gene cluster, we select the significant candidate genes with the highest "information exponential". A support vector machine with a radial basis function kernel is applied to validate the classification performance of the selected candidate genes in distinguishing different tissue types. The experimental results show that our method produces competitive classification results.
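The selection pipeline summarized above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the classic two-class SNR, |mu0 - mu1| / (sd0 + sd1), stands in for the "information exponential", whose definition is not given here; genes are compared by Euclidean distance; clusters are formed by cutting the longest edges of the minimum spanning tree; and one gene is kept per cluster. The function names and the parameters `k` and `keep` are hypothetical, and the support vector machine validation step is omitted.

```python
import math
from statistics import mean, pstdev


def snr_score(values, labels):
    """Classic SNR for one gene (an assumed stand-in for the paper's score)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0


def prim_mst(dist):
    """Prim's algorithm on a dense distance matrix; returns (weight, u, v) edges."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    return edges


def mst_clusters(dist, k):
    """Cut the k-1 longest MST edges and return the connected components."""
    n = len(dist)
    kept = sorted(prim_mst(dist))[: max(n - k, 0)]
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def select_genes(expr, labels, k, keep):
    """expr[g] holds the expression values of gene g across all samples."""
    scores = [snr_score(g, labels) for g in expr]
    # Step 1: filter out low-scoring (irrelevant) genes.
    top = sorted(range(len(expr)), key=lambda i: scores[i], reverse=True)[:keep]
    # Step 2: cluster the remaining genes with an MST over pairwise distances.
    dist = [[math.dist(expr[i], expr[j]) for j in top] for i in top]
    clusters = mst_clusters(dist, k)
    # Step 3: keep the highest-scoring gene in each cluster.
    return sorted(top[max(c, key=lambda i: scores[top[i]])] for c in clusters)
```

In this sketch, `select_genes(expr, labels, k, keep)` returns the indices of the selected genes; the corresponding rows of `expr` would then be passed to a classifier for validation.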