Abstract

The grass family has long been the subject of intense research. Genome sequencing projects around the world are contributing large numbers of genome sequences to public gene banks, so reliable and fast classification and sub-classification of large sequences is rapidly gaining importance. Sequence classification helps predict genome function, structure, and evolutionary relationships, and gives insight into the features associated with the biological role of a class. Classification of functional genomes is therefore an important and challenging task for both computer scientists and biologists. The presence of motifs in grass genome chains predicts the functional behavior of the grass genome. The correlation between grass genome properties and their motifs is not always obvious, since more than one motif may exist within a genome chain. Because of the complexity of this association, most data mining algorithms are either inefficient or time-consuming. In this paper we propose an efficient method that derives main classes from an initial set of classes, in order to reduce the time complexity of classifying large datasets of grass genome sequences. The proposed approach first classifies the given dataset into classes using a conserved threshold, then reclassifies those classes into major classes using a relaxed threshold. Experimental results indicate that the proposed method reduces time complexity while keeping classification accuracy at a level comparable to that of the general NNC algorithm.
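
The two-threshold idea can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the grouping procedure, or the threshold values, so everything below is an assumption made for illustration: a Jaccard distance over 3-mers stands in for the sequence comparison, a simple leader-style grouping stands in for the classification step, and the names `distance`, `leader_cluster`, `two_stage`, and `classify`, as well as the thresholds 0.3 and 0.7, are hypothetical. A query is compared only against class representatives, which is one way such a hierarchy might avoid the all-pairs comparisons of a flat NNC search.

```python
from typing import List, Tuple


def kmer_set(seq: str, k: int = 3) -> frozenset:
    """Return the set of overlapping k-mers in a sequence."""
    return frozenset(seq[i:i + k] for i in range(len(seq) - k + 1))


def distance(a: str, b: str, k: int = 3) -> float:
    """Jaccard distance between k-mer sets (an assumed stand-in measure)."""
    sa, sb = kmer_set(a, k), kmer_set(b, k)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def leader_cluster(seqs: List[str], threshold: float) -> List[List[str]]:
    """Each sequence joins the first group whose leader (first member)
    lies within `threshold`; otherwise it starts a new group."""
    clusters: List[List[str]] = []
    for s in seqs:
        for c in clusters:
            if distance(s, c[0]) <= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def two_stage(seqs: List[str], strict: float = 0.3,
              relaxed: float = 0.7) -> List[List[List[str]]]:
    """Stage 1: group sequences under the strict (conserved) threshold.
    Stage 2: group the stage-1 leaders under the relaxed threshold.
    Returns major classes, each a list of stage-1 classes."""
    fine = leader_cluster(seqs, strict)
    by_leader = {c[0]: c for c in fine}
    leader_groups = leader_cluster([c[0] for c in fine], relaxed)
    return [[by_leader[leader] for leader in grp] for grp in leader_groups]


def classify(query: str, major: List[List[List[str]]]) -> Tuple[int, int]:
    """Pick the nearest major class by its first leader, then the nearest
    stage-1 class inside it. Only leaders are compared with the query."""
    m = min(range(len(major)),
            key=lambda i: distance(query, major[i][0][0]))
    f = min(range(len(major[m])),
            key=lambda j: distance(query, major[m][j][0]))
    return m, f


# Toy sequences, invented for this sketch (not data from the paper).
toy = ["ACGTACGT", "ACGTACGA", "ACGTACCC", "TTTTGGGG", "TTTTGGGC"]
major_classes = two_stage(toy)
```

With the toy input, `classify("TTTTGGGA", major_classes)` compares the query against three leaders rather than all five sequences; on real data the saving would depend on how many classes each threshold produces.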

  • Publication date: 2010
