
Sequence clustering is important for analyzing software faults, but existing sequence similarity measures are too imprecise to characterize them. In this paper, a sequence clustering algorithm for analyzing software fault features is proposed. First, a new similarity measure for fault sequences and a sequence entropy are defined. Second, the sequence with the smallest entropy is selected as a cluster center, and each unselected sequence is assigned to the cluster whose center it is most similar to. The optimal number of clusters is determined by the average silhouette coefficient. Finally, each sequence to be analyzed is classified into its most similar cluster in order to identify the fault type. Experimental results show that this method achieves lower false positive and false negative rates.
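
The steps summarized above can be sketched roughly as follows. The abstract does not give the paper's definitions of fault-sequence similarity or sequence entropy, so this sketch substitutes two assumptions: Shannon entropy over symbol frequencies, and the matching ratio from Python's `difflib.SequenceMatcher`. The function names (`entropy`, `similarity`, `cluster`, `silhouette`, `choose_k`, `classify`) and the choice of the `k` lowest-entropy sequences as centers are likewise illustrative, not the authors' method.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def entropy(seq):
    """Shannon entropy in bits of symbol frequencies (assumed stand-in)."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def similarity(a, b):
    """Similarity in [0, 1] via difflib's matching ratio (assumed stand-in)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(seqs, k):
    """Use the k lowest-entropy sequences as centers (an assumption), then
    assign every other sequence to its most similar center."""
    centers = sorted(range(len(seqs)), key=lambda i: entropy(seqs[i]))[:k]
    labels = []
    for i, s in enumerate(seqs):
        if i in centers:
            labels.append(centers.index(i))  # a center labels its own cluster
        else:
            labels.append(max(range(k),
                              key=lambda c: similarity(s, seqs[centers[c]])))
    return centers, labels


def silhouette(seqs, labels):
    """Average silhouette coefficient, using distance = 1 - similarity."""
    n = len(seqs)
    dist = [[1.0 - similarity(seqs[i], seqs[j]) for j in range(n)]
            for i in range(n)]
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            continue  # singleton clusters contribute 0 by convention
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def choose_k(seqs):
    """Pick the k in [2, n-1] with the highest average silhouette (needs n >= 3)."""
    return max(range(2, len(seqs)),
               key=lambda k: silhouette(seqs, cluster(seqs, k)[1]))


def classify(seq, seqs, centers):
    """Return the index of the cluster whose center is most similar to seq."""
    return max(range(len(centers)),
               key=lambda c: similarity(seq, seqs[centers[c]]))


if __name__ == "__main__":
    # Hypothetical fault sequences; each character stands for one event code.
    faults = ["aaaa", "aaab", "aaba", "cccc", "cccd", "cdcc"]
    k = choose_k(faults)
    centers, labels = cluster(faults, k)
    print(k, labels, classify("ccdc", faults, centers))
```

Sequences are plain strings here for brevity; lists of hashable event codes work with the same functions. Replacing `entropy` and `similarity` with the paper's own definitions would leave the remaining steps unchanged.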

  • Publication date: 2014
