Abstract

In some pattern recognition problems, the syntactic or structural information that describes each pattern is important. A syntactic pattern can be described using a string grammar. Only a handful of research works involve string grammar clustering. The string grammar hard C-means (sgHCM) is one of the best-known clustering algorithms in syntactic pattern recognition. Since fuzzy clustering has been proved to be better than hard clustering, a string grammar fuzzy C-medians (sgFCMed) algorithm was previously proposed to improve on the sgHCM. However, the sgFCMed may not provide a good clustering result for applications with overlapping data. Thus, in this paper, a string grammar fuzzy-possibilistic C-medians (sgFPCMed) algorithm is introduced to cope with the overlapping data problem. The proposed algorithm is applied to four real overlapping data sets: the MPEG-7 data set, the Copenhagen chromosomes data set, the MNIST database of handwritten digits, and the USPS database of handwritten digits. The sgFPCMed results are compared with those of the sgHCM and the sgFCMed, and the results show that the proposed sgFPCMed is better than both. The sgFPCMed results are also compared, directly and indirectly, with results from other syntactic or numeric methods. The proposed sgFPCMed is better than some of these approaches and comparable to others. However, since the proposed sgFPCMed is a string grammar clustering algorithm, it is easier to transform each prototype string back into the original form of the data.

  • Publication date: 2017-8