Abstract

Several algorithms have been proposed in the literature for extracting local patterns from a large data matrix, a data mining technique known as biclustering. Each biclustering algorithm specialises in extracting a different kind of bicluster: some detect equal biclusters, whereas others identify scaled biclusters (Madeira et al., 2004). For a practical database, the kinds of biclusters present are not known in advance, so it is unclear which biclustering algorithm should be used. In such a scenario, it is important to define metrics that compare the quality of the extracted biclusters and hence the quality of the biclustering algorithms. In this paper, we define two novel measures, the Hausdorff distance between biclusters and the global silhouette index, for estimating the quality of biclusters extracted by existing algorithms. We also combine these metrics with the proportion of enriched biclusters extracted to define an overall index, the Fuzzy Biclustering Index (FBI), for comparing the various algorithms. For a given data set, the higher the FBI, the better the biclustering algorithm.

  • Publication date: 2016
