A new approach for the deep order preserving submatrix problem based on sequential pattern mining

Authors: Xue, Yun*; Li, Tiechen; Liu, Zhiwen; Pang, Chaoyi; Li, Meihang; Liao, Zhengling; Hu, Xiaohui
Source: International Journal of Machine Learning and Cybernetics, 2018, 9(2): 263-279.
DOI: 10.1007/s13042-015-0384-z

Abstract

As an effective biclustering model, the order-preserving submatrix (OPSM) has been widely applied to mining biological gene expression data, where it captures the general tendency of gene expression under some experimental conditions. Recently, biologists have sought deep OPSMs, which have long patterns and comparatively few supporting rows and are important for interpreting gene regulatory networks. However, traditional exact mining algorithms based on the Apriori principle cannot handle the deep OPSM problem: they often require a large minimum support threshold for pattern pruning and therefore inevitably miss some significant deep OPSMs. In this paper, a new exact algorithm is proposed for mining deep OPSMs. First, all common subsequences shared by every pair of rows are found; then, the row sets corresponding to the same common subsequence are formed; finally, all deep OPSMs whose support exceeds the given threshold are obtained. Experiments were conducted on both real and synthetic data sets, and the results show that the new algorithm is capable of mining all deep OPSMs at a small support. Under different minimum support thresholds, the algorithm performs better than traditional sequential pattern mining algorithms.
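
The three steps named in the abstract (pairwise common subsequences, grouping rows by shared subsequence, filtering by support) can be illustrated with a small brute-force sketch. This is not the authors' algorithm, whose details are not given here; it only shows the idea on toy data. The function names, the `min_len` parameter, the inclusive treatment of the support threshold, and the assumption that no row contains tied values are all illustrative choices, and the enumeration is exponential in the number of columns.

```python
from itertools import combinations


def column_order(row):
    """Return the column indices of a row sorted by increasing value.

    Assumption: values within a row are distinct (no ties).
    """
    return tuple(sorted(range(len(row)), key=row.__getitem__))


def common_subsequences(seq_a, seq_b, min_len):
    """Brute-force all subsequences of length >= min_len shared by two column orders."""
    pos_b = {col: i for i, col in enumerate(seq_b)}
    found = set()
    for k in range(min_len, len(seq_a) + 1):
        # combinations() keeps the relative order of seq_a, so each
        # candidate is already a subsequence of seq_a.
        for cand in combinations(seq_a, k):
            idx = [pos_b[c] for c in cand]
            # cand is also a subsequence of seq_b if its positions there increase.
            if all(x < y for x, y in zip(idx, idx[1:])):
                found.add(cand)
    return found


def mine_opsms(matrix, min_support=2, min_len=2):
    """Map each shared column pattern to the set of rows that follow it.

    Hypothetical helper: keeps patterns supported by at least
    min_support rows (threshold treated as inclusive here).
    """
    orders = [column_order(row) for row in matrix]
    support = {}
    # Step 1: common subsequences for every pair of rows.
    for i, j in combinations(range(len(orders)), 2):
        for pattern in common_subsequences(orders[i], orders[j], min_len):
            # Step 2: collect the rows that share the same subsequence.
            support.setdefault(pattern, set()).update((i, j))
    # Step 3: keep patterns with enough supporting rows.
    return {p: rows for p, rows in support.items() if len(rows) >= min_support}


toy = [
    [1, 2, 3, 4],
    [2, 3, 1, 4],
    [4, 3, 2, 1],
]
result = mine_opsms(toy, min_support=2, min_len=3)
```

On this toy matrix, rows 0 and 1 both increase along columns 0, 1, 3, so `result` is `{(0, 1, 3): {0, 1}}`. Because any pattern supported by two or more rows is a common subsequence of every pair of those rows, grouping over all pairs recovers the full supporting row set of each pattern.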

  • Publication date: 2018-2
  • Affiliation: CSIRO