Sparse Biclustering of Transposable Data

Tan Kean Ming*; Witten Daniela M

doi:10.1080/10618600.2013.852554

Abstract

We consider the task of simultaneously clustering the rows and columns of a large transposable data matrix. We assume that the matrix elements are normally distributed with a bicluster-specific mean term and a common variance, and perform biclustering by maximizing the corresponding log-likelihood. We apply an ℓ1 penalty to the means of the biclusters to obtain sparse and interpretable biclusters. Our proposal amounts to a sparse, symmetrized version of k-means clustering. We show that k-means clustering of the rows and of the columns of a data matrix can be seen as special cases of our proposal, and that a relaxation of our proposal yields the singular value decomposition. In addition, we propose a framework for biclustering based on the matrix-variate normal distribution. The performances of our proposals are demonstrated in a simulation study and on a gene expression dataset. This article has supplementary material online.
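To make the idea of a sparse, symmetrized k-means concrete, the following is a minimal sketch and not the authors' implementation. It assumes that maximizing the Gaussian log-likelihood with a common variance reduces to minimizing half the squared error of each element around its bicluster mean, plus lam times the sum of absolute bicluster means. The alternating update scheme, the function names (soft_threshold, bicluster_means, sparse_bicluster), and the parameters K, R, and lam are all illustrative assumptions that the abstract does not specify.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the usual l1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def bicluster_means(X, rows, cols, K, R, lam):
    """Penalized mean of every bicluster for fixed row and column labels.

    For a bicluster with n elements and plain mean m, minimizing
    0.5 * sum((x - mu)^2) + lam * |mu| gives mu = soft_threshold(m, lam / n).
    """
    sums = [[0.0] * R for _ in range(K)]
    counts = [[0] * R for _ in range(K)]
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            sums[rows[i]][cols[j]] += x
            counts[rows[i]][cols[j]] += 1
    mu = [[0.0] * R for _ in range(K)]  # empty biclusters keep mean 0
    for k in range(K):
        for r in range(R):
            n = counts[k][r]
            if n:
                mu[k][r] = soft_threshold(sums[k][r] / n, lam / n)
    return mu


def sparse_bicluster(X, rows, cols, K, R, lam, n_iter=20):
    """Alternate mean updates with row and column reassignment (assumed scheme)."""
    rows, cols = list(rows), list(cols)
    n_rows, n_cols = len(X), len(X[0])
    for _ in range(n_iter):
        # Reassign each row to the row cluster with the smallest squared error.
        mu = bicluster_means(X, rows, cols, K, R, lam)
        new_rows = [
            min(range(K), key=lambda k: sum(
                (X[i][j] - mu[k][cols[j]]) ** 2 for j in range(n_cols)))
            for i in range(n_rows)
        ]
        # Refresh the means, then reassign each column in the same way.
        mu = bicluster_means(X, new_rows, cols, K, R, lam)
        new_cols = [
            min(range(R), key=lambda r: sum(
                (X[i][j] - mu[new_rows[i]][r]) ** 2 for i in range(n_rows)))
            for j in range(n_cols)
        ]
        if new_rows == rows and new_cols == cols:
            break  # labels are stable, so the means will not change either
        rows, cols = new_rows, new_cols
    return rows, cols, bicluster_means(X, rows, cols, K, R, lam)
```

With lam set to 0 the mean update is the ordinary bicluster average, so the procedure behaves like k-means applied jointly to rows and columns. Larger values of lam shrink more bicluster means exactly to zero. Like k-means, a scheme of this kind can stall in a local optimum, so the result depends on the initial labels passed in.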

Publication date: 2014-10-2
