Abstract

This paper presents a method based on projection onto convex sets (POCS) that estimates the unobserved time points in microarray time-series data so that such data can be used for clustering and alignment. Unobserved values arise either from missing values or from uneven sampling rates, and they cannot be estimated accurately by straightforward interpolation because the data are very noisy and have few replicates. Based on the prior knowledge that each gene time series is constrained in both the time and frequency domains, POCS formulates these constraints as multiple convex sets and uses an iterative, convergent procedure to find the optimal value that satisfies all of the constraints. To estimate the unobserved values, we use the cubic spline method to obtain an initial estimate and then use POCS to find the optimal value iteratively. We show that POCS improves the estimation of unobserved time points, achieving a lower normalized root mean squared error than the statistical spline estimation used for the continuous representation of microarray time-series data. In theory, the POCS-based method may improve estimation performance further if more prior knowledge is available.
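
As a rough illustration of the alternating-projection idea described above, the following minimal sketch fills unobserved points by projecting back and forth between two convex sets. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: the function name `pocs_fill`, the `keep_bins` band limit used as the frequency-domain constraint, the use of consistency with the observed samples as the time-domain constraint, the regular time grid, and the linear interpolation that stands in for the cubic spline initial estimate so that the example depends only on NumPy.

```python
import numpy as np


def pocs_fill(y, observed, keep_bins=3, n_iter=200):
    """Fill unobserved samples by alternating projections (illustrative sketch).

    y         : 1-D array on a regular time grid; entries at unobserved
                positions are ignored (they may be NaN).
    observed  : boolean mask, True where y holds a measured value.
    keep_bins : assumed band limit, the number of low-frequency rFFT bins kept.
    n_iter    : number of projection rounds.
    """
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    t = np.arange(y.size)

    # Initial estimate. Assumption: linear interpolation is used here only as
    # a stand-in for the cubic spline initial estimate named in the abstract.
    x = np.interp(t, t[observed], y[observed])

    for _ in range(n_iter):
        # Projection onto the frequency-domain set: signals whose spectrum
        # is zero outside the lowest keep_bins bins.
        spectrum = np.fft.rfft(x)
        spectrum[keep_bins:] = 0.0
        x = np.fft.irfft(spectrum, n=y.size)

        # Projection onto the time-domain set: signals that agree with the
        # measured values at the observed time points.
        x[observed] = y[observed]

    return x


# Hypothetical usage on a synthetic band-limited profile with four gaps.
n = 32
t = np.arange(n)
truth = np.sin(2 * np.pi * t / n) + 0.5 * np.cos(2 * np.pi * 2 * t / n)
observed = np.ones(n, dtype=bool)
observed[[5, 12, 20, 27]] = False
y = np.where(observed, truth, np.nan)
filled = pocs_fill(y, observed, keep_bins=3)
```

Both sets in this sketch are convex (a linear subspace and an affine subspace), so the iterates approach a point in their intersection whenever one exists. Further prior knowledge, such as amplitude bounds, could be added as extra projection steps inside the loop.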