Sparse kernel methods for high-dimensional survival data

Authors: Evers Ludger*; Messow Claudia Martina
Source: Bioinformatics, 2008, 24(14): 1632-1638.
DOI: 10.1093/bioinformatics/btn253

Abstract

Sparse kernel methods such as support vector machines (SVMs) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model depends on the covariates only through inner products, it can be kernelized. The kernelized proportional hazards model, however, yields a dense solution, i.e. one that depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods for obtaining such a sparse solution for survival data. The first is based on a geometric idea where, akin to support vector classification, the margin between the failed observation and the observations currently at risk is maximised. The second obtains a sparse model by adding observations one after another, akin to the Import Vector Machine (IVM). The data examples studied suggest that both methods can outperform competing approaches.
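The kernelization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual expansion f(x_j) = sum_i alpha_i k(x_i, x_j), Breslow-style handling of tied times, and made-up toy data, and the names `rbf_kernel`, `neg_log_partial_likelihood`, `alpha` and `gamma` are chosen here purely for illustration. With a linear kernel the objective reduces to the ordinary Cox partial likelihood with beta = X^T alpha.

```python
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    sq_dist = (X ** 2).sum(axis=1)[:, None] + (Z ** 2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def neg_log_partial_likelihood(alpha, K, time, event):
    """Negative Cox log partial likelihood with linear predictor f = K @ alpha.

    Assumption: ties are handled Breslow-style, i.e. every subject j with
    time[j] >= time[i] counts as at risk when subject i fails.
    """
    f = K @ alpha
    at_risk = time[None, :] >= time[:, None]   # at_risk[i, j]: j is at risk when i fails
    shift = f.max()                            # stabilises the log-sum-exp
    log_denom = shift + np.log((at_risk * np.exp(f - shift)[None, :]).sum(axis=1))
    # Only observed failures (event == 1) contribute a term.
    return -np.sum(event * (f - log_denom))


# Toy data, invented for illustration: 4 subjects, 2 covariates.
X = np.array([[0.5, 1.0], [1.5, -0.2], [-0.3, 0.8], [2.0, 0.1]])
time = np.array([2.0, 5.0, 3.0, 7.0])
event = np.array([1, 0, 1, 1])                 # 1 = failure observed, 0 = censored

K_lin = X @ X.T                                # linear kernel: ordinary Cox model
K_rbf = rbf_kernel(X, X, gamma=0.5)            # non-linear kernel
alpha = np.array([0.1, -0.2, 0.05, 0.3])       # one coefficient per observation

print(neg_log_partial_likelihood(alpha, K_lin, time, event))
print(neg_log_partial_likelihood(alpha, K_rbf, time, event))
```

Because `alpha` carries one coefficient per training observation, a minimiser of this objective will generally have all entries non-zero, which is the density the abstract refers to; the two proposed methods aim to keep only a small subset of observations in the expansion.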