A general, prediction error-based criterion for selecting model complexity for high-dimensional survival models

Porzelius, Christine*; Schumacher, Martin; Binder, Harald

doi:10.1002/sim.3765

Abstract

When fitting predictive survival models to high-dimensional data, an adequate criterion for selecting model complexity is needed to avoid overfitting. The complexity parameter is typically selected by the predictive partial log-likelihood (PLL) estimated via cross-validation. As an alternative criterion, we propose a relative version of the integrated prediction error curve (IPEC), which can be stably estimated via bootstrap resampling. The IPEC has the advantage of being applicable for models and fitting techniques where the PLL is not available. To investigate the performance of this new criterion, a simulation study is carried out, mimicking microarray survival data. Additionally, model selection by predictive PLL, estimated via bootstrap resampling instead of cross-validation, is examined. It is seen that this mostly results in similar prediction performance of the selected models, compared to estimates based on cross-validation. Model selection by bootstrap estimates of the IPEC performs about as well as selection by cross-validation estimates of the PLL. Therefore, it is expected to be a reasonable alternative in cases where there is no PLL. Similar results are seen in the analysis of a microarray survival data set from patients with diffuse large-B-cell lymphoma.
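The selection idea in the abstract can be illustrated with a minimal sketch. It makes several simplifying assumptions that do not come from the paper: survival times are uncensored (real data would need censoring weights in the prediction error curve), the plain IPEC is used rather than the relative version, and the bootstrap estimate is a simple out-of-bag average. The names `ipec`, `select_complexity`, and the `fit` callable are hypothetical.

```python
import numpy as np


def ipec(times, surv_pred, grid):
    """Integrated prediction error curve, assuming uncensored survival times.

    times:     array of shape (n,) with observed event times
    surv_pred: array of shape (n, len(grid)) with predicted S(t | x_i)
    grid:      increasing array of evaluation time points
    """
    # Observed status at each grid time: 1 if the subject is still event-free.
    status = (times[:, None] > grid[None, :]).astype(float)
    # Prediction error curve: mean squared difference (Brier score) per time point.
    pec = ((status - surv_pred) ** 2).mean(axis=0)
    # Integrate over time with the trapezoidal rule.
    return float(np.sum((pec[1:] + pec[:-1]) / 2.0 * np.diff(grid)))


def select_complexity(fit, X, times, complexities, grid, n_boot=50, seed=0):
    """Pick the complexity value with the smallest out-of-bag bootstrap IPEC.

    fit(X_train, times_train, c) is a hypothetical user-supplied callable that
    returns predict(X_new, grid) -> array of shape (len(X_new), len(grid)).
    """
    rng = np.random.default_rng(seed)
    n = len(times)
    total = np.zeros(len(complexities))
    used = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # bootstrap sample with replacement
        oob = np.setdiff1d(np.arange(n), idx)     # observations left out of the sample
        if oob.size == 0:
            continue
        for k, c in enumerate(complexities):
            predict = fit(X[idx], times[idx], c)
            total[k] += ipec(times[oob], predict(X[oob], grid), grid)
        used += 1
    mean_err = total / used
    return complexities[int(np.argmin(mean_err))], mean_err
```

Because the criterion only needs predicted survival probabilities, the same loop works for any fitting technique that can produce them, which is the situation the abstract has in mind when no partial log-likelihood is available.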

Publication date: 2010-04
Institution: Hebei Medical University
