Sparse partial least-squares regression and its applications to high-throughput data analysis

Authors: Lee Donghwan; Lee Woojoo; Lee Youngjo; Pawitan Yudi*
Source: Chemometrics and Intelligent Laboratory Systems, 2011, 109(1): 1-8.
DOI: 10.1016/j.chemolab.2011.07.002

Abstract

The partial least-squares (PLS) method is designed for prediction problems where the number of predictors is larger than the number of training samples. PLS is based on latent components that are linear combinations of all of the original predictors, so it automatically employs all predictors regardless of their relevance. This not only potentially compromises its performance but also makes the results difficult to interpret. In this paper, we propose a new formulation of the sparse PLS (SPLS) procedure to allow both sparse variable selection and dimension reduction. We use the standard L1-penalty and the unbounded penalty of [1]. We develop a computing algorithm for SPLS by modifying the nonlinear iterative partial least-squares (NIPALS) algorithm, and illustrate the method with an analysis of a cancer dataset. Through numerical studies, we find that our SPLS method generally performs better than standard PLS and other existing methods in variable selection and prediction.
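
To make the idea of a sparse latent component concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a single response and obtains sparsity by soft-thresholding the ordinary PLS weight vector, which is the usual way an L1-penalty acts on a direction vector. The function names `soft_threshold` and `sparse_pls_component` and the tuning parameter `lam` are hypothetical, introduced only for illustration; the paper's own formulation, its NIPALS modification, and the unbounded penalty of [1] are not reproduced here.

```python
import numpy as np


def soft_threshold(z, lam):
    """Elementwise soft-thresholding (proximal operator of the L1 penalty)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_pls_component(X, y, lam):
    """Return a sparse weight vector w and latent score t = Xc @ w.

    Hypothetical helper: lam in [0, 1) is expressed as a fraction of the
    largest absolute predictor-response covariance, so lam = 0 gives the
    ordinary (dense) first PLS weight vector.
    """
    Xc = X - X.mean(axis=0)           # centre predictors
    yc = y - y.mean()                 # centre response
    z = Xc.T @ yc                     # unnormalised PLS weights (covariances)
    w = soft_threshold(z, lam * np.max(np.abs(z)))
    norm = np.linalg.norm(w)
    if norm > 0:                      # guard against an all-zero direction
        w = w / norm
    return w, Xc @ w


# Small deterministic demo: the second predictor is only weakly related to y.
X_demo = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 0.1], [0.0, -0.1]])
y_demo = np.array([1.0, -1.0, 1.0, -1.0])
w_dense, _ = sparse_pls_component(X_demo, y_demo, lam=0.0)
w_sparse, t_sparse = sparse_pls_component(X_demo, y_demo, lam=0.5)
print("dense weights: ", w_dense)
print("sparse weights:", w_sparse)
```

With `lam = 0` every predictor receives a nonzero weight, which is the behaviour of standard PLS described in the abstract; increasing `lam` sets the weights of weakly related predictors exactly to zero, so the latent component depends on fewer variables.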

  • Publication date: 2011-11-15