Abstract

A new partial least squares (PLS) weighting Gaussian process (PWGP) algorithm is proposed to improve the regression performance of the Gaussian process (GP), an outstanding kernel-based machine learning method, on high-dimensional data with small sample sizes, especially near-infrared (NIR) spectroscopy data. Importance indexes of the original variables are first calculated according to their contributions to the PLS regression model. The observations are then weighted by these importance indexes, and the weighted values are passed to the GP algorithm for further regression analysis. With this PLS-based weighting technique, important variables are highlighted by their relatively large index values, and consequently the "information saturation" phenomenon can be overcome. Most importantly, unlike other weighting methods, PWGP requires no prior knowledge to optimize any factors or parameters, so it is especially suitable for regression problems involving "black-box" systems. Applications of the proposed method to three NIR spectroscopy datasets that are widely used as test data confirmed that the predictive performance of PWGP is superior to that of other approaches.
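The weight-then-regress idea described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the index is derived from the PLS model, so this sketch assumes the squared first-component PLS1 weight as a stand-in index. The GP part assumes a fixed RBF kernel with hand-picked `length_scale` and `noise` rather than hyperparameters fitted to the data. All function names here are hypothetical.

```python
import math

def first_pls_weights(X, y):
    """First-component PLS1 weight vector, w proportional to Xc^T yc, unit length."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("X and y are uncorrelated; weights are undefined")
    return [v / norm for v in w]

def importance_index(w):
    """Assumed index: squared weight, so the indexes sum to 1."""
    return [v * v for v in w]

def apply_weights(X, index):
    """Scale every variable (column) by its index."""
    return [[x * s for x, s in zip(row, index)] for row in X]

def rbf(a, b, length_scale):
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-0.5 * d2 / length_scale ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-6):
    """GP posterior mean with an RBF kernel and a constant (sample-mean) prior mean."""
    y_mean = sum(y_train) / len(y_train)
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    alpha = solve(K, [v - y_mean for v in y_train])
    return [y_mean + sum(rbf(x, b, length_scale) * a for b, a in zip(X_train, alpha))
            for x in X_test]

# Toy data (made up for illustration): y depends on the first variable only.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [0.0, 1.0, 2.0, 3.0]

index = importance_index(first_pls_weights(X, y))
X_weighted = apply_weights(X, index)         # the same index must be applied to new samples
preds = gp_predict(X_weighted, y, X_weighted)
print(index, preds)
```

In this toy case the informative first variable receives almost all of the weight, so the distances seen by the GP kernel are dominated by it. A fuller treatment would use all PLS components and estimate the GP hyperparameters from the data.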
