Abstract

Manifold learning is a relatively new class of algorithms originating in machine learning that estimate the intrinsic dimensionality of large, complex data sets and extract the most important information from the raw data for building regression or classification models. Its basic assumption is that high-dimensional data measured from the same object with a given device reside on a manifold of much lower dimension, determined by a few properties of that object. Because NIR spectra are high-dimensional and their band assignments are complicated, the authors assume that the NIR spectra of the same kind of substance at different chemical concentrations reside on a low-dimensional manifold determined by those concentrations. Locally linear embedding (LLE), one of the best-known manifold learning algorithms, further assumes that the underlying manifold is locally linear, so that every data point on the manifold can be expressed as a linear combination of its neighbors. Based on these assumptions, the present paper proposes a new algorithm named least square locally weighted regression (LS-LWR), a form of LWR in which the weights are determined by least squares rather than by a predefined function. The NIR spectra of glucose solutions of various concentrations were then measured with an NIR spectrometer, and LS-LWR was verified by quantitatively predicting the glucose concentrations. Compared with existing algorithms such as principal component regression (PCR) and partial least squares regression (PLSR), LS-LWR shows better predictive performance, as measured by the standard error of prediction (SEP), and yields an elegant model with good stability and efficiency.
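
The idea of least-squares-determined weights can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the details of LS-LWR, so the sketch assumes one plausible reading, in which a new spectrum is reconstructed from its k nearest calibration spectra with LLE-style weights (least squares, constrained to sum to one) and the same weights are applied to the neighbors' known concentrations. The function names, the neighborhood size `k`, the regularization term `reg`, and the synthetic "spectra" are all hypothetical and serve only as an illustration.

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Least-squares weights (summing to 1) that reconstruct x from its neighbors."""
    Z = neighbors - x                    # shift neighbors so that x is the origin
    G = Z @ Z.T                          # local Gram matrix, shape (k, k)
    # Assumed regularization: G can be singular, e.g. when neighbors are collinear
    trace = np.trace(G)
    G = G + reg * (trace if trace > 0 else 1.0) * np.eye(len(G))
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                   # enforce the sum-to-one constraint

def predict(x, X_cal, y_cal, k=3, reg=1e-3):
    """Predict the property of x as the weighted sum of its neighbors' values."""
    dist = np.linalg.norm(X_cal - x, axis=1)
    idx = np.argsort(dist)[:k]           # indices of the k nearest calibration spectra
    w = lle_weights(x, X_cal[idx], reg)
    return float(w @ y_cal[idx])

# Synthetic data (not measured spectra): each spectrum is linear in concentration
rng = np.random.default_rng(0)
s = rng.random(50)                       # hypothetical pure-component spectrum
conc = np.arange(1.0, 7.0)               # calibration concentrations 1..6
X_cal = np.outer(conc, s) + 0.1          # constant baseline offset added
x_new = 2.4 * s + 0.1                    # unknown sample, true concentration 2.4

print(predict(x_new, X_cal, conc, k=3))
```

On this idealized data the prediction should lie close to the true value of 2.4; the small regularization term pulls the weights slightly toward uniform, so the match is not exact. Real NIR spectra would additionally call for preprocessing and a validated choice of `k`, neither of which the abstract specifies.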