A non-equidistant wavenumber interval selection approach for classifying diesel/biodiesel samples

Authors: Soares Felipe*; Anzanello Michel J; Marcelo Marcelo C A; Ferrao Marco F
Source: Chemometrics and Intelligent Laboratory Systems, 2017, 167: 171-178.
DOI: 10.1016/j.chemolab.2017.06.005

Abstract

In recent years, spectroscopy techniques such as near-infrared (NIR) and Fourier transform infrared (FTIR) have been widely adopted as analytical tools in different fields and for several purposes. NIR and FTIR data typically comprise hundreds or even thousands of highly correlated wavenumbers, a fact that can jeopardize the accuracy of several statistical techniques. In light of that, wavenumber selection emerges as an important step in prediction and classification tasks based on spectroscopy data. This paper proposes a novel framework for wavenumber selection aimed at classifying samples into proper categories, and applies it to two data sets from the petroleum sector. The method relies on two main stages: determination of intervals based on the distance between the average spectra of the classes, and selection of the most suitable intervals through cross-validation. For a NIR spectra data set of diesel, the proposed method reduced the misclassification rate from 13.90% to 11.63% while retaining 23.19% of the original wavenumbers. For the biodiesel FTIR data set, the method yielded a misclassification rate of 1.21% while retaining 4.95% of the original variables, compared with 4.71% when all wavenumbers were used. The proposed method also outperformed traditional approaches for wavenumber selection.
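
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the interval rule, the classifier, or the search strategy, so the choices below are assumptions made only for illustration. Intervals are cut wherever the gap between class mean spectra crosses a quantile threshold (which yields intervals of unequal width), the classifier is a nearest-centroid rule, and intervals are added greedily while the k-fold cross-validated misclassification rate improves. The names `make_intervals`, `cv_error`, and `select_intervals` are hypothetical.

```python
import numpy as np

def make_intervals(X, y, quantile=0.5):
    """Stage 1 (assumed rule): cut the wavenumber axis into contiguous runs
    wherever the between-class distance crosses a quantile threshold."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Largest absolute gap between any two class mean spectra, per wavenumber.
    dist = means.max(axis=0) - means.min(axis=0)
    above = dist > np.quantile(dist, quantile)
    # Positions where the above/below status changes start a new interval.
    cuts = np.flatnonzero(np.diff(above.astype(int))) + 1
    edges = np.concatenate(([0], cuts, [X.shape[1]]))
    return [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:])]

def cv_error(X, y, cols, k=5, seed=0):
    """k-fold misclassification rate of a nearest-centroid classifier
    (assumed classifier) restricted to the columns in `cols`."""
    idx = np.random.default_rng(seed).permutation(len(y))
    wrong = 0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        Xtr, ytr = X[train][:, cols], y[train]
        classes = np.unique(ytr)
        cents = np.array([Xtr[ytr == c].mean(axis=0) for c in classes])
        # Squared Euclidean distance from each held-out sample to each centroid.
        d = ((X[fold][:, cols][:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        wrong += int((classes[d.argmin(axis=1)] != y[fold]).sum())
    return wrong / len(y)

def select_intervals(X, y, intervals, k=5):
    """Stage 2 (assumed search): greedy forward selection of intervals,
    stopping when the cross-validated error no longer decreases."""
    chosen, best, remaining = [], np.inf, list(intervals)
    while remaining:
        scores = [
            cv_error(X, y, np.concatenate([np.arange(a, b) for a, b in chosen + [iv]]), k)
            for iv in remaining
        ]
        j = int(np.argmin(scores))
        if scores[j] >= best:
            break
        best = scores[j]
        chosen.append(remaining.pop(j))
    return chosen, best
```

Here `X` is a samples-by-wavenumbers array and `y` holds the class labels. Calling `select_intervals(X, y, make_intervals(X, y))` returns the retained intervals as half-open column ranges together with their cross-validated misclassification rate, which can then be compared with `cv_error` on all columns.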