Abstract

Robust classification methods are vital to the successful implementation of many material characterization techniques, particularly where large databases exist. In this paper, we demonstrate an extremely fast classification method for identifying mineral mixtures in Raman spectroscopy using the large RRUFF database. The method is equally applicable to other techniques that meet the large-database criterion, including laser-induced breakdown, X-ray diffraction, and mass spectroscopy methods. Classifying these multivariate datasets can be challenging, due in part to the obscuring features inherently present in the underlying data and in part to the volume and variety of information known a priori. Specific challenges include mixtures with overlapping spectral features, large databases (i.e., the number of predictors far exceeds the number of observations), databases that contain groups of correlated spectra, and the ever-present contaminants of noise, undesired background, and spectrometer artifacts. Although many existing classification algorithms address these problems individually, few address them as a whole. Here, we apply a multistage approach, which leverages well-established constrained regression techniques, to overcome these challenges. Our modifications to conventional algorithm implementations are shown to increase the speed and performance of the classification process. Unlike many other techniques, our method rapidly classifies mixtures while preserving sparsity. It is easily implemented, has very few tuning parameters, does not require extensive parameter training, and does not require dimensionality reduction of the data prior to classification.
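To make the idea of identifying a mixture by constrained regression concrete, the following is a minimal sketch, not the multistage method of the paper. It assumes that the constraint is non-negativity of the mixing coefficients and solves the resulting least-squares problem by projected gradient descent. The wavenumber axis, the four library entries (`mineral_A` to `mineral_D`), their peak positions, and the 0.05 reporting threshold are all hypothetical and are not taken from RRUFF.

```python
import math

# Hypothetical wavenumber axis (cm^-1); not taken from any real instrument.
AXIS = [100 + 2 * i for i in range(500)]


def synthetic_spectrum(peak_centers, width=15.0):
    """Build a toy spectrum as a sum of unit-height Gaussian peaks."""
    return [
        sum(math.exp(-0.5 * ((x - c) / width) ** 2) for c in peak_centers)
        for x in AXIS
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nonnegative_least_squares(library, observed, iterations=500):
    """Minimise 0.5 * ||sum_j c_j * library[j] - observed||^2 with c_j >= 0.

    Plain projected gradient descent; an illustrative stand-in for the
    constrained regression step, not the authors' algorithm.
    """
    n = len(library)
    gram = [[dot(library[i], library[j]) for j in range(n)] for i in range(n)]
    rhs = [dot(spectrum, observed) for spectrum in library]
    # The largest absolute row sum bounds the largest eigenvalue of the Gram
    # matrix, so its inverse is a safe step size.
    step = 1.0 / max(sum(abs(v) for v in row) for row in gram)
    coeffs = [0.0] * n
    for _ in range(iterations):
        grad = [dot(gram[i], coeffs) - rhs[i] for i in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        coeffs = [max(0.0, c - step * g) for c, g in zip(coeffs, grad)]
    return coeffs


# Hypothetical reference library; mineral_A and mineral_C overlap near 460-480.
LIBRARY = {
    "mineral_A": synthetic_spectrum([200, 460]),
    "mineral_B": synthetic_spectrum([280, 710]),
    "mineral_C": synthetic_spectrum([480, 900]),
    "mineral_D": synthetic_spectrum([350, 1000]),
}

names = list(LIBRARY)
spectra = [LIBRARY[name] for name in names]

# Noise-free synthetic mixture: 70% mineral_A plus 30% mineral_C.
observed = [
    0.7 * a + 0.3 * c
    for a, c in zip(LIBRARY["mineral_A"], LIBRARY["mineral_C"])
]

coeffs = nonnegative_least_squares(spectra, observed)

# Report only components whose coefficient exceeds an arbitrary threshold.
identified = [name for name, c in zip(names, coeffs) if c > 0.05]

for name, c in zip(names, coeffs):
    print(f"{name}: {c:.3f}")
print("identified:", identified)
```

In this toy case the library entries absent from the mixture receive coefficients at or near zero, which is the sparsity that the abstract refers to. A real database has far more entries than spectral channels, contains correlated spectra, and comes with noise and background, which is why the paper adopts a multistage approach rather than a single regression of this kind.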

  • Publication date: 2015-8