Abstract

The mathematical modeling of classifiers has been intensively investigated in pattern recognition for decades. The maximin classifier, which optimizes the decision boundary with respect to the data point(s) perpendicularly closest to it, has been widely used. However, this approach may perform poorly when the boundary data point(s) are significantly affected by noise. This paper presents a new Linear Max K-min (LMKM) classifier for 2-class classification problems, which offers a general classification solution by considering the K closest data points (K ≥ 1). In other words, for any given dataset, the algorithm can use the most appropriate number K of boundary points rather than only the closest one(s). To tackle the high computational complexity when K or N is relatively large, we propose a new method that transforms the original objective function into a linear programming problem with 2N constraints, which can be solved efficiently (where N denotes the number of training samples and K ≤ N). Experiments show that the proposed algorithm consistently offers high-quality classification results across 18 publicly available 2-class classification datasets and outperforms Linear Support Vector Machine (SVM) and Logistic Regression (LR) methods.
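
The abstract does not spell out the linear programming reformulation, so the following is only a minimal sketch of one standard identity that such a reformulation could plausibly rest on; treating it as the paper's exact formulation is an assumption. The sum of the K smallest values m_1, ..., m_N equals the maximum over t of K·t − Σ max(0, t − m_i). Replacing each max(0, t − m_i) with a slack variable u_i gives N constraints u_i ≥ t − m_i and N constraints u_i ≥ 0, which is consistent with the 2N constraints mentioned above. The function names and the sample margins are hypothetical.

```python
def sum_k_smallest(margins, k):
    """Direct definition: sort the values and add the k smallest."""
    return sum(sorted(margins)[:k])


def sum_k_smallest_lp_form(margins, k):
    """Same quantity written as max over t of k*t - sum(max(0, t - m_i)).

    The objective is concave and piecewise linear in t, so its maximum is
    attained at one of the margin values; we simply evaluate it at each.
    """
    def objective(t):
        return k * t - sum(max(0.0, t - m) for m in margins)

    return max(objective(t) for t in margins)


# Hypothetical signed margins of five training points (assumed data).
margins = [1.0, -0.5, 1.5, 0.25, 2.0]
k = 2

direct = sum_k_smallest(margins, k)
via_lp = sum_k_smallest_lp_form(margins, k)
print(direct, via_lp)
```

Both functions return the same value for this input. In an actual linear program, t and the slack variables would be optimized jointly with the classifier parameters, with the margins linear in those parameters, rather than enumerated as done here.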

Full text