Abstract

Linear discriminant analysis models that minimize misclassification cost have recently gained popularity. The misclassification-cost-minimizing linear discriminant analysis problem is well known to be NP-complete and is difficult to solve to optimality for large-scale datasets. As a result, heuristic techniques have gained popularity, but it is difficult to assess how well they perform. One way to aid this assessment is to establish a lower bound on the optimal misclassification cost. In this paper, we propose and use a hybrid particle swarm optimization (PSO) and Lagrangian relaxation (LR) based heuristic to establish a misclassification cost lower bound (MCLB) for two-group linear classifiers. We use the subgradient optimization procedure to tighten the MCLB. Using simulated and real-world datasets, we test a misclassification-cost-minimizing linear genetic algorithm classifier and two commercial non-linear classifiers (C5.0 and C&RT) and compare their performance with the MCLB. Our holdout sample tests indicate that the proposed MCLB works well for both linear and non-linear classifiers when class data distributions are normal. Additionally, as misclassification cost asymmetry increases, the proposed MCLB appears to provide better results.

  • Publication date: 2014-3
