A Geometric Approach to Train SVM on Very Large Data Sets

Authors: Zeng Zhi Qiang*; Xu Hua Rong; Xie Yan Qi; Gao Ji
Source: 3rd International Conference on Intelligent System and Knowledge Engineering, 2008-11-17 to 2008-11-19.

Abstract

The reduced set method is an important approach to speeding up support vector machine (SVM) training on large data sets. Existing work has mainly focused on selecting patterns near the decision boundary for SVM training, using techniques such as clustering and nearest-neighbor algorithms. On very large data sets, however, these algorithms incur a large computational overhead, so the total running time remains enormous. This paper develops an intuitive geometric method that selects convex hull samples in the feature space for SVM training, with time complexity linear in the training set size n. Experiments on real data sets show that the proposed method not only preserves the generalization performance of the resulting SVM classifiers but also outperforms existing scale-up methods in training time and number of support vectors.
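
The idea of keeping only convex hull samples per class can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes 2-D points in the input space rather than a kernel feature space, and it uses the standard monotone-chain hull, which runs in O(n log n) rather than the linear time stated in the abstract. The names `convex_hull` and `reduce_training_set` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Monotone-chain convex hull in 2-D (illustrative stand-in only).

    Returns hull vertices in counter-clockwise order. Interior points and
    points lying on a hull edge between two vertices are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def reduce_training_set(
    samples: Sequence[Tuple[Point, Hashable]],
) -> Dict[Hashable, List[Point]]:
    """Group samples by class label and keep only each class's hull vertices."""
    by_class: Dict[Hashable, List[Point]] = {}
    for point, label in samples:
        by_class.setdefault(label, []).append(point)
    return {label: convex_hull(pts) for label, pts in by_class.items()}


if __name__ == "__main__":
    # Toy data: each class is a square with extra interior or edge points.
    data = [
        ((0.0, 0.0), +1), ((2.0, 0.0), +1), ((2.0, 2.0), +1), ((0.0, 2.0), +1),
        ((1.0, 1.0), +1), ((1.0, 0.0), +1),
        ((5.0, 5.0), -1), ((7.0, 5.0), -1), ((7.0, 7.0), -1), ((5.0, 7.0), -1),
        ((6.0, 6.0), -1),
    ]
    reduced = reduce_training_set(data)
    for label, hull in reduced.items():
        print(label, hull)
```

The reduced per-class point sets would then be passed to any SVM trainer in place of the full training set. For linearly separable classes the maximum-margin hyperplane is determined by the convex hulls of the two classes, which is the intuition behind discarding interior points.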