Abstract

In this work we consider the problem of training a linear classifier when the number of training examples is huge (in particular, the data may exceed memory capacity). We propose a linear least-squares formulation of the problem and an incremental recursive algorithm that only requires storing a square matrix whose dimension equals the number of features. The algorithm, which is very simple to implement, converges to the solution after a single pass over the training data, so it effectively handles memory constraints and is a viable method for large-scale linear classification and for real-time applications, provided the number of features is not too large (say, of the order of thousands). Extensive computational experiments show that the proposed algorithm is at least competitive with state-of-the-art algorithms for large-scale linear classification.
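To make the idea concrete, the following is a minimal sketch of an incremental recursive least-squares (RLS) update of the kind the abstract describes: each training sample is used once, and the only state kept in memory is the weight vector and a d×d matrix P (the inverse of the regularized Gram matrix), updated via the Sherman-Morrison formula. This is an illustrative implementation under those standard assumptions, not the authors' exact algorithm.

```python
def rls_init(d, lam=1e-6):
    """Start with w = 0 and P = (lam * I)^{-1} = I / lam (ridge regularization)."""
    w = [0.0] * d
    P = [[(1.0 / lam) if i == j else 0.0 for j in range(d)] for i in range(d)]
    return w, P

def rls_step(w, P, x, y):
    """Rank-one (Sherman-Morrison) update after observing one sample (x, y)."""
    d = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]  # P x
    denom = 1.0 + sum(x[i] * Px[i] for i in range(d))               # 1 + x' P x
    k = [Px[i] / denom for i in range(d)]                           # gain vector
    err = y - sum(w[i] * x[i] for i in range(d))                    # prediction error
    for i in range(d):
        w[i] += k[i] * err
        for j in range(d):
            P[i][j] -= k[i] * Px[j]
    return w, P

# Usage: stream noise-free samples with targets y = 3*x1 - 2*x2;
# a single pass recovers the least-squares weights.
w, P = rls_init(2)
data = [((float(a), float(b)), 3.0 * a - 2.0 * b)
        for a in range(-3, 4) for b in range(-3, 4)]
for x, y in data:
    w, P = rls_step(w, P, x, y)
print(w)  # close to [3.0, -2.0]
```

For classification, the targets y would be the class labels (e.g. ±1), as in standard least-squares classification; the per-sample cost is O(d²) time and memory, independent of the number of examples, which is what makes a one-pass scheme like this attractive when the data do not fit in memory.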

  • Publication date: 2013-02-01