Abstract

We propose a new dimensionality reduction method, maximum margin projection (MMP), which projects data samples into the most discriminative subspace, where clusters are best separated. Specifically, MMP projects input patterns onto the normals of the maximum margin separating hyperplanes. As a result, MMP depends only on the geometry of the optimal decision boundary and not on the distribution of data points lying farther from this boundary. Technically, MMP is formulated as an integer programming problem, and we propose a column generation algorithm to solve it. Moreover, through a combination of theoretical results and empirical observations, we show that the computation time needed for MMP can be treated as linear in the dataset size. Experimental results on both toy and real-world datasets demonstrate the effectiveness of MMP.

Full text