Abstract

Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) have been used extensively due to their generalization properties. However, SVM training is computationally intensive, especially when the training dataset is large. This paper presents MRSMO, a MapReduce-based distributed SVM algorithm for automatic image annotation. The performance of the MRSMO algorithm is evaluated in an experimental environment. By partitioning the training dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computers, the MRSMO algorithm significantly reduces training time while maintaining a high level of accuracy in both binary and multiclass classification.
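
The partition-then-optimize idea can be illustrated with a minimal sketch. It is not the MRSMO implementation. It rests on several assumptions: the per-partition solver is a Pegasos-style subgradient method for a linear SVM without a bias term (a stand-in for SMO), the combining step simply averages the local weight vectors (the abstract does not say how partial results are merged), the data are synthetic, and the built-in `map` runs sequentially in place of a cluster. All function names are hypothetical.

```python
import random

def make_data(n, seed=0):
    """Synthetic, linearly separable 2-D points with labels +1 / -1 (assumption)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        x = [2.0 * y + rng.uniform(-1, 1), 2.0 * y + rng.uniform(-1, 1)]
        data.append((x, y))
    return data

def partition(data, k):
    """Split the training set into k contiguous subsets."""
    size = (len(data) + k - 1) // k
    return [data[i:i + size] for i in range(0, len(data), size)]

def train_partition(samples, epochs=20, lam=0.01):
    """Map step: Pegasos-style linear SVM on one subset (stand-in for SMO)."""
    w = [0.0] * len(samples[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def combine(weights):
    """Reduce step: average the local weight vectors (an assumption)."""
    k = len(weights)
    return [sum(col) / k for col in zip(*weights)]

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def accuracy(w, data):
    return sum(predict(w, x) == y for x, y in data) / len(data)

data = make_data(200)
parts = partition(data, 4)
# On a real cluster each call below would run on a separate node.
local_models = list(map(train_partition, parts))
model = combine(local_models)
print("training accuracy:", accuracy(model, data))
```

Each subset is optimized independently, so the map step parallelizes without communication between workers; only the small per-partition models are exchanged in the reduce step.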

  • Publication date: 2011-10