Abstract
Solid protocols for benchmarking local feature detectors and descriptors were introduced by Mikolajczyk et al. [1,2]. Detectors and descriptors are popular tools in object class matching, but the wide-baseline setting of those benchmarks does not correspond to class-level matching, where appearance variation can be large. We extend the benchmarks to the class matching setting and evaluate state-of-the-art detectors and descriptors on Caltech and ImageNet classes. Our experiments provide important findings with regard to object class matching: (1) the original SIFT is still the best descriptor; (2) dense sampling outperforms interest point detectors by a clear margin; (3) detectors perform moderately well, but descriptors' performance collapses; (4) using multiple, even just a few, best matches instead of the single best match has a significant effect on performance; (5) object pose variation degrades dense sampling performance, while the best detector (Hessian-affine) is unaffected. The performance of the best detector-descriptor pair is verified in the application of unsupervised visual class alignment, where state-of-the-art results are achieved. The findings help to improve existing detectors and descriptors, for which the framework provides an automatic validation tool.
- Publication date: 2016-4-5