A quantitative evaluation of the conceptual consistency of visual words and visual vocabularies

Stommel M*; Herzog O; Xu W L

doi:10.1016/j.jvcir.2014.11.015

Abstract

Codebooks are a widely accepted technique to recognise objects by sets of local features. The method has been applied to many classes of objects, even very abstract ones. But although state-of-the-art recognition rates have been reported, the method is still far from being reliable in any sense that is related to human vision. The literature on this topic emphasises detailed descriptions of statistical estimators over a basic analysis of the data. A deeper understanding of the data is, however, needed for the field to develop further. In this paper, we therefore present a set of quantitative experiments on codebooks of the popular SIFT descriptors. The results discourage the use of illustrative but overly simplifying descriptions of the visual words approach. In particular, it is demonstrated that (1) there are more visually distinct patterns than can be listed in a codebook, (2) one element of a codebook represents a set of many visually distinct patterns, and (3) there are no single, selective SIFT descriptors to serve as codebook elements. This makes us wonder why the method works at all. We discuss several options.

Publication date: 2015-4


