A Reconfigurable Tangram Model for Scene Representation and Categorization

Zhu, Jun; Wu, Tianfu*; Zhu, Song-Chun; Yang, Xiaokang; Zhang, Wenjun

doi:10.1109/TIP.2015.2498407

Abstract

This paper presents a hierarchical and compositional representation of scene layout (i.e., spatial configuration) and a method of learning a reconfigurable model for scene categorization. Three types of shape primitives (i.e., triangle, parallelogram, and trapezoid), called tans, are used to tile the scene image lattice in a hierarchical and compositional way, and a directed acyclic AND-OR graph (AOG) is proposed to organize the overcomplete dictionary of tan instances placed in the image lattice, exploring a very large number of scene layouts. With certain off-the-shelf appearance features used for grounding the terminal nodes (i.e., tan instances) in the AOG, a scene layout is represented by the globally optimal parse tree learned from the AOG via a dynamic programming algorithm, which we call the tangram model. A scene category is then represented by a mixture of tangram models discovered with an exemplar-based clustering method. On the basis of the tangram model, we address scene categorization in two ways: 1) building a tangram bank representation for linear classifiers, which uses a collection of tangram models learned from all categories, and 2) building a tangram matching kernel for kernel-based classification, which accounts for all hidden spatial configurations in the AOG. In experiments, our methods are evaluated on three scene data sets for both configuration-level and semantic-level scene categorization, and they consistently outperform the spatial pyramid model.
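The idea of extracting a globally optimal parse tree from an AND-OR graph by dynamic programming can be illustrated with a small sketch. It is a simplified toy and not the paper's algorithm: it assumes axis-aligned rectangles in place of the triangle, parallelogram, and trapezoid tans, and a hypothetical `scores` table in place of learned appearance scores. The function name `best_parse` and the tuple-based tree encoding are likewise illustrative assumptions.

```python
from functools import lru_cache


def best_parse(scores, rows, cols):
    """Return (score, tree) for the best layout of a rows x cols grid.

    scores: hypothetical dict mapping a half-open rectangle
            (r0, c0, r1, c1) to its terminal-node score. Rectangles
            that are absent cannot be used as terminal nodes.
    """

    @lru_cache(maxsize=None)
    def solve(r0, c0, r1, c1):
        region = (r0, c0, r1, c1)
        # OR-node: choose the best among "terminate here" and every binary cut.
        best_score = scores.get(region, float("-inf"))
        best_tree = ("leaf", region)

        # AND-nodes from horizontal cuts: score is the sum of both children.
        for r in range(r0 + 1, r1):
            s_a, t_a = solve(r0, c0, r, c1)
            s_b, t_b = solve(r, c0, r1, c1)
            if s_a + s_b > best_score:
                best_score, best_tree = s_a + s_b, ("split", t_a, t_b)

        # AND-nodes from vertical cuts.
        for c in range(c0 + 1, c1):
            s_a, t_a = solve(r0, c0, r1, c)
            s_b, t_b = solve(r0, c, r1, c1)
            if s_a + s_b > best_score:
                best_score, best_tree = s_a + s_b, ("split", t_a, t_b)

        return best_score, best_tree

    return solve(0, 0, rows, cols)


# Toy example on a 2 x 2 grid with made-up scores.
toy_scores = {
    (0, 0, 2, 2): 1.0,  # whole image as a single region
    (0, 0, 1, 2): 2.0,  # top row
    (1, 0, 2, 2): 0.5,  # bottom row
    (1, 0, 2, 1): 1.0,  # bottom-left cell
    (1, 1, 2, 2): 1.0,  # bottom-right cell
}
score, tree = best_parse(toy_scores, 2, 2)
print(score, tree)
```

Memoizing each sub-rectangle means every OR-node is solved once, even though many different layouts share it; this sharing is what lets a compact graph cover a very large number of configurations.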

Publication date: 2016-01
Affiliation: Shanghai Jiao Tong University
