摘要

Enhancing perception of the local environment with semantic information like the room type is an important ability for agents acting in their environment. Such high-level knowledge can reduce the effort needed for, for example, object detection. This paper shows how to extract the room label from a small amount of room percepts taken from a certain view point (like the door frame when entering the room). Such functionality is similar to the human ability to get a scene impression from a quick glance. We propose a new three-dimensional (3D) spatial feature vector that captures the layout of a scene from extracted planar surfaces. The trained models emulate the human brain sensitivity to the 3D geometry of a room. Further, we show that our descriptor complements the information encoded by the Gist feature vector a first attempt to model the mentioned brain area. The global scene properties are extracted from edge information in 2D depictions of the scene. Both features can be fused, resulting in a system that follows our goal to combine psychological insights on human scene perception with physical properties of environments. This paper provides detailed insights into the nature of our spatial descriptor.

  • 出版日期2014-5