Abstract

Region-based methods for estimating the 3D pose of an object from a 2D image have recently gained popularity. Because they do not require prior knowledge of the object's texture, they are particularly attractive when the texture is unknown a priori. These methods estimate the 3D pose by finding the pose that best segments the image into foreground and background regions. Typically, the foreground and background regions are described using global appearance models, and an energy function measuring their fit quality is optimized with respect to the pose parameters. Applying a region-based approach to standard 2D-3D pose estimation databases shows that its performance depends strongly on scene complexity. In simple scenes, where the statistical properties of the foreground and background do not vary spatially, it performs well. In more complex scenes, where the statistical properties of the foreground or background vary, performance degrades strongly because the global appearance models used to segment the image do not sufficiently capture the spatial variation. Inspired by ideas from local active contours, we propose a framework for simultaneous image segmentation and pose estimation using multiple local appearance models. Local appearance models can capture spatial variation in statistical properties that global appearance models cannot. We derive an energy function that measures the quality of the image segmentation using multiple local regions, and optimize it with respect to the pose parameters. Our experiments show a substantially higher probability of estimating the correct pose for heterogeneous objects, and a minor improvement for homogeneous objects.
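
The contrast between global and local appearance models can be illustrated with a small toy sketch. It is not the energy derived in the paper: it assumes a simple mean-separation score (lower is better) in place of the paper's appearance models, a synthetic 32x32 scene, and a horizontal shift of a square silhouette as a stand-in for the pose parameters. All function names (`global_score`, `local_score`, `make_scene`, `square_mask`) and the window radius are hypothetical and chosen only for this illustration.

```python
import numpy as np

def global_score(image, mask):
    """Assumed toy energy: negative squared gap between the single
    foreground mean and the single background mean (lower is better)."""
    return -(image[mask].mean() - image[~mask].mean()) ** 2

def contour_pixels(mask):
    """Foreground pixels that have at least one 4-neighbour outside the mask."""
    padded = np.pad(mask, 1, mode="edge")
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(mask & ~interior)

def local_score(image, mask, radius=3):
    """Same mean-separation score, but evaluated in a square window around
    each contour pixel and summed, so every window has its own statistics."""
    total = 0.0
    for r, c in contour_pixels(mask):
        rows = slice(max(r - radius, 0), r + radius + 1)
        cols = slice(max(c - radius, 0), c + radius + 1)
        win, m = image[rows, cols], mask[rows, cols]
        if m.any() and not m.all():  # need both regions inside the window
            total -= (win[m].mean() - win[~m].mean()) ** 2
    return total

def make_scene():
    """Heterogeneous toy scene: dark/bright background halves and an object
    whose halves are bright/dark, so the global means of both regions match."""
    image = np.zeros((32, 32))
    image[:, 16:] = 1.0          # right half of the background is bright
    image[8:24, 8:16] = 1.0      # left half of the object is bright
    image[8:24, 16:24] = 0.0     # right half of the object is dark
    return image

def square_mask(col_offset=0):
    """Silhouette of the object, shifted horizontally by col_offset pixels."""
    mask = np.zeros((32, 32), dtype=bool)
    mask[8:24, 8 + col_offset:24 + col_offset] = True
    return mask

image = make_scene()
true_mask, shifted_mask = square_mask(0), square_mask(4)

print("global:", global_score(image, true_mask), global_score(image, shifted_mask))
print("local: ", local_score(image, true_mask), local_score(image, shifted_mask))
```

In this toy scene the global score is identical for the correct and the shifted silhouette, because the foreground and background have the same overall mean, while the summed local score is lower for the correct silhouette. That mirrors, in a very reduced form, why spatially varying statistics favour local models.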

  • Publication date: 2016-5