A novel framework for image description generation

Authors: Cai Qiang; Xue Ziyu; Zhang Xiaoyu*; Zhu Xiaobin; Shao Wei; Wang Lei
Source: 2nd Chinese Conference on Computer Vision, CCCV 2017, 2017-10-11 to 2017-10-14.
DOI: 10.1007/978-981-10-7299-4_49

Abstract

Existing image description generation algorithms often fail to cover the rich semantic information in natural images when they rely on a single sentence or on dense object annotations. In this paper, we propose a novel semi-supervised generative framework for visual sentence generation that jointly models a Region-based Convolutional Neural Network (RCNN) and an improved Wasserstein Generative Adversarial Network (WGAN) to generate diverse and semantically coherent sentence descriptions of images. In our algorithm, the features of candidate regions are extracted with the RCNN, and the enriched words are polished according to their context by the improved WGAN. The improved WGAN consists of a structured sentence generator and multi-level sentence discriminators. The generator produces sentences recurrently by incorporating region-based visual and language attention mechanisms, while the discriminators assess the quality of the generated sentences. Experimental results on a publicly available dataset show the promising performance of our work compared with related works.

  • Publication date: 2017
