A novel framework for image description generation

Cai Qiang; Xue Ziyu; Zhang Xiaoyu<sup>*</sup>; Zhu Xiaobin; Shao Wei; Wang Lei

doi:10.1007/978-981-10-7299-4_49

Abstract

Existing image description generation algorithms fail to cover the rich semantic information in natural images when they rely on a single sentence or on dense object annotations. In this paper, we propose a novel semi-supervised generative framework for visual sentence generation that jointly models a Regions Convolutional Neural Network (RCNN) and an improved Wasserstein Generative Adversarial Network (WGAN) to generate diverse and semantically coherent sentence descriptions of images. In our algorithm, the features of candidate regions are extracted with the RCNN, and the enriched words are polished according to their context by the improved WGAN. The improved WGAN consists of a structured sentence generator and multi-level sentence discriminators. The generator produces sentences recurrently by incorporating region-based visual and language attention mechanisms, while the discriminators assess the quality of the generated sentences. Experimental results on a publicly available dataset show promising performance compared with related work.

Publication date: 2017


Last updated: 2021-01-21 23:01
