Automatic image-text alignment for large-scale web image indexing and retrieval

Zhou, Ning*; Fan, Jianping

doi:10.1016/j.patcog.2014.07.001

Abstract

This paper develops an automatic image-text alignment algorithm that supports more effective indexing and retrieval of large-scale web images by aligning each web image with its most relevant auxiliary text terms or phrases. First, a large number of cross-media web pages (pages that contain web images and their auxiliary texts) are crawled and segmented into a set of image-text pairs (informative web images and their associated text terms or phrases). Second, near-duplicate image clustering groups the web images into clusters of near-duplicate images according to their visual similarity. The near-duplicate web images in the same cluster share similar semantics and are simultaneously associated with the same or a similar set of auxiliary text terms or phrases, which co-occur frequently in the relevant text blocks. Near-duplicate image clustering can therefore significantly reduce the uncertainty about how the semantics of web images relate to their auxiliary text terms or phrases. Finally, a random walk is performed over a phrase correlation network to achieve more precise image-text alignment by refining the relevance scores between the web images and their auxiliary text terms or phrases. Experiments on large-scale cross-media web pages have produced very positive results.
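The refinement step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following assumes a generic random walk with restart over a weighted phrase graph; the function name `refine_scores`, the `restart` parameter, the handling of isolated phrases, and the toy phrases and weights are all hypothetical and are not taken from the paper.

```python
def refine_scores(initial, correlation, restart=0.15, iters=50):
    """Refine phrase relevance scores for one image cluster.

    Assumed formulation (not the paper's): random walk with restart.
    initial:     dict phrase -> non-negative initial relevance score
    correlation: dict phrase -> dict phrase -> non-negative edge weight
    """
    phrases = list(initial)
    total = sum(initial.values())
    if total == 0:
        return dict(initial)
    # The normalised initial scores act as the restart distribution.
    prior = {p: initial[p] / total for p in phrases}

    # Row-normalise outgoing edge weights into transition probabilities.
    trans = {}
    for p in phrases:
        row = {q: w for q, w in correlation.get(p, {}).items()
               if q in prior and w > 0}
        s = sum(row.values())
        trans[p] = {q: w / s for q, w in row.items()} if s > 0 else {}

    scores = dict(prior)
    for _ in range(iters):
        nxt = {p: restart * prior[p] for p in phrases}
        for p in phrases:
            if trans[p]:
                for q, w in trans[p].items():
                    nxt[q] += (1 - restart) * scores[p] * w
            else:
                # Phrase with no correlated neighbours: hand its mass
                # back to the restart distribution so scores sum to 1.
                for q in phrases:
                    nxt[q] += (1 - restart) * scores[p] * prior[q]
        scores = nxt
    return scores


# Toy input (hypothetical): two mutually correlated phrases, one isolated.
initial = {"golden gate bridge": 0.4, "san francisco": 0.3, "click here": 0.3}
correlation = {
    "golden gate bridge": {"san francisco": 0.9},
    "san francisco": {"golden gate bridge": 0.9},
}
refined = refine_scores(initial, correlation)
for phrase, score in sorted(refined.items(), key=lambda kv: -kv[1]):
    print(f"{phrase}: {score:.3f}")
```

In this toy input the two correlated phrases reinforce each other, while the isolated phrase loses relative weight, which is the qualitative effect the abstract attributes to the random walk.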

Publication date: 2015-1
Affiliation: Northwest University
