Unsupervised Visual and Textual Information Fusion in CBMIR Using Graph-Based Methods

Ah Pine Julien*; Csurka Gabriela; Clinchant Stephane

doi:10.1145/2699668

Abstract

Multimedia collections are growing in size and diversity more than ever. Effective multimedia retrieval systems are therefore critical for giving end users scalable access to these datasets. We are interested in repositories of image/text multimedia objects, and we study multimodal information fusion techniques in the context of content-based multimedia information retrieval (CBMIR). We focus on graph-based methods, which have been shown to provide state-of-the-art performance. In particular, we examine two such methods: cross-media similarities and random-walk-based scores. From a theoretical viewpoint, we propose a unifying graph-based framework that encompasses both approaches. This framework allows us to highlight the core features one should consider when using a graph-based technique to combine visual and textual information. We compare cross-media and random-walk-based results on three real-world datasets. From a practical standpoint, our extended empirical analyses allow us to provide insights and guidelines on the use of graph-based methods for multimodal information fusion in content-based multimedia information retrieval.

Publication date: 2015-2

