Abstract

Cross-modal retrieval has attracted much attention in recent years due to its widespread applications. In this area, how to capture and correlate heterogeneous features originating from different modalities remains a challenge. Most existing cross-modal learning methods focus only on the correlated features shared by the two distinct feature spaces, thereby overlooking the discriminative information specific to each modality. To remedy this issue and explicitly capture discriminative feature information, we propose a novel cross-modal retrieval approach based on discriminative dictionary learning augmented with common label alignment. Concretely, a discriminative dictionary is first learned for each modality, which enhances both the discriminability of intra-modality data from different classes and the relevance of inter-modality data within the same class. Subsequently, all the resulting sparse codes are simultaneously mapped to a common label space, in which the cross-modal data samples are characterized and associated. Within this label space, the discriminability and relevance of the cross-modal data are further strengthened by enforcing common label alignment. Finally, cross-modal retrieval is performed over the common label space. Experiments conducted on two public cross-modal datasets show that the proposed approach outperforms several state-of-the-art methods in terms of retrieval accuracy.
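
As a rough illustration of the retrieval pipeline summarized above (and not the authors' implementation), the following sketch learns a generic dictionary per modality with scikit-learn's DictionaryLearning (the paper's discriminative terms are omitted), maps sparse codes to a shared label space with a ridge-regression projection, and ranks gallery items by cosine similarity in that space. All names and hyperparameters (n_atoms, alpha, lam) are illustrative placeholders.

```python
# Minimal sketch of the described pipeline; the discriminative dictionary
# objective and the label-alignment term from the paper are NOT implemented
# here -- a plain sparse-coding step stands in for them.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.preprocessing import normalize

def encode(X, n_atoms=128, alpha=1.0):
    """Learn a dictionary for one modality and return its sparse codes."""
    dl = DictionaryLearning(n_components=n_atoms, alpha=alpha, max_iter=50)
    return dl.fit_transform(X)            # shape: (n_samples, n_atoms)

def label_projection(S, Y, lam=1e-2):
    """Ridge-regression map from sparse codes S to one-hot labels Y."""
    A = S.T @ S + lam * np.eye(S.shape[1])
    return np.linalg.solve(A, S.T @ Y)    # shape: (n_atoms, n_classes)

def retrieve(query_code, P_query, gallery_codes, P_gallery):
    """Rank gallery items by cosine similarity in the common label space."""
    q = normalize((query_code @ P_query).reshape(1, -1))
    g = normalize(gallery_codes @ P_gallery)
    return np.argsort(-(g @ q.T).ravel())  # indices, most similar first
```

In this toy setup, projecting both modalities onto the same one-hot label targets is what ties the two sparse-code spaces together; the paper's common label alignment additionally constrains the two projections so that same-class samples from different modalities land close together.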