Cross-Modal Retrieval Based on Full-Modal Autoencoder with Generative Adversarial Mechanism
Graphical Abstract
Abstract
Existing cross-modal retrieval methods based on generative adversarial networks cannot fully explore inter-modality invariance. To address this problem, a novel cross-modal retrieval method based on a full-modal autoencoder with a generative adversarial mechanism is proposed. Two parallel full-modal autoencoders are introduced to embed samples of different modalities into a common space. Each full-modal autoencoder reconstructs not only the feature representation of its own modality but also that of the other modality. A classifier is designed to predict the categories of the embedded features in the common space, which preserves the semantic discriminative information of the samples. Three discriminators are designed to determine the modal category of the input features, and they work cooperatively to fully explore inter-modality invariance. Mean average precision (mAP) is used to evaluate retrieval accuracy, and extensive experiments are conducted on three public datasets: Pascal Sentence, Wikipedia and NUS-WIDE-10k. Compared with ten state-of-the-art cross-modal retrieval methods, including both traditional and deep learning methods, the mAP of the proposed method improves by at least 4.8%, 1.4% and 1.1% on the three datasets, respectively. The experimental results demonstrate the effectiveness of the proposed method.
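The abstract does not specify implementation details, so the following is only an illustrative PyTorch sketch of the described architecture: two parallel full-modal autoencoders with a shared classifier and three modality discriminators. All layer sizes, the feature dimensions (4096-d image, 300-d text, 128-d common space), the discriminator wiring, and the equal loss weighting are assumptions for illustration, not values taken from the paper.

```python
import torch
import torch.nn as nn

class FullModalAutoencoder(nn.Module):
    """Encodes one modality into the common space and reconstructs the
    feature representations of BOTH modalities (hence 'full-modal')."""
    def __init__(self, in_dim, common_dim, other_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, common_dim),
        )
        # Two decoder heads: one per modality to be reconstructed.
        self.decode_self = nn.Linear(common_dim, in_dim)
        self.decode_other = nn.Linear(common_dim, other_dim)

    def forward(self, x):
        z = self.encoder(x)  # common-space embedding
        return z, self.decode_self(z), self.decode_other(z)

# Hypothetical dimensions: 4096-d image features, 300-d text features,
# a 128-d common space, and 10 semantic classes.
IMG_DIM, TXT_DIM, COMMON_DIM, N_CLASSES = 4096, 300, 128, 10

img_ae = FullModalAutoencoder(IMG_DIM, COMMON_DIM, TXT_DIM)
txt_ae = FullModalAutoencoder(TXT_DIM, COMMON_DIM, IMG_DIM)

# Shared classifier preserves semantic discrimination in the common space.
classifier = nn.Linear(COMMON_DIM, N_CLASSES)

# Three discriminators that judge the modality of their input; applying
# them to the embeddings and the cross-modal reconstructions is one way
# they could cooperate to enforce inter-modality invariance.
def make_discriminator(in_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))

d_common = make_discriminator(COMMON_DIM)  # image vs. text embedding
d_img = make_discriminator(IMG_DIM)        # real vs. reconstructed image feature
d_txt = make_discriminator(TXT_DIM)        # real vs. reconstructed text feature

# One forward pass on a toy batch.
img = torch.randn(8, IMG_DIM)
txt = torch.randn(8, TXT_DIM)
labels = torch.randint(0, N_CLASSES, (8,))
z_i, img_rec, txt_from_img = img_ae(img)
z_t, txt_rec, img_from_txt = txt_ae(txt)

mse, bce, ce = nn.MSELoss(), nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()
# Reconstruct each modality from itself and from the other modality.
recon_loss = (mse(img_rec, img) + mse(txt_rec, txt)
              + mse(img_from_txt, img) + mse(txt_from_img, txt))
cls_loss = ce(classifier(z_i), labels) + ce(classifier(z_t), labels)
# Generator side of the adversarial game: fool the common-space
# discriminator into labelling text embeddings as images (label 1).
adv_loss = bce(d_common(z_t), torch.ones(8, 1))
total = recon_loss + cls_loss + adv_loss
total.backward()
```

In a full training loop, the discriminator parameters would be updated in alternation with the autoencoders and classifier, as in standard adversarial training; the single combined loss above only shows the generator-side objective.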