Papers
Topics
Authors
Recent
2000 character limit reached

Active Mining Sample Pair Semantics for Image-text Matching

Published 9 Nov 2023 in cs.CV | (2311.05425v1)

Abstract: Recently, commonsense learning has been a hot topic in image-text matching. Although it can describe more graphic correlations, commonsense learning still has some shortcomings: 1) The existing methods are based on triplet semantic similarity measurement loss, which cannot effectively match the intractable negative in image-text sample pairs. 2) The weak generalization ability of the model leads to the poor effect of image and text matching on large-scale datasets. According to these shortcomings. This paper proposes a novel image-text matching model, called Active Mining Sample Pair Semantics image-text matching model (AMSPS). Compared with the single semantic learning mode of the commonsense learning model with triplet loss function, AMSPS is an active learning idea. Firstly, the proposed Adaptive Hierarchical Reinforcement Loss (AHRL) has diversified learning modes. Its active learning mode enables the model to more focus on the intractable negative samples to enhance the discriminating ability. In addition, AMSPS can also adaptively mine more hidden relevant semantic representations from uncommented items, which greatly improves the performance and generalization ability of the model. Experimental results on Flickr30K and MSCOCO universal datasets show that our proposed method is superior to advanced comparison methods.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.