Weakly Supervised Referring Image Segmentation Based on Entity Information
-
Abstract
Existing language-based weakly supervised referring image segmentation methods make insufficient use of entity information. To address this, this paper proposes a weakly supervised referring image segmentation method based on entity information. Building on language supervision, the method exploits the hidden semantic information carried by entity words in the text to provide effective clues for visual localization. First, a candidate entity detection module identifies all potential objects in the image by extracting entity information. Second, an interactive enhancement module lets the features of the two modalities strengthen each other's representation ability. Third, a response optimization loss helps generate accurate response maps. Finally, the prediction is obtained through a matching operation. Experimental results on four large benchmarks, RefCOCO, RefCOCO+, RefCOCOg, and ReferIt, show that, compared with existing weakly supervised methods, the proposed method improves mIoU by 3.9%, 22.9%, 8.1%, and 9.1%, respectively, by utilizing entity information with only language supervision.
-