Cross-Modal Person Re-Identification Based on a Progressive Difference Reduction Network with Channel Intervention

刘志刚; 常乐乐; 赵宜珺; 刘苗苗

doi:10.3724/SP.J.1089.2023-00541

Progressive Difference Reduction Network with Channel Intervention for Visible-Infrared Re-Identification

Abstract: In cross-modal person re-identification (ReID), the modality gap between visible and infrared images is the key factor that makes shared features hard to extract. To reduce this gap and improve re-identification performance, a progressive difference reduction network (PDRNet) is proposed. First, for the visible-to-infrared retrieval stage, a specific factual intervention module is designed based on causal reasoning theory: intervened images generated by channel transformation are used to intervene on the visible images, suppressing interference from color information. Second, for the infrared-to-visible retrieval stage, a channel coordination module converts multi-channel feature extraction into a single-channel form, so that the network focuses on learning the channel correlation between visible and infrared images. Finally, for mutual retrieval between visible and infrared targets, a modality balance loss is proposed; it performs balanced learning across modalities using intervened, visible, and infrared images, further suppressing color features and compensating for the discriminative features that visible images lose during specific factual intervention. Simulation experiments show that, compared with existing mainstream cross-modal person re-identification methods, the proposed network achieves better performance on the two standard datasets SYSU-MM01 and RegDB, with rank1 and mAP each improving by more than 2%. Source code: https://cstr.cn/31253.11.sciencedb.27692.
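The abstract reports gains in rank1 and mAP without defining them. For reference, the following is a minimal sketch of how these two retrieval metrics are commonly computed from a query-to-gallery distance matrix. The function name `rank1_and_map` and its inputs are illustrative assumptions and are not taken from the paper's released code; dataset-specific evaluation protocols are omitted.

```python
import numpy as np

def rank1_and_map(dist, query_ids, gallery_ids):
    """Hypothetical helper: rank-1 accuracy and mean average precision.

    dist        -- (num_query, num_gallery) distances, smaller = more similar
    query_ids   -- identity label of each query image
    gallery_ids -- identity label of each gallery image
    """
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    # For each query, gallery indices sorted from nearest to farthest.
    order = np.argsort(dist, axis=1)
    # matches[i, k] is True when the k-th nearest gallery image shares query i's identity.
    matches = gallery_ids[order] == query_ids[:, None]

    rank1_hits, average_precisions = [], []
    for row in matches:
        if not row.any():
            continue  # queries with no true match in the gallery are skipped
        rank1_hits.append(row[0])
        hit_ranks = np.flatnonzero(row) + 1  # 1-based ranks of the correct matches
        # Precision at each correct match: (matches found so far) / (rank).
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))
```

In a visible-to-infrared setting, the queries would be features of visible images and the gallery features of infrared images; the infrared-to-visible setting swaps the two roles.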
