Zheng Aihua, Feng Mengya, Li Chenglong, Tang Jin, and Luo Bin. Bi-Directional Dynamic Interaction Network for Cross-Modality Person Re-Identification[J]. Journal of Computer-Aided Design & Computer Graphics, 2023, 35(3): 371-382. DOI: 10.3724/SP.J.1089.2023.19280


Bi-Directional Dynamic Interaction Network for Cross-Modality Person Re-Identification


Abstract: Current cross-modality person re-identification methods mainly use weight-sharing convolution kernels, which limits the model's ability to adapt dynamically to different inputs. Meanwhile, they mainly rely on high-level, coarse-resolution semantic features, which causes substantial information loss. Therefore, this paper proposes a bi-directional dynamic interaction network for cross-modality person re-identification. Firstly, a dual-stream network extracts the global features of each modality after every residual block. Secondly, customized convolution kernels are dynamically generated according to the global content of each modality to extract modality-specific information, which is then transferred bi-directionally between the modalities as complementary information to alleviate modality heterogeneity. Finally, correlations among features of different resolutions from each layer are modeled, and cross-layer multi-resolution features are jointly learned to obtain a more discriminative and robust feature representation. Experimental results on the SYSU-MM01 and RegDB cross-modality person re-identification datasets demonstrate the effectiveness of the proposed method, which outperforms the state-of-the-art methods by 4.70% and 2.12% in rank-1 (R1) accuracy, and by 4.30% and 2.67% in mAP, respectively.
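The core idea described above, generating a convolution kernel from one modality's global content and applying it to the other modality's features, can be sketched minimally in plain Python. This is a hypothetical 1-D toy illustration, not the paper's actual network: the `generate_kernel` function standing in for the learned kernel generator, the fixed base kernel, and the example feature sequences are all assumptions for demonstration only.

```python
# Toy sketch of content-conditioned dynamic convolution (1-D, pure Python).
# NOT the paper's implementation: the kernel "generator" here simply scales
# a fixed base kernel by a global descriptor, to show how the kernel can
# depend on the input content rather than being weight-shared.

def global_descriptor(feats):
    """Global average pooling over a 1-D feature sequence."""
    return sum(feats) / len(feats)

def generate_kernel(descriptor, size=3):
    """Hypothetical kernel generator: modulates a fixed base kernel
    by the global descriptor of the *other* modality."""
    base = [0.25, 0.5, 0.25]
    return [w * descriptor for w in base[:size]]

def dynamic_conv1d(feats, kernel):
    """Valid-mode 1-D convolution with the dynamically generated kernel."""
    k = len(kernel)
    return [sum(feats[i + j] * kernel[j] for j in range(k))
            for i in range(len(feats) - k + 1)]

# Bi-directional interaction: RGB content filters IR features and vice versa.
rgb = [1.0, 2.0, 3.0, 4.0]  # assumed RGB-stream features (illustrative)
ir = [2.0, 2.0, 2.0, 2.0]   # assumed infrared-stream features (illustrative)
ir_enhanced = dynamic_conv1d(ir, generate_kernel(global_descriptor(rgb)))
rgb_enhanced = dynamic_conv1d(rgb, generate_kernel(global_descriptor(ir)))
```

In the actual method, the generator is learned and operates on 2-D feature maps after each residual block; the sketch only conveys the dependency structure, where each modality's kernel is a function of the other modality's global content.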

     
