Bi-Directional Dynamic Interaction Network for Cross-Modality Person Re-Identification
Abstract: Current cross-modality person re-identification methods rely on weight-sharing convolution kernels, which limits the model's ability to adjust dynamically to different inputs, and they typically use only high-level, coarse-resolution semantic features, which causes considerable information loss. To address these problems, this paper proposes a bi-directional dynamic interaction network for cross-modality person re-identification. First, a two-stream network extracts the global features of each modality after every residual block. Then, customized convolution kernels are generated dynamically from the global content of each modality to extract modality-specific information, which is passed bi-directionally between the modalities as complementary information to alleviate modality heterogeneity. Finally, correlations among the multi-resolution features of the different layers are modeled, and the cross-layer multi-resolution features are learned jointly to obtain a more discriminative and robust feature representation. Experimental results on the SYSU-MM01 and RegDB cross-modality person re-identification datasets show that the proposed method outperforms the current best methods by 4.70% and 2.12% in rank-1 accuracy (R1) and by 4.30% and 2.67% in mean average precision (mAP), respectively, verifying its effectiveness.
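The abstract's central mechanism is generating content-conditioned ("customized") convolution kernels from each modality's global feature and exchanging the resulting modality-specific cues between the two streams. The following is a minimal PyTorch sketch of that idea, assuming features taken after a ResNet residual block; the module names (DynamicKernelBranch, BiDirectionalInteraction), the depthwise dynamic-filter formulation, and the additive fusion are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch (PyTorch) of content-conditioned dynamic kernels with
# bi-directional cross-modality transfer. Names and layer sizes are
# hypothetical; this is not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicKernelBranch(nn.Module):
    """Generates a per-sample depthwise KxK kernel from global content."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.channels = channels
        self.kernel_size = kernel_size
        self.pool = nn.AdaptiveAvgPool2d(1)  # global content of the input
        self.generator = nn.Linear(channels, channels * kernel_size * kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        k = self.kernel_size
        # One customized kernel per sample, conditioned on its global feature.
        weights = self.generator(self.pool(x).flatten(1))      # (B, C*k*k)
        weights = weights.view(b * c, 1, k, k)
        # Grouped-conv trick: fold the batch into the channel dimension so
        # each sample is filtered by its own dynamically generated kernel.
        out = F.conv2d(x.reshape(1, b * c, h, w), weights,
                       padding=k // 2, groups=b * c)
        return out.view(b, c, h, w)                            # modality-specific cue


class BiDirectionalInteraction(nn.Module):
    """Exchanges modality-specific cues between the RGB and IR streams."""
    def __init__(self, channels: int):
        super().__init__()
        self.rgb_branch = DynamicKernelBranch(channels)
        self.ir_branch = DynamicKernelBranch(channels)

    def forward(self, f_rgb: torch.Tensor, f_ir: torch.Tensor):
        rgb_specific = self.rgb_branch(f_rgb)
        ir_specific = self.ir_branch(f_ir)
        # Each stream receives the other's modality-specific information
        # as complementary content, intended to ease modality heterogeneity.
        return f_rgb + ir_specific, f_ir + rgb_specific


if __name__ == "__main__":
    block = BiDirectionalInteraction(channels=256)
    f_rgb = torch.randn(4, 256, 24, 12)   # features after a residual block
    f_ir = torch.randn(4, 256, 24, 12)
    out_rgb, out_ir = block(f_rgb, f_ir)
    print(out_rgb.shape, out_ir.shape)    # torch.Size([4, 256, 24, 12]) x2
```

Folding the batch into the channel dimension and using a grouped convolution is the standard way to apply a different generated filter to every sample in one call; any per-sample kernel scheme in the spirit of dynamic filter networks could be substituted here.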