Abstract:
Current cross-modality person re-identification methods mainly rely on weight-sharing convolution kernels, which limits the model's ability to adapt dynamically to different inputs. Moreover, they mainly use high-level, coarse-resolution semantic features, which causes substantial information loss. Therefore, this paper proposes a bi-directional dynamic interaction network for cross-modality person re-identification. First, a dual-stream network extracts the global features of the different modalities after each residual block. Second, customized convolution kernels are dynamically generated according to the global content of each modality to extract modality-specific features, and modality-complementary features are then transferred between modalities and integrated to alleviate heterogeneity. Finally, the features of different resolutions from each layer are fused to produce a more discriminative and robust feature representation. Experimental results on two benchmark RGB-infrared person Re-ID datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art methods by 4.70% and 2.12% in Rank-1 accuracy, and by 4.30% and 2.67% in mAP, respectively.
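To make the "customized convolution kernels" step concrete, the sketch below shows one common way to condition a convolution kernel on the global content of its input feature map. The module name, the structure of the kernel generator, and the depthwise/grouped-convolution formulation are illustrative assumptions, not the paper's exact design.

```python
# Minimal, illustrative sketch (not the authors' exact design): a convolution whose
# kernel is generated per sample from the input's global content.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentConditionedConv(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.channels = channels
        self.kernel_size = kernel_size
        # Small generator: global feature -> per-channel (depthwise) kernel weights.
        self.kernel_gen = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels * kernel_size * kernel_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Global content of the input (global average pooling).
        context = x.mean(dim=(2, 3))                       # (B, C)
        kernels = self.kernel_gen(context)                 # (B, C*k*k)
        kernels = kernels.view(b * c, 1, self.kernel_size, self.kernel_size)
        # Apply a different depthwise kernel to each sample via grouped convolution.
        x = x.reshape(1, b * c, h, w)
        out = F.conv2d(x, kernels, padding=self.kernel_size // 2, groups=b * c)
        return out.view(b, c, h, w)

# Example: produce modality-adaptive responses for a batch of feature maps.
if __name__ == "__main__":
    feat = torch.randn(2, 64, 24, 12)   # (batch, channels, H, W)
    dyn_conv = ContentConditionedConv(64)
    print(dyn_conv(feat).shape)         # torch.Size([2, 64, 24, 12])
```

Because the kernel is a function of each sample's global statistics, the same module yields different filters for RGB and infrared inputs, which is the intuition behind the modality-specific branch described in the abstract.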