Multi-Level Discriminative Dictionary Learning Method for Cross-View Person Re-Identification

汤红忠; 陈天宇; 邓仕俊; 张小刚

doi:10.3724/SP.J.1089.2020.18100

Multi-Level Discriminative Dictionary Learning Method for Cross-View Person Re-Identification

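Abstract

Abstract: Existing person re-identification algorithms focus mainly on extracting more effective classification features and on learning more robust distance metrics. In real-world scenes, images of the same person captured in different views often differ in resolution, and within a single view, changes in viewpoint and illumination leave the extracted features weakly discriminative and not robust. To address this problem, a multi-level discriminative dictionary learning algorithm is proposed that exploits the latent correlation between coding coefficients in the feature representations of different views, and it is applied to cross-view person re-identification. First, a feature mapping matrix is introduced into the dictionary learning models at both the horizontal-region level and the image level; the matrix describes the intrinsic relationship between the coding coefficients of the same person's images in different views and greatly increases the flexibility of the coding coefficients. Second, at the patch level, the local manifold structure of the image is used to add a local geometric structure constraint on the dictionary atoms to the dictionary learning objective; by adaptively learning the graph Laplacian matrix, the coding coefficients are made to preserve a geometric structure similar to that of the samples, which yields a more discriminative dictionary pair. Finally, the algorithm is evaluated on two widely used person re-identification datasets, VIPeR and CUHK01 Campus, where it reaches rank-1 recognition rates of 68.40% and 80.14%, respectively. The experimental results show that the algorithm not only reduces the impact of pronounced resolution differences between views but also substantially improves the representation and discrimination ability of the learned dictionary pair, achieving better re-identification accuracy than the other algorithms compared.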

Abstract: Most existing person re-identification work focuses on either extracting discriminative features or learning discriminative distance metrics. However, in complex real-world scenarios, images of the same person observed by different cameras vary widely in resolution, and images from the same camera also undergo large variations in viewpoint and illumination; these factors limit the representation ability and robustness of the extracted features. To address this problem, we propose a cross-view multi-level discriminative dictionary learning method that exploits the intrinsic correlation of coding coefficients in the feature representations of different views. First, a feature mapping function is introduced into the dictionary learning model at the horizontal-region level and the image level to bridge the gap between cross-view images. The mapping function relaxes the strict correspondence between cross-view images, leaving the coding coefficients more flexibility to maximize feature representation performance. Then, at the patch level, we incorporate a local geometry constraint on the atoms into the dictionary learning objective function by considering the local manifold structure of the image patches. By adaptively learning a graph Laplacian matrix, the local geometric structure of the training samples is mapped to the coding coefficients, so that more discriminative dictionary pairs can be obtained. Experiments on two challenging person re-identification datasets demonstrate that the proposed method reduces the influence of large resolution variations across cameras and improves the representative and discriminative abilities of the learned dictionaries. The effectiveness of the proposed approach is validated on the VIPeR and CUHK01 datasets, where the best rank-1 matching rates reach 68.40% and 80.14%, respectively. Compared with state-of-the-art algorithms, the proposed method improves person re-identification performance.
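To make the graph Laplacian idea concrete, the following is a minimal sketch of a generic Laplacian smoothness penalty on coding coefficients. It is not the formulation of the paper: the abstract describes a Laplacian that is learned adaptively, whereas this sketch assumes a fixed k-nearest-neighbour graph with heat-kernel weights. The function names `knn_graph_laplacian` and `laplacian_penalty` and the parameters `k` and `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def knn_graph_laplacian(X, k=3, sigma=1.0):
    """Build a symmetric kNN graph over the columns of X (shape d x n).

    Assumption: fixed heat-kernel weights; the paper instead learns the
    graph adaptively. Returns the weight matrix W and Laplacian L = D - W.
    """
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    # Pairwise squared Euclidean distances between samples (columns).
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        nbrs = [j for j in order if j != i][:k]  # k nearest, excluding self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W
    return W, L


def laplacian_penalty(A, L):
    """Smoothness penalty tr(A L A^T) for codes A (shape m x n, one column per sample)."""
    return float(np.trace(A @ L @ A.T))
```

For a symmetric weight matrix, this penalty equals half the weighted sum of squared distances between the codes of connected samples, so it is small when neighbouring samples receive similar coding coefficients. That is the sense in which such a term lets the codes keep a geometric structure similar to that of the samples.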
