Citation: Qian Xiaoyan, Shi Yuzhou, Zhang Feng, Zhu Xinrui, Han Lei, Li Zhiyu. Transformer Object Tracking Based on Re-Parameterization Network[J]. Journal of Computer-Aided Design & Computer Graphics, 2023, 35(10): 1521-1531. DOI: 10.3724/SP.J.1089.2023.19686

Transformer Object Tracking Based on Re-Parameterization Network

  • Abstract: Siamese-based trackers are computationally expensive and do not fully fuse information between the template and the search region. To address both problems, this paper proposes a Transformer tracking algorithm based on a re-parameterized network. First, re-parameterization reduces the computational cost of the backbone during tracking: the backbone uses a multi-branch parallel structure in training, which is restructured into a single-branch serial structure for tracking. Second, the template and search-region feature maps extracted by the backbone are strengthened by the Transformer's multi-head self-attention layer, and a cross-attention layer fuses them at the pixel level, improving the tracker's ability to discriminate the target. Finally, the fused features are mapped to a classification branch, a centerness estimation branch, and a bounding-box regression branch; the regression branch is trained with the CIoU loss, which makes the tracker more accurate and more robust while still running in real time. Experiments show that, on the large-scale benchmark GOT-10k, the proposed method reaches an average overlap (AO) of 0.606, exceeding SiamFC++ by 0.011. On the large-scale dataset LaSOT, it reaches a success rate of 0.554, a normalized precision of 0.659, and a precision of 0.581, improvements of 0.010, 0.036, and 0.034 over SiamFC++, respectively.
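The restructuring of a multi-branch block into a single branch can be illustrated with a small sketch. The abstract does not say which branches the backbone uses, so the layout below (a 3x3 convolution, a 1x1 convolution, and an identity shortcut, in the style of RepVGG) is an assumption, as are all function names. The sketch is single-channel and omits batch normalization, which in practice would first be folded into each branch's weights.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form (assumed): 3x3 branch + 1x1 branch + identity branch."""
    y3 = conv2d_same(image, k3)
    h, w = len(image), len(image[0])
    return [[y3[i][j] + k1 * image[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]


def fuse_branches(k3, k1):
    """Fold the three branches into one equivalent 3x3 kernel."""
    fused = [row[:] for row in k3]
    # The 1x1 weight and the identity shortcut both act only on the centre tap.
    fused[1][1] += k1 + 1.0
    return fused


# Hypothetical toy values, chosen only for illustration.
image = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0], [2.0, 0.0, 1.0]]
k3 = [[0.1, -0.2, 0.0], [0.3, 0.5, -0.1], [0.0, 0.2, 0.4]]
k1 = 0.7

train_out = multi_branch(image, k3, k1)
deploy_out = conv2d_same(image, fuse_branches(k3, k1))
max_diff = max(abs(a - b)
               for row_a, row_b in zip(train_out, deploy_out)
               for a, b in zip(row_a, row_b))
```

Because convolution is linear, the fused kernel reproduces the multi-branch output up to floating-point rounding, while inference needs only one convolution per block.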
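For the bounding-box regression branch, the CIoU loss can be sketched as follows. This uses the standard published form of the loss (an IoU term, a normalized centre-distance term, and an aspect-ratio consistency term); the box format `(x1, y1, x2, y2)`, the function name, and the `eps` stabilizer are assumptions, and the paper's implementation may differ in detail.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss between two boxes given as (x1, y1, x2, y2) corner tuples."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


# Hypothetical boxes, chosen only for illustration.
gt = (0.0, 0.0, 2.0, 2.0)
near = (0.5, 0.5, 2.5, 2.5)
far = (5.0, 5.0, 7.0, 7.0)
loss_same = ciou_loss(gt, gt)
loss_near = ciou_loss(near, gt)
loss_far = ciou_loss(far, gt)
```

Unlike a plain IoU loss, the centre-distance term keeps the loss informative when the boxes do not overlap, so a prediction that is further from the target is still penalized more.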

     
