Streaming Perception-Based Visual Localization Method in Dynamic Scenes
Abstract: Simultaneous Localization and Mapping (SLAM) estimates camera poses from visual sensor data to achieve localization. In dynamic scenes, SLAM is often combined with deep learning methods to improve localization accuracy. However, deep learning methods typically demand substantial computational resources, introducing time delays during system operation that hinder streaming processing. To address these problems, this paper proposes a streaming perception-based visual localization method for SLAM in dynamic scenes, built on a streaming evaluation metric. This metric accounts for both localization accuracy and algorithmic time delay, and thus accurately reflects the system's streaming performance. On this basis, the method combines multi-thread parallelism with camera pose prediction to produce continuous and stable camera pose outputs, achieving streaming perception-based visual localization. Experimental results demonstrate that the proposed method effectively improves the streaming performance of deep-learning-based visual localization in dynamic scenes.
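The two ideas the abstract names, a latency-aware evaluation and camera pose prediction, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the position-only pose representation, and the constant-velocity prediction model are assumptions for illustration. The key point of a streaming metric is that each ground-truth timestamp is compared against the most recent pose the system had *emitted* by that time, so slow outputs are penalized even if they are eventually accurate.

```python
import numpy as np

def streaming_ate(gt_times, gt_poses, out_times, out_poses):
    """Streaming-style absolute trajectory error (illustrative sketch).

    gt_times/gt_poses: ground-truth timestamps and 3-D positions.
    out_times/out_poses: timestamps at which the system EMITTED each pose
    (i.e., frame time plus processing delay) and the emitted 3-D positions.
    Each ground-truth instant is matched to the latest pose already
    available at that instant, so latency directly increases the error.
    """
    errors = []
    j = -1  # index of the latest pose emitted so far
    for t, gt in zip(gt_times, gt_poses):
        # Advance to the most recent output emitted no later than t.
        while j + 1 < len(out_times) and out_times[j + 1] <= t:
            j += 1
        if j < 0:
            continue  # nothing emitted yet; skipped here for simplicity
        errors.append(np.linalg.norm(np.asarray(gt) - np.asarray(out_poses[j])))
    return float(np.sqrt(np.mean(np.square(errors)))) if errors else float("inf")

def predict_pose(p_prev, p_curr, dt, dt_ahead):
    """Constant-velocity pose prediction (an assumed, simple model):
    extrapolate the last observed motion forward by dt_ahead seconds,
    so the system can output a pose while the slow pipeline still runs."""
    v = (np.asarray(p_curr) - np.asarray(p_prev)) / dt
    return np.asarray(p_curr) + v * dt_ahead
```

For example, if the system lags by one frame, each ground-truth instant is compared with the previous frame's pose, and the resulting error grows with camera speed; filling the gap with `predict_pose` reduces that penalty.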