Streaming Perception-Based Visual Localization Method in Dynamic Scenes
Abstract: Simultaneous Localization and Mapping (SLAM) estimates camera poses from visual sensor data to achieve localization. In dynamic scenes, SLAM is often combined with deep learning methods to improve localization accuracy. However, deep learning methods typically demand substantial computational resources, introducing time delays during system operation that hinder streaming processing. To address these problems, this paper proposes a streaming perception-based visual localization method for SLAM in dynamic scenes, built on a streaming evaluation metric. This metric accounts for both localization accuracy and algorithmic time delay, and thus accurately reflects the system's streaming performance. On this basis, the method combines multi-thread parallelism with camera pose prediction to produce continuous and stable camera pose outputs, achieving streaming perception-based visual localization. Experimental results demonstrate that the proposed method effectively improves the streaming performance of deep-learning-based visual localization in dynamic scenes.
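The two ideas the abstract names, a latency-aware evaluation and camera pose prediction, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the position-only pose representation, and the constant-velocity prediction model are assumptions for illustration. The key point of a streaming metric is that each ground-truth timestamp is compared against the most recent pose the system had *emitted* by that time, so slow outputs are penalized even if they are eventually accurate.

```python
import numpy as np

def streaming_ate(gt_times, gt_poses, out_times, out_poses):
    """Streaming-style absolute trajectory error (illustrative sketch).

    gt_times/gt_poses: ground-truth timestamps and 3-D positions.
    out_times/out_poses: timestamps at which the system EMITTED each pose
    (i.e., frame time plus processing delay) and the emitted 3-D positions.
    Each ground-truth instant is matched to the latest pose already
    available at that instant, so latency directly increases the error.
    """
    errors = []
    j = -1  # index of the latest pose emitted so far
    for t, gt in zip(gt_times, gt_poses):
        # Advance to the most recent output emitted no later than t.
        while j + 1 < len(out_times) and out_times[j + 1] <= t:
            j += 1
        if j < 0:
            continue  # nothing emitted yet; skipped here for simplicity
        errors.append(np.linalg.norm(np.asarray(gt) - np.asarray(out_poses[j])))
    return float(np.sqrt(np.mean(np.square(errors)))) if errors else float("inf")

def predict_pose(p_prev, p_curr, dt, dt_ahead):
    """Constant-velocity pose prediction (an assumed, simple model):
    extrapolate the last observed motion forward by dt_ahead seconds,
    so the system can output a pose while the slow pipeline still runs."""
    v = (np.asarray(p_curr) - np.asarray(p_prev)) / dt
    return np.asarray(p_curr) + v * dt_ahead
```

For example, if the system lags by one frame, each ground-truth instant is compared with the previous frame's pose, and the resulting error grows with camera speed; filling the gap with `predict_pose` reduces that penalty.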