Aggregate and Excitate Sparse Spatial Feature: Single-Stage 3D Object Detector from Point Clouds
Abstract: Single-stage voxel-based 3D object detectors for point clouds suffer from a fixed receptive field and a single feature scale, which leads to insufficient learning of point cloud features and limits detection performance. To address these problems, a voxel-based single-stage 3D object detection model that can be trained end-to-end is proposed. First, a multi-scale sparse spatial feature aggregation module aggregates point cloud features at different sparse spatial scales, so that the features fully preserve the spatial information of the point cloud. Then, a feature hierarchical excitation module learns the features hierarchically through multi-scale receptive fields, which strengthens the feature representation and reduces the influence of noise on the detection results. Finally, the features are fed into the detection head for classification and regression of candidate boxes. The model is compared with mainstream single-stage 3D object detectors on the public autonomous driving dataset KITTI, covering 3 object categories and 9 difficulty levels in total. The proposed model clearly improves the average precision on 5 of the 9 levels and performs especially well on objects with sparse point clouds. The experimental results show that the proposed model can fully extract the spatial information of point clouds and effectively learn their multi-scale features.
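The idea of aggregating features at different sparse spatial scales can be illustrated with a toy sketch. The code below is not the paper's module, which the abstract does not specify in detail. It is a minimal pure-Python illustration under stated assumptions: the names `voxelize` and `aggregate_multiscale`, the voxel centroid as the per-voxel feature, and the scales (1, 2, 4) are all hypothetical choices made for this example. Only non-empty voxels are stored, which is what makes the representation sparse.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxelize(points: Sequence[Point], voxel_size: float) -> Dict[VoxelKey, List[float]]:
    """Sparse voxelization: map each non-empty voxel to the centroid of its points."""
    buckets: Dict[VoxelKey, List[Point]] = defaultdict(list)
    for p in points:
        # Integer grid index of the voxel containing this point.
        key = tuple(int(c // voxel_size) for c in p)
        buckets[key].append(p)
    # Hypothetical per-voxel feature: the centroid of the points in the voxel.
    return {
        key: [sum(axis) / len(pts) for axis in zip(*pts)]
        for key, pts in buckets.items()
    }


def aggregate_multiscale(
    points: Sequence[Point],
    base_size: float = 1.0,
    scales: Sequence[int] = (1, 2, 4),
) -> Dict[VoxelKey, List[float]]:
    """For each non-empty finest voxel, concatenate the features of the
    voxels that enclose it at every scale (assumes scales[0] == 1)."""
    grids = {s: voxelize(points, base_size * s) for s in scales}
    features: Dict[VoxelKey, List[float]] = {}
    for key in grids[scales[0]]:
        feat: List[float] = []
        for s in scales:
            # Index of the enclosing voxel on the coarser grid.
            parent = tuple(k // s for k in key)
            feat.extend(grids[s][parent])
        features[key] = feat
    return features
```

Calling `aggregate_multiscale(points)` on a list of `(x, y, z)` tuples returns one concatenated feature vector per occupied finest-scale voxel. A real detector would more likely produce the per-scale features with learned sparse convolutions rather than centroids; the sketch only shows how features from several spatial scales can be gathered onto a single sparse grid.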