Abstract:
3D object detection is a critical task in autonomous driving. Existing methods often suffer from complex architectures and high computational costs during cross-modal feature interaction, which limits their applicability. To address these challenges, this paper proposes a 3D object detection method based on cross-modal feature interaction and fusion. Specifically, a bird's-eye-view-based spatial feature alignment module is designed to achieve cross-modal feature alignment, thereby improving detection accuracy. To obtain a more comprehensive feature representation, we introduce a fusion strategy that combines multi-scale large-kernel convolutions with a cross-modal feature selection branch, which is more efficient and concise than traditional explicit alignment methods. Experimental results on the nuScenes dataset show that, compared with the baseline, our approach improves the nuScenes detection score (NDS) by 0.2 and the mean Average Precision (mAP) by 0.6, validating its effectiveness for multimodal 3D object detection.
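To make the fusion strategy concrete, the following is a minimal sketch of what a multi-scale large-kernel convolution fusion with a cross-modal feature selection (gating) branch could look like in PyTorch. All class names, kernel sizes, and module choices here are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class MultiScaleLargeKernelFusion(nn.Module):
    """Hypothetical sketch of the fusion described in the abstract:
    multi-scale large-kernel depthwise convolutions over concatenated
    camera/LiDAR BEV features, plus a channel-wise selection (gating)
    branch. Names and hyperparameters are assumptions."""

    def __init__(self, channels: int, kernel_sizes=(7, 11, 15)):
        super().__init__()
        # Reduce concatenated camera+LiDAR features back to `channels`.
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # Depthwise large-kernel branches at several receptive-field scales.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        # Cross-modal feature selection: a per-channel gate computed
        # from the global context of both modalities.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, cam_bev: torch.Tensor, lidar_bev: torch.Tensor):
        # Both inputs: (B, C, H, W) BEV feature maps, assumed to be
        # spatially aligned by the upstream alignment module.
        x = torch.cat([cam_bev, lidar_bev], dim=1)
        fused = self.reduce(x)
        multi_scale = sum(branch(fused) for branch in self.branches)
        return multi_scale * self.gate(x)  # gated multi-scale fusion


# Usage: fuse two pre-aligned 128-channel BEV feature maps.
fusion = MultiScaleLargeKernelFusion(channels=128)
out = fusion(torch.randn(2, 128, 180, 180), torch.randn(2, 128, 180, 180))
```

The appeal of this design, as the abstract suggests, is that the gating branch replaces an explicit alignment stage: instead of warping one modality onto the other, the network learns to weight channels from each modality implicitly, keeping the fusion path cheap and concise.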