Asymmetric 2D-3D Dense Correspondence Computation from Coarse-to-Fine

Abstract: Existing 2D-3D correspondence computation methods overlook the asymmetry in viewpoint and modality between images and point clouds and do not fully exploit deep image features, which degrades dense correspondence quality. To address this, we propose a spatial-consistency asymmetric 2D-3D dense correspondence computation method based on a coarse-to-fine matching strategy. First, the 2D image and the 3D point cloud are fed into an image backbone containing a window fully-connected conditional random field (WFC-CRF) module and a point cloud backbone with positional embedding, respectively, to mine the latent information in both modalities across different dimensions and scales. Then, in the coarse correspondence stage, spatial-consistency multi-scale matching is performed between image patches and point cloud blocks, and the Sinkhorn algorithm post-processes the coarsely matched patch-level features, reducing the correspondence errors caused by viewpoint and modality asymmetry and yielding coarse patch-level correspondences. Finally, in the fine correspondence stage, resampling and an attention mechanism refine these into point-level asymmetric 2D-3D dense correspondences. The method achieves an inlier ratio of 60.5%, a registration recall of 79.5%, and a feature recall of 93.7% on the 7Scenes dataset, and 37.2%, 63.2%, and 93.5%, respectively, on the RGB-D Scenes V2 dataset, effectively improving the accuracy of dense correspondence computation while demonstrating strong generalization.
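
In the coarse stage, the patch-level score matrix between image patches and point cloud blocks is post-processed with the Sinkhorn algorithm. The abstract does not spell out the exact formulation, so the following is only a minimal sketch of Sinkhorn-style normalization of a cross-modal similarity matrix; the feature dimension, temperature, and iteration count are illustrative assumptions, not the paper's settings.

```python
import torch

def sinkhorn_matching(img_feats: torch.Tensor, pcd_feats: torch.Tensor,
                      n_iters: int = 10, temperature: float = 0.1) -> torch.Tensor:
    """Sinkhorn-normalized soft assignment between M image patches
    and N point cloud blocks (hypothetical hyperparameters)."""
    # Cosine similarities between L2-normalized patch/block descriptors.
    img_feats = torch.nn.functional.normalize(img_feats, dim=-1)
    pcd_feats = torch.nn.functional.normalize(pcd_feats, dim=-1)
    log_scores = img_feats @ pcd_feats.t() / temperature  # (M, N)

    # Alternate row and column normalization in log space; the result
    # approaches a doubly stochastic soft assignment matrix.
    for _ in range(n_iters):
        log_scores = log_scores - torch.logsumexp(log_scores, dim=1, keepdim=True)
        log_scores = log_scores - torch.logsumexp(log_scores, dim=0, keepdim=True)
    return log_scores.exp()

# Mutually most-confident pairs serve as coarse block-level correspondences.
P = sinkhorn_matching(torch.randn(64, 256), torch.randn(80, 256))
row_best, col_best = P.argmax(dim=1), P.argmax(dim=0)
mutual = col_best[row_best] == torch.arange(P.shape[0])
coarse_pairs = torch.stack([torch.arange(P.shape[0])[mutual], row_best[mutual]], dim=1)
```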

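In the fine stage, pixels within each matched patch are resampled and an attention mechanism refines the match down to the point level. As an illustration only (the module, layer sizes, and head count below are assumptions, not the paper's architecture), a cross-attention refiner over one matched patch/block pair might look like:

```python
import torch
import torch.nn as nn

class FineCorrespondenceHead(nn.Module):
    """Illustrative cross-attention refinement for one matched
    image-patch / point-block pair (hypothetical sizes)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, point_feats: torch.Tensor, pixel_feats: torch.Tensor) -> torch.Tensor:
        # point_feats: (B, Q, C) features of points inside a matched block.
        # pixel_feats: (B, P, C) resampled pixel features of the matched patch.
        attended, _ = self.cross_attn(point_feats, pixel_feats, pixel_feats)
        fused = self.norm(point_feats + attended)
        # Point-to-pixel similarities; softmax gives a dense soft assignment.
        sim = torch.einsum("bqc,bpc->bqp", fused, pixel_feats)
        return sim.softmax(dim=-1)

# Each point's dense 2D-3D correspondence is the argmax pixel of its row.
head = FineCorrespondenceHead()
assignment = head(torch.randn(1, 128, 256), torch.randn(1, 96, 256))
point_to_pixel = assignment.argmax(dim=-1)  # (1, 128)
```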
