Local Correspondence Aware Method for Cross-Modal Learning on Point Clouds
Graphical Abstract
Abstract
To address the insufficient exploration of feature complementarity and correlation in cross-modal learning, this paper proposes a novel local correspondence aware method for cross-modal learning on point clouds. Within a dual-channel learning framework, a local correspondence aware module computes the local semantic correlation between point cloud features and image features via a constructed image semantic guidance matrix, and enhances the point cloud feature representation through attention weighting. A residual mechanism further provides semantic feature compensation, strengthening the semantic guidance in cross-modal feature learning. In addition, a self-supervised cross-modal learning strategy incorporates 3D point contrastive learning guided by 2D image semantic features, establishing fine-grained feature associations both across and within modalities and improving the adaptability of feature learning. Finally, the model reconstructs inputs in the image and point cloud feature spaces and jointly optimizes a reconstruction loss, a contrastive loss, and a cross-modal consistency loss, significantly improving the learning performance of the network. Experimental results demonstrate that the proposed method improves information interaction in cross-modal learning and, through image semantic guidance, enhances the robustness of feature learning. Evaluated via linear probing, the method achieves 91.61% classification accuracy and 86.4% segmentation accuracy on 3D shape tasks, outperforming the baseline by 5.37 and 1.2 percentage points on average, respectively.
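The core operation described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the function name, the use of cosine similarity as the guidance matrix, and the plain residual addition are all assumptions made for illustration. It shows one plausible reading of the pipeline: build a point-to-image semantic guidance matrix, turn it into per-point attention over image regions, gather image semantics onto each point, and apply residual semantic compensation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_correspondence_enhance(point_feats, image_feats):
    """Hypothetical sketch of local correspondence aware enhancement.

    point_feats: (N_points, D) point cloud features
    image_feats: (N_regions, D) image region features
    Returns point features enhanced by attention-weighted image semantics.
    """
    # Guidance matrix: cosine-style similarity between every point and
    # every image region (an assumed form of the "semantic guidance matrix").
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    i = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    guidance = p @ i.T                      # (N_points, N_regions)

    # Attention weighting: each point attends over image regions.
    attn = softmax(guidance, axis=1)

    # Gather image semantics per point, then residual compensation.
    guided = attn @ image_feats             # (N_points, D)
    return point_feats + guided
```

A usage note: with `point_feats` of shape `(N, D)` and `image_feats` of shape `(M, D)`, the output keeps the point feature shape `(N, D)`, so the module can be dropped between encoder stages without changing downstream dimensions.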