PU Zheng-dong, CHEN Shu, ZOU Bei-ji, PU Bao-xing. A Self-Supervised Monocular Depth Estimation Method Based on High Resolution Convolutional Neural Network[J]. Journal of Computer-Aided Design & Computer Graphics, 2023, 35(1): 118-127. DOI: 10.3724/SP.J.1089.2023.19299

A Self-Supervised Monocular Depth Estimation Method Based on High Resolution Convolutional Neural Network

  • Abstract: Deep-learning-based monocular depth estimation relies on multiple levels of downsampling, which causes the reconstructed depth to lose detail and blurs edge contours. To address these problems, a self-supervised monocular depth estimation method based on a high-resolution convolutional network is proposed. First, parallel connections in the encoder keep the feature maps at high resolution throughout encoding, so that detail is preserved. Second, to improve the learning ability of the encoder, an attention module is introduced in the encoder to filter and refine the image features. Finally, to address the ambiguity of depth estimation, an effective loss function is designed that exploits the consistency between non-adjacent frames, and a reliability mask is used to eliminate the interference of moving points and occluded points. The method is implemented in the TensorFlow framework and evaluated on the KITTI and Cityscapes datasets. Compared with existing depth estimation methods, it not only preserves the edge information of the predicted depth but also improves its accuracy, reaching an average relative error of 0.119.
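
The abstract reports an average relative error of 0.119 without defining the metric on this page. The following is a minimal sketch, assuming the figure refers to the absolute relative error commonly reported for depth estimation, i.e. the mean of |d_pred - d_gt| / d_gt over pixels that have ground truth. The function name `abs_rel_error` and the sample values are hypothetical and are not taken from the paper.

```python
def abs_rel_error(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth.

    Assumption: this is the usual "Abs Rel" depth metric; the paper's exact
    evaluation protocol (depth cap, cropping, scaling) is not given here.
    """
    # Keep only pixels where ground-truth depth is available.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Hypothetical depths in meters; 0.0 marks a pixel without ground truth.
pred = [10.0, 22.0, 27.0, 5.0]
gt = [10.0, 20.0, 30.0, 0.0]
print(abs_rel_error(pred, gt))
```

Because each pixel's error is divided by its true depth, a 1 m error at 10 m counts as much as a 3 m error at 30 m, which keeps distant regions from dominating the score.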

     
