Self-Supervised Depth Estimation of Monocular Thermal Images Based on Feature Transformation
-
Graphical Abstract
-
Abstract
Self-supervised depth estimation from thermal images using an encoder-decoder structure has demonstrated high accuracy and reliability in complex scenes. However, the physical limitations of thermal images, such as a low signal-to-noise ratio and limited contrast, make it difficult for such methods to generate effective supervisory signals during network training and to accurately identify long-distance information, which degrades the precision of the depth map. To address these issues, we propose a novel self-supervised monocular depth estimation method whose core is an encoder-decoder structure based on feature transformation. First, the raw thermal image features extracted by the encoder are transformed and linearly combined to produce an uncertain feature map, which enhances the network’s sensitivity to features and improves the model’s adaptability to complex scenes. Then, a new normalization module is designed to optimize the decoder so that it uses feature information effectively and generates high-quality depth maps. We tested the proposed method on the ViViD++ dataset and compared it with state-of-the-art methods. Experimental results show that the proposed method reduces the absolute error by 12%, the relative error by 10%, and the root mean square error by 11%. We therefore conclude that the proposed method is highly accurate: it not only improves the quality of the depth maps but also exhibits strong robustness.
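The error reductions reported above can be made concrete with a small sketch. The abstract does not define the three metrics, so the code below assumes common definitions: mean absolute error, mean absolute relative error, and root mean square error over pixels with valid ground truth. The function names `depth_errors` and `relative_reduction`, the `min_depth` threshold, and the sample arrays are illustrative and are not taken from the paper.

```python
import numpy as np


def depth_errors(pred, gt, min_depth=1e-3):
    """Return (absolute error, absolute relative error, RMSE).

    Assumed definitions, not confirmed by the paper. Only pixels whose
    ground-truth depth exceeds `min_depth` are evaluated.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > min_depth              # skip pixels without ground truth
    diff = pred[valid] - gt[valid]
    abs_err = np.mean(np.abs(diff))
    abs_rel = np.mean(np.abs(diff) / gt[valid])
    rmse = np.sqrt(np.mean(diff ** 2))
    return abs_err, abs_rel, rmse


def relative_reduction(baseline, proposed):
    """Fractional reduction of an error metric with respect to a baseline."""
    return (baseline - proposed) / baseline


# Illustrative depth maps in metres; a ground-truth value of 0 marks a
# pixel with no measurement.
gt = np.array([[1.0, 5.0], [0.0, 10.0]])
pred = np.array([[2.0, 4.0], [3.0, 10.0]])
abs_err, abs_rel, rmse = depth_errors(pred, gt)
```

Under these assumptions, a statement such as "reduces the root mean square error by 11%" corresponds to `relative_reduction(rmse_baseline, rmse_proposed)` being about 0.11, where both values are computed on the same test set.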
-
-