High-Speed Semantic Segmentation Based on Dual-Path Feature Fusion Codec Structure

胡学刚; 龚宇; 敬力源

doi:10.3724/SP.J.1089.2022.19255


Abstract

High-precision deep learning models for image semantic segmentation tend to have many parameters and segment slowly. To address this, a semantic segmentation model based on a dual-path feature fusion encoder-decoder structure is proposed. First, the encoder encodes a semantic path and a spatial path simultaneously, which lets it fuse different kinds of feature information, mitigates the usual trade-off between spatial and semantic information, and allows the feature maps to be convolved efficiently. Second, the decoder fuses high-level semantic information with low-level spatial information, effectively compensating for the feature information lost to down-sampling during encoding. Experiments on the Cityscapes and CamVid datasets show that the whole model has only 3.91×10⁶ parameters and reaches a mean intersection over union (mIoU) of 67.7% and 65.8% on the two datasets, at segmentation speeds of 111 frames/s and 86 frames/s, respectively. Compared with other models of the same kind, the proposed model has fewer parameters and higher accuracy, and its segmentation speed far exceeds the 24 frames/s minimum required for real-time semantic segmentation.
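The accuracy figures above are mean intersection over union (mIoU) scores. As a minimal sketch of how this metric is commonly computed, the pure-Python function below builds a confusion matrix from flat label sequences and averages the per-class IoU. The function name `mean_iou`, the `ignore_label` parameter, and the toy labels are illustrative assumptions, not the authors' evaluation code; skipping classes that appear in neither prediction nor ground truth is one common convention and may differ from the benchmark scripts used in the paper.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean intersection over union for two flat sequences of class labels.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # pixels marked "ignore" in the ground truth are not scored
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp   # predicted c, true class otherwise
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (made-up labels)
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is about 0.5. For real images the prediction and ground-truth maps would be flattened to one label per pixel before being passed in.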
