Citation: He Zifen, Huang Junxuan, Zhang Yinhui, Zhu Shouye. Traffic Street Scene Instance Segmentation Based on Adaptive Regulatory Convolution and Dual-Path Information Embedding[J]. Journal of Computer-Aided Design & Computer Graphics, 2023, 35(7): 1086-1096. DOI: 10.3724/SP.J.1089.2023.19561

Traffic Street Scene Instance Segmentation Based on Adaptive Regulatory Convolution and Dual-Path Information Embedding

Abstract: Instance segmentation of urban street scenes is a key technology for autonomous driving. To address the dense instances, blurred edges, and severe background interference typical of urban street scenes, this paper proposes RENet, an instance segmentation model built on adaptive regulatory convolution and dual-path information embedding. First, adaptive regulatory convolution replaces the original residual structure: deformable convolution learns offsets for the spatial sampling positions, which improves the model's ability to represent complex image deformations; channel shuffle is applied to the multi-branch structure to strengthen information flow between channels; and an attention mechanism adaptively recalibrates the channel weights. Together, these changes improve segmentation accuracy for blurred and dense objects in complex scenes. Second, a low-dimensional spatial information embedding branch applies spatial information excitation and re-encoding to feature maps at different scales, embedding low-dimensional spatial information into abstract semantic features to improve contour segmentation accuracy. Finally, a high-level semantic information embedding module aligns feature maps with semantic boxes, narrowing the semantic and resolution gaps between feature maps and making feature fusion across scales more effective. Experiments on a self-built dataset show that, compared with the original YOLACT model, RENet reaches an average segmentation accuracy of up to 51.6% against complex street backgrounds, an improvement of 10.4 percentage points, with an inference speed of 17.5 frames/s, which supports the model's effectiveness and its practicality for engineering use.
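Two of the operations named in the abstract, channel shuffle across a multi-branch structure and attention-based calibration of channel weights, can be illustrated with a minimal sketch. This is not the RENet implementation, which the abstract does not specify. It shows the generic ShuffleNet-style shuffle and a deliberately simplified gate, sigmoid of the channel mean, in place of the learned layers a real attention block would use. The function names `channel_shuffle` and `recalibrate` are hypothetical.

```python
import math


def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic ShuffleNet-style shuffle)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # View the list as a (groups, per_group) grid, transpose it, then flatten.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def recalibrate(feature_maps):
    """Scale each channel by a gate computed from its global average.

    Assumption for illustration: the gate is sigmoid(mean). A real
    channel-attention block would pass the averages through learned layers.
    """
    out = []
    for fmap in feature_maps:  # fmap: one channel, a list of rows of floats
        values = [v for row in fmap for v in row]
        mean = sum(values) / len(values)
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in fmap])
    return out


# Two branches of three channels each; after the shuffle the branches alternate.
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
print(shuffled)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

After the shuffle, every consecutive pair of channels contains one channel from each branch, so a following grouped operation sees information from both branches.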

     
