Citation: Yan Chengliang, Chen Guangzhu, Yi Jia, Gou Rongsong, Liao Xiaojuan. Lightweight Semantic Segmentation of Intelligent Workshop Scene Objects Combining Multi-Scale and Attention Mechanisms[J]. Journal of Computer-Aided Design & Computer Graphics, 2022, 34(10): 1626-1636. DOI: 10.3724/SP.J.1089.2022.19378

Lightweight Semantic Segmentation of Intelligent Workshop Scene Objects Combining Multi-Scale and Attention Mechanisms

Abstract: Semantic-level segmentation and recognition of scene objects underpins intelligent navigation and intelligent security for mobile robots in workshop scenes. Object semantic segmentation in workshop scenes faces multi-scale problems: the objects to be segmented span many categories and differ greatly in shape. To address these problems while meeting the real-time requirements of intelligent workshop object segmentation, a lightweight semantic segmentation network that fuses dual-path average pooling with a three-branch attention mechanism is proposed. First, an encoder-decoder structure is adopted, with a lightweight convolutional neural network as the encoder of the whole network. The decoder comprises a dual-path average pooling module and a three-branch attention mechanism module, which extract the semantic information of multi-scale objects and enable high-precision semantic segmentation. Then, three lightweight convolutional neural networks, ShuffleNet v2, SqueezeNet, and MobileNet v2, are each combined with the decoder and compared on a semantic segmentation dataset of intelligent workshop scene objects; MobileNet v2 is selected as the encoder. Comparative experiments on segmentation accuracy and real-time performance against the lightweight semantic segmentation networks ENet, ERFNet, BiSeNet v2, Deeplab v3, Deeplab v3+, CFPNet, and Fast-SCNN show that the proposed network achieves an MPA of 94.25%, an MIoU of 87.67%, 1.66×10^9 floating-point operations (FLOPs), and an inference speed of 66.67 frames per second on workshop scene objects. It thus balances segmentation accuracy and real-time performance well and meets the needs of object semantic segmentation in intelligent workshop scenes.
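The abstract reports accuracy as MPA (mean pixel accuracy) and MIoU (mean intersection over union) but does not spell out how they are computed. As a minimal sketch, assuming the commonly used confusion-matrix definitions (the paper's exact evaluation protocol may differ), the two metrics can be computed as follows. The function names and the toy label lists are hypothetical and chosen only for illustration; labels are given as flattened per-pixel class indices.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """MPA: per-class ratio of correct pixels to ground-truth pixels, averaged."""
    accs = []
    for k, row in enumerate(cm):
        total = sum(row)
        if total > 0:  # skip classes absent from the ground truth
            accs.append(row[k] / total)
    return sum(accs) / len(accs)


def mean_iou(cm):
    """MIoU: per-class TP / (TP + FP + FN), averaged over classes that occur."""
    n = len(cm)
    ious = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class-k pixels that were missed
        fp = sum(cm[j][k] for j in range(n)) - tp   # other pixels labeled as class k
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: 10 pixels, 3 classes (illustrative data, not from the paper).
truth = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]

cm = confusion_matrix(truth, pred, 3)
mpa = mean_pixel_accuracy(cm)
miou = mean_iou(cm)
print(f"MPA = {mpa:.4f}, MIoU = {miou:.4f}")
```

Under these definitions MIoU also penalizes false positives, so it is typically lower than MPA for the same prediction, which is consistent with the reported 87.67% MIoU against 94.25% MPA. For the speed figure, 66.67 frames per second corresponds to roughly 15 ms per frame.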

     
