A High-Resolution Semantic Segmentation Method via Multi-branch Structure and Gating Mechanism
Abstract: Multi-branch networks such as HRNetv2 cannot effectively fuse multi-level features in semantic segmentation tasks. To address this problem, a new multi-level feature fusion method based on a gating mechanism is proposed. First, a gated fusion unit is constructed that uses the gating mechanism to selectively fuse feature information from multiple branches. Second, a bottom-up fusion method is proposed that enhances the feature representation of every branch by propagating semantically rich high-level features and detail-rich low-level features in a stepwise manner. Finally, the features of all branches are concatenated along the channel dimension to produce the prediction, which is restored to the original image size by bilinear interpolation. Experimental results show that, with only a small increase in parameters, the method achieves 77.01% mIoU and 80.43% mIoU on the PASCAL VOC 2012 + Aug and Cityscapes datasets, respectively, improving on HRNetv2-W48 by 1.14% mIoU and 1.92% mIoU, and outperforms many baseline models.
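The idea of selectively fusing two branches can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so everything here is an assumption made for illustration: the gate is taken to be a sigmoid that forms a convex blend of the two inputs, scalar weights (`w_high`, `w_low`, `bias`) stand in for the learned layers, feature maps are single-channel nested lists, and bilinear resizing uses the align-corners convention. The function names `bilinear_resize` and `gated_fuse` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bilinear_resize(fmap, out_h, out_w):
    """Resize an H x W nested list by bilinear interpolation.

    Assumption: align-corners convention (corner samples map to corners).
    """
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for i in range(out_h):
        # Fractional source row for output row i.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            # Fractional source column for output column j.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
            bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def gated_fuse(high_res, low_res, w_high=1.0, w_low=1.0, bias=0.0):
    """Fuse a high-resolution map with an upsampled low-resolution map.

    Hypothetical gate (not the paper's formula):
        g   = sigmoid(w_high * a + w_low * b + bias)
        out = g * a + (1 - g) * b
    """
    h, w = len(high_res), len(high_res[0])
    up = bilinear_resize(low_res, h, w)  # match spatial sizes first
    fused = []
    for i in range(h):
        row = []
        for j in range(w):
            a, b = high_res[i][j], up[i][j]
            g = sigmoid(w_high * a + w_low * b + bias)
            row.append(g * a + (1 - g) * b)
        fused.append(row)
    return fused
```

Because the gate lies in (0, 1), each fused value in this sketch stays between the two inputs at that position; a gate near 1 keeps the high-resolution detail, and a gate near 0 favors the upsampled semantic feature. In a real network the gate would be computed by learned layers over multi-channel tensors rather than fixed scalars.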