Citation: Guanghui LIU, Hua ZHAN, Yuebo MENG, Bo WANG, Bo WANG. Multi-scale Salient Features Bilinear Attention Fine-grained Classification Method[J]. Journal of Computer-Aided Design & Computer Graphics.

Multi-scale Salient Features Bilinear Attention Fine-grained Classification Method

Abstract: To address the difficulty of capturing subtle discriminative features and localizing regions of interest in fine-grained image classification, a multi-scale salient feature bilinear attention classification method is proposed. First, a region patch feature boosting module (RPFBM) is designed: a region slicing operation enlarges and captures fine distinguishable features, enhancing the expressive power of the feature maps. Then, a multi-branch bilinear attention pooling (MBAP) strategy is proposed, which hierarchically represents the salient part features of an object in a weakly supervised manner and improves attention to local information at different scales. Finally, counterfactual learning is employed to quantify attention quality: the difference between the predictions produced by the actually learned attention and by irrelevant attention is taken as a measurement, and maximizing this difference forces the bilinear attention pooling strategy to learn more effective features. The proposed method achieves accuracies of 89.3%, 95.0%, and 87.6% on the three public datasets CUB-200-2011, Stanford Cars, and Stanford Dogs, respectively, a significant improvement over other state-of-the-art methods.

     
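The two core components the abstract names, bilinear attention pooling and the counterfactual attention-quality measure, can be sketched as below. This is a minimal illustrative numpy sketch under stated assumptions, not the paper's implementation: the function names, tensor shapes, and the pluggable `classifier` are all hypothetical.

```python
import numpy as np

def bilinear_attention_pooling(features, attentions):
    """Pool features under each attention map into a part-feature matrix.

    features:   (C, H, W) feature maps from a backbone network
    attentions: (M, H, W) attention maps, one per object part
    returns:    (M, C) matrix; row m is the spatial average of A_m * F
    """
    _, H, W = features.shape
    # einsum multiplies each attention map elementwise with every feature
    # channel and sums over the spatial dimensions (h, w).
    return np.einsum('mhw,chw->mc', attentions, features) / (H * W)

def counterfactual_effect(features, attentions, classifier, rng):
    """Quantify attention quality as the difference between predictions
    from the learned attention and from irrelevant (random) attention."""
    factual = classifier(bilinear_attention_pooling(features, attentions))
    random_att = rng.random(attentions.shape)  # counterfactual: random attention
    counterfactual = classifier(bilinear_attention_pooling(features, random_att))
    return factual - counterfactual            # maximized during training
```

During training, the negative of this difference (alongside the usual classification loss) would be minimized, pushing the learned attention maps to be more informative than chance.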

