Fine-Grained Image Recognition via Multi-Part Learning
Abstract
Existing fine-grained recognition methods mainly use attention to locate subtle parts. However, convolutional neural networks (CNNs) trained with the cross-entropy loss tend to learn only the most discriminative part and ignore other meaningful regions. In this paper, a novel fine-grained image recognition method based on multi-part learning (MPL) is presented. Firstly, a parameter-free data augmentation method named Semantic Patch Mix is proposed, which improves the network's generalization to the test distribution and its robustness to input perturbations by exchanging the most discriminative part of the image. Secondly, a parameter-free multi-part adversarial erasing module is proposed, which erases the most discriminative region under the guidance of attention and a Bernoulli distribution, forcing the network to discover other discriminative regions of the object. The attention guidance ensures that the erased regions are sufficiently discriminative, and the Bernoulli guidance ensures that the erased regions are diverse. Finally, mid-level features are incorporated to further improve performance. The proposed method is model-agnostic and can therefore serve as a plug-and-play module for various backbone networks. With ResNet-50 as the backbone, the proposed method reaches classification accuracies of 89.2%, 95.5%, and 94.0% on three public datasets, CUB-200-2011, FGVC-Aircraft, and Stanford Cars, respectively. Experimental results show that the proposed method discovers more discriminative parts and outperforms state-of-the-art approaches.
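The attention-guided, Bernoulli-sampled erasing idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the relative `threshold` rule for selecting candidate cells, and the `drop_prob` parameter are all assumptions made for illustration, and the attention map is taken to be a plain 2-D grid of non-negative responses.

```python
import random


def attention_bernoulli_mask(attention, threshold=0.5, drop_prob=0.5, rng=None):
    """Build a 0/1 mask over a 2-D attention map (0 = erased, 1 = kept).

    Hypothetical parameters (not from the paper):
      threshold -- fraction of the peak response a cell must reach
                   to count as "discriminative"
      drop_prob -- Bernoulli probability of erasing each such cell
    """
    if rng is None:
        rng = random.Random()
    peak = max(max(row) for row in attention)
    mask = []
    for row in attention:
        mask_row = []
        for value in row:
            # Attention guidance: only high-response cells are candidates.
            is_candidate = peak > 0 and value >= threshold * peak
            # Bernoulli guidance: erase each candidate independently,
            # so the erased region differs from one draw to the next.
            erased = is_candidate and rng.random() < drop_prob
            mask_row.append(0 if erased else 1)
        mask.append(mask_row)
    return mask


def apply_mask(features, mask):
    """Zero out the erased cells of a 2-D feature map."""
    return [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(features, mask)
    ]
```

With `drop_prob=1.0` every high-attention cell is erased, and with `drop_prob=0.0` nothing is erased; intermediate values give a randomized subset of the discriminative region, which is the source of diversity in this sketch.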