Scene Text Recognition with Eliminating Background Noise and Enhancing Characters Shape Feature
-
Abstract
A text recognition model that eliminates background noise and enhances character shape features is proposed to address two weaknesses of existing methods: they cannot effectively remove background noise, and they remain affected by noise in the characters themselves. The model consists of three modules. The EBEC network, enhanced by a spatial attention mechanism, attends only to character regions; this eliminates background noise and forces the network to learn only character shape features, thereby strengthening them. The feature extraction module extracts feature maps using EfficientNet-B3 as the backbone network. The primitive representation learning module learns a visual text representation from the feature maps and decodes that representation to obtain the recognition result. Experimental results show that the proposed model improves recognition accuracy by 9.76 percentage points over the classical model on the synthetic scene dataset and by 2.91 percentage points on average on the public datasets IIIT5K, ICDAR-2003, ICDAR-2015, and CUTE80. The model therefore effectively eliminates both background noise and character noise while improving recognition performance.
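The abstract does not state how the spatial attention in the EBEC network is computed. Purely as an illustration of the general idea (weighting each spatial position so that character regions are kept and background is suppressed), the sketch below uses a simplified CBAM-style spatial attention: channel-wise average and maximum pooling at each position, a weighted sum, and a sigmoid. The function name `spatial_attention`, the scalar weights `w_avg`, `w_max`, `bias`, and the toy feature map are all assumptions made for this example and are not taken from the proposed model; a real implementation would typically learn a convolution over the pooled maps instead of fixed scalars.

```python
import math


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Apply a simplified spatial attention to a feature map.

    features: list of C channels, each an H x W nested list of floats.
    Returns (attended_features, attention_map), where attention_map is H x W.
    The scalar weights stand in for a learned convolution (an assumption).
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # One attention weight in (0, 1) per spatial position.
    attention = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [features[c][y][x] for c in range(channels)]
            avg_pool = sum(values) / channels  # channel-wise average pooling
            max_pool = max(values)             # channel-wise max pooling
            score = w_avg * avg_pool + w_max * max_pool + bias
            attention[y][x] = 1.0 / (1.0 + math.exp(-score))  # sigmoid

    # Rescale every channel by the shared spatial attention map.
    attended = [
        [[features[c][y][x] * attention[y][x] for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]
    return attended, attention


# Toy 2-channel, 2x2 feature map (hypothetical values): position (0, 0)
# responds strongly, as a character stroke might; the rest is background.
features = [
    [[4.0, -4.0], [-4.0, -4.0]],
    [[4.0, -4.0], [-4.0, -4.0]],
]
attended, attention = spatial_attention(features)
```

In this toy case the attention weight at the strongly responding position is close to 1 and the weights at the remaining positions are close to 0, so the rescaled feature map retains the character-like response and suppresses the background responses.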
-