Anchor-free Object Detection Method in Remote Sensing Images via Adaptive Multi-scale Feature Fusion
Abstract: Remote sensing images are rich and complex in content: they contain many object categories, densely distributed objects, and drastic variations in object size, which makes multi-scale objects, and small objects in particular, difficult to detect. To address this problem, an anchor-free object detection algorithm for remote sensing images is proposed, based on adaptive multi-scale feature fusion (AMFF) and attention feature enhancement (AFE). First, the image features extracted by the backbone network are fed into AMFF, which adaptively fuses features from multiple scales, increasing feature reuse and improving the multi-scale feature representation ability of the network. Next, the features output by AMFF are fed into a detection head equipped with the AFE module. By combining multi-branch dilated convolution with an attention mechanism, AFE improves the network's ability to generalize across object scales while enhancing the effective feature information. Finally, classification and regression are performed to obtain the detection results. Comparative experiments against a variety of mainstream object detection algorithms on the public DIOR and NWPU VHR-10 datasets show that the proposed algorithm achieves average detection accuracies of 72.4% and 87.4% on the two datasets, respectively, which are 9.4% and 13.5% higher than the baseline network and 6.3% and 1.7% higher than the second-best results. The proposed algorithm thus outperforms the mainstream object detection algorithms in average detection accuracy and improves significantly on the baseline network; it detects small-scale objects more accurately and effectively improves the detection accuracy for multi-scale objects.
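As an illustration of what adaptively fusing features from multiple scales can look like, the following minimal sketch resizes each scale to a common resolution and combines them with softmax-normalized weights. The abstract does not state how AMFF computes its weights, so the softmax weighting, the nearest-neighbour resizing, the 1-D feature lists, and all names (`softmax`, `resize_nearest`, `adaptive_fuse`, `logits`) are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def resize_nearest(feature, size):
    """Resize a 1-D feature list to `size` entries by nearest-neighbour sampling."""
    n = len(feature)
    return [feature[min(n - 1, i * n // size)] for i in range(size)]


def adaptive_fuse(features, logits, size):
    """Fuse multi-scale features with softmax-normalized per-scale weights.

    features: list of 1-D feature lists at different resolutions
    logits:   one score per scale (hypothetical; in a real network these
              would be learned, here they are passed in by hand)
    size:     common resolution to which every scale is resized
    """
    weights = softmax(logits)
    resized = [resize_nearest(f, size) for f in features]
    fused = [
        sum(w * r[i] for w, r in zip(weights, resized))
        for i in range(size)
    ]
    return fused, weights


# Three toy "scales": fine, medium, and coarse resolution.
features = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0], [100.0]]
# Equal logits give every scale the same weight.
fused, weights = adaptive_fuse(features, [0.0, 0.0, 0.0], size=4)
print(weights)
print(fused)
```

Raising one entry of `logits` shifts the fused output toward that scale, which is the sense in which such a fusion is adaptive: the network can learn to emphasize the resolution most useful for a given object size.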