Improve Multi-Label Image Classification Using Adversarial Network

Abstract: To classify multi-label images more effectively, an improved convolutional neural network model is proposed. The model learns multi-scale features in multi-label images by fusing multi-level features and applying spatial pyramid pooling, and an adversarial network is designed to generate new samples that assist model training. First, the traditional convolutional neural network is modified by replacing its last layer with a spatial pyramid pooling layer, and parameters pre-trained on ImageNet are transferred to the model. Then, deep and shallow features are fused so that the model better recognizes objects at different scales. Finally, the adversarial network generates occluded samples, making the model robust when recognizing occluded objects. Experiments are carried out on two benchmark datasets. On the Corel5K dataset, the proposed model achieves an average precision of 0.457, an average recall of 0.427, and an mAP of 0.442; on the PASCAL VOC 2012 dataset, its mAP reaches 0.85. The experimental results show that the proposed model is more effective and more robust than many state-of-the-art models.
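The spatial pyramid pooling step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the pyramid levels or the pooling operator, so the levels (1, 2, 4), the use of max pooling, and the function name `spatial_pyramid_pool` are all assumptions made for illustration and are not the authors' implementation. The sketch shows only the general idea: feature maps of any spatial size are reduced to a vector of fixed length.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    Hypothetical helper: the levels and the max operator are assumptions,
    not values taken from the paper.
    """
    pooled = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        for n in levels:
            # Split the channel into an n x n grid of (possibly overlapping) bins.
            for i in range(n):
                # floor/ceil bin edges guarantee every bin holds at least one cell.
                top = (i * height) // n
                bottom = math.ceil((i + 1) * height / n)
                for j in range(n):
                    left = (j * width) // n
                    right = math.ceil((j + 1) * width / n)
                    pooled.append(max(
                        channel[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right)
                    ))
    return pooled


# Two single-channel maps of different spatial sizes.
small = [[[r * 4 + c for c in range(4)] for r in range(4)]]   # 1 x 4 x 4
wide = [[[r * 5 + c for c in range(5)] for r in range(3)]]    # 1 x 3 x 5

# Both yield 1 + 4 = 5 values per channel with levels (1, 2).
print(len(spatial_pyramid_pool(small, levels=(1, 2))))
print(len(spatial_pyramid_pool(wide, levels=(1, 2))))
```

Because the output length depends only on the number of channels and the chosen levels, a classifier placed after this layer can accept images of varying size, which is the usual motivation for this kind of pooling in multi-scale recognition.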

     
