Citation: Sun Zhiyong, Ye Junyong, Wang Tongqing, Lei Li, Lian Jie, Li Yang. Deep Learning Network for Pedestrian Attribute Recognition Based on Dynamic Multi-Task Balancing[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(12): 2144-2151. DOI: 10.3724/SP.J.1089.2019.17654

Deep Learning Network for Pedestrian Attribute Recognition Based on Dynamic Multi-Task Balancing

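  • Abstract (Chinese version, translated): Deep learning networks are a research hotspot in computer vision and artificial intelligence systems. Pedestrian attribute recognition provides structured pedestrian features, which are important information for pedestrian retrieval in security-oriented computer vision. Building on deep learning networks, this paper proposes an end-to-end multi-attribute recognition method. First, an end-to-end pedestrian attribute recognition network is designed on the basis of R*CNN: a region proposal network (RPN) replaces Selective Search for extracting the second most important (auxiliary) regions, yielding a single network that unifies attribute recognition and auxiliary region extraction and improves the accuracy of local and fine-detail attributes. Second, to strengthen the role of the auxiliary regions, the human-body region of interest is divided proportionally into four parts (whole body, head, shoulders to waist, and waist to feet), each corresponding to different attributes. The network splits into four branches at the task-branch layer, and each branch predicts its attributes from its primary region while learning a corresponding secondary region from the RPN to assist the prediction. Finally, a method is proposed for automatically updating the loss weights based on the loss gradients: each weight is inversely related to the gradient of its loss, which prevents any single task from training too fast or too slow. Experiments on a pedestrian attribute dataset show an overall improvement in attribute prediction accuracy and a large reduction in recognition time.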
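The proportional split of the body region of interest into four parts can be illustrated with a minimal sketch. The abstract does not state the actual ratios, so the vertical fractions below, the `split_body_roi` helper, and the part names are all hypothetical choices made only for illustration.

```python
from typing import Dict, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates, with y growing downward.
Box = Tuple[float, float, float, float]

# Hypothetical (top, bottom) fractions of the ROI height for each part.
# The abstract only says the ROI is split "proportionally"; these values are assumed.
PART_FRACTIONS: Dict[str, Tuple[float, float]] = {
    "whole_body": (0.0, 1.0),
    "head": (0.0, 0.25),
    "shoulders_to_waist": (0.25, 0.5),
    "waist_to_feet": (0.5, 1.0),
}


def split_body_roi(
    roi: Box, fractions: Dict[str, Tuple[float, float]] = PART_FRACTIONS
) -> Dict[str, Box]:
    """Return one sub-box per body part, keeping the ROI's full width."""
    x1, y1, x2, y2 = roi
    height = y2 - y1
    return {
        name: (x1, y1 + top * height, x2, y1 + bottom * height)
        for name, (top, bottom) in fractions.items()
    }


# Example: a 50 x 200 pixel pedestrian box.
parts = split_body_roi((10.0, 20.0, 60.0, 220.0))
for name, box in parts.items():
    print(name, box)
```

Each of the four sub-boxes would then feed its own prediction branch. How the branches consume these regions is described only at a high level in the abstract and is not modelled here.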

     

    Abstract: Person attribute recognition extracts structured features of a person and plays a vital role in intelligent video surveillance tasks such as person re-identification. First, building on R*CNN, we design an end-to-end multi-attribute recognition method based on a deep learning network. A region proposal network (RPN), rather than selective search, is employed to extract auxiliary regions, and a unified network for auxiliary region extraction and attribute recognition is constructed to improve the recognition of local attributes. Second, to strengthen the effect of the auxiliary regions, we split the body ROI proportionally into four regions: whole body, head, torso, and legs. Each region is responsible for different attributes, and the network splits into four branches at the prediction stage. The primary regions and the second most important auxiliary regions are used together to predict attributes. Finally, dynamic adaptive loss weighting balances the contribution of every task and improves overall performance: the loss weights are inversely correlated with the gradient of the loss function, which prevents any task from training too fast or too slow. In comparison experiments on the Berkeley Attributes of People dataset, the method obtains a mean average precision (mAP) of more than 92% when compared with state-of-the-art methods.
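The idea that loss weights are inversely correlated with the loss gradient can be sketched as follows. The abstract does not give the update formula, so this is one plausible reading under stated assumptions: each task's weight is taken proportional to the reciprocal of a per-task gradient magnitude, and the weights are normalised to sum to the number of tasks. The function names, the `eps` term, and the normalisation are assumptions, not the authors' formulation.

```python
from typing import List


def inverse_gradient_weights(grad_norms: List[float], eps: float = 1e-8) -> List[float]:
    """Assumed rule: weight_i is proportional to 1 / (grad_norm_i + eps).

    Weights are rescaled to sum to the number of tasks, so that equal
    gradient magnitudes give every task a weight of 1.
    """
    inverse = [1.0 / (g + eps) for g in grad_norms]
    scale = len(grad_norms) / sum(inverse)
    return [value * scale for value in inverse]


def weighted_total_loss(losses: List[float], weights: List[float]) -> float:
    """Combine per-task losses into a single training objective."""
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical gradient magnitudes for the four branches
# (whole body, head, torso, legs); a real system would measure these
# during training, for example as the norm of each task's loss gradient.
grad_norms = [4.0, 1.0, 2.0, 1.0]
task_losses = [0.9, 0.3, 0.5, 0.4]

weights = inverse_gradient_weights(grad_norms)
total = weighted_total_loss(task_losses, weights)
print(weights, total)
```

Under this rule, a task whose loss is changing quickly (large gradient) is down-weighted and a slowly changing task is up-weighted, which matches the stated goal of keeping any one task from training too fast or too slow.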

     
