Class-Balanced and Local Median Loss Jointly Supervised for Wild Facial Expression Recognition

  • Abstract: In recent years, convolutional neural networks have made substantial progress on facial expression recognition under laboratory-controlled conditions, but recognition in the wild remains challenging. To address the imbalanced class distribution of in-the-wild expression data and the large intra-class variation caused by factors such as pose, illumination and gender, this paper proposes the class-balanced and local median (CALM) loss function. The CALM loss has two parts: a class-balanced Softmax loss and a local median loss. The class-balanced Softmax loss labels fear and disgust, the two expressions that have few samples and are easily misclassified, as hard samples, and labels the other five expressions as easy samples. During training, the weight of the hard samples is increased adaptively to raise their recognition accuracy and, in turn, the average accuracy of expression recognition. In addition, every class contains some samples that lie far from most other samples of that class, and these pull a class center computed as the mean away from the bulk of the class. The local median loss instead takes, for each sample, the median of several of its nearest neighbors from the same class as the class center, which reduces to some extent the influence of outliers on the choice of class center. In experiments on the RAF (real-world affective faces) dataset, the method improves average recognition accuracy by 1.32% over the local subclass method, which demonstrates its effectiveness.
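The two loss terms described in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything here is an assumption: the per-class weights are fixed rather than adapted during training, the neighbor count `k`, the squared-distance form of the local median term, and all function names are hypothetical. It is meant only to show the idea of up-weighting hard classes and of replacing a mean center with a local median, not to reproduce the paper's method.

```python
import numpy as np

def weighted_softmax_loss(logits, labels, class_weights):
    """Cross-entropy in which each sample is scaled by the weight of its true class.

    Assumption: weights are fixed per class; the paper adapts them during training.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    w = class_weights[labels]
    return float((w * per_sample).sum() / w.sum())

def local_median_centers(features, labels, k=3):
    """For each sample, the coordinate-wise median of its k nearest same-class neighbors."""
    centers = np.empty_like(features, dtype=float)
    for i in range(len(features)):
        same = np.flatnonzero(labels == labels[i])
        same = same[same != i]                  # exclude the sample itself
        if same.size == 0:                      # a class with a single sample
            centers[i] = features[i]
            continue
        dists = np.linalg.norm(features[same] - features[i], axis=1)
        nearest = same[np.argsort(dists)[:k]]   # indices of the k closest neighbors
        centers[i] = np.median(features[nearest], axis=0)
    return centers

def local_median_loss(features, labels, k=3):
    """Half the mean squared distance between each sample and its local median center."""
    centers = local_median_centers(features, labels, k)
    return float(0.5 * np.mean(np.sum((features - centers) ** 2, axis=1)))

# Toy data: one class with three clustered points and one outlier.
feats = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
labs = np.array([0, 0, 0, 0])
centers = local_median_centers(feats, labs, k=3)
```

For the first sample in this toy data, the local median center is (1, 1), whereas the mean of the same three neighbors would be roughly (3.67, 3.67), pulled toward the outlier. A joint objective could then be formed as the weighted Softmax term plus a hypothetical trade-off coefficient times the local median term.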

     
