Shi Yuexiang, Zeng Zhichao. Temporal Segment Networks Based on Feature Propagation for Action Recognition[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(4): 582-589. DOI: 10.3724/SP.J.1089.2020.17652


Temporal Segment Networks Based on Feature Propagation for Action Recognition


    Abstract: To extract human action categories and motion information from video efficiently and to reduce computational complexity, an algorithm combining feature propagation and temporal segment networks is proposed for action recognition. First, the video is divided into three short segments, and a key frame is extracted from each segment to model the long-range temporal structure of the video. Second, an improved temporal segment network based on feature propagation (P-TSN) is designed, comprising an appearance stream that uses feature propagation and a motion stream that uses FlowNet; it takes RGB key frames, RGB non-key frames, and optical-flow images as input to extract the appearance and motion information of the video. Finally, the BN-Inception descriptors of the improved temporal segment network are fused by weighted averaging and fed into a Softmax layer for action recognition. Experiments on the UCF101 and HMDB51 datasets achieve recognition accuracies of 94.6% and 69.4%, respectively, indicating that the proposed algorithm effectively captures spatial appearance information and temporal motion information and improves the accuracy of video action recognition.
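The pipeline described in the abstract — segment-wise key-frame sampling followed by weighted-average fusion of per-stream scores and a Softmax — can be sketched as below. This is only an illustrative sketch, not the paper's implementation: the function names `sample_key_frames` and `fuse_and_classify` are hypothetical, and the actual P-TSN computes its stream descriptors with BN-Inception networks rather than taking precomputed score vectors.

```python
import numpy as np

def sample_key_frames(num_frames, num_segments=3, rng=None):
    """Split a video's frame indices into equal segments and pick one
    key frame per segment (random within each segment, in the spirit of
    TSN-style sparse sampling)."""
    rng = rng or np.random.default_rng(0)
    # Segment boundaries, e.g. 90 frames with 3 segments -> [0, 30, 60, 90].
    bounds = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    return [int(rng.integers(bounds[i], bounds[i + 1]))
            for i in range(num_segments)]

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_and_classify(stream_scores, weights=None):
    """Fuse per-stream class scores (e.g. RGB key-frame, propagated RGB
    non-key-frame, and optical-flow streams) by weighted averaging, then
    apply Softmax for the final action prediction."""
    scores = np.asarray(stream_scores, dtype=float)
    if weights is None:
        # Equal weights reproduce the plain average-fusion case.
        weights = np.ones(len(scores)) / len(scores)
    fused = np.average(scores, axis=0, weights=weights)
    return softmax(fused)
```

For a 90-frame clip, `sample_key_frames(90)` returns one frame index from each third of the video; `fuse_and_classify` then turns the per-stream scores for those inputs into a single class-probability vector.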

     
