Song Lifei, Weng Liguo, Wang Lingfeng, Xia Min. Multi-scale 3D Convolution Fusion Two-Stream Networks for Action Recognition[J]. Journal of Computer-Aided Design & Computer Graphics, 2018, 30(11): 2074-2083. DOI: 10.3724/SP.J.1089.2018.17068

Multi-scale 3D Convolution Fusion Two-Stream Networks for Action Recognition

  • Video-based action recognition is widely used in computer vision. Existing networks cannot effectively combine the spatio-temporal information in video data, and they do not consider fusing information across data of different scales. This paper proposes a multi-scale 3D convolution fusion two-stream network that combines the two-stream network with the 3D convolutional neural network. First, the spatial and temporal information of a video is extracted using a 2D residual network and a multi-scale 3D convolution fusion network. Then, the outputs of the two networks are fused, which improves the network's ability to extract spatio-temporal features from videos. Finally, applying different fusion strategies within the multi-scale 3D convolution fusion network improves its generalization to data of different scales. The model was evaluated on the UCF-101 and HMDB-51 datasets, achieving recognition accuracies of 90.5% and 66.3%, respectively. Compared with other algorithms, the proposed model achieves higher recognition accuracy, demonstrating the superiority and robustness of the approach.
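The abstract states that the outputs of the two streams are fused, but it does not say how. As an illustration only, the sketch below shows one common form of late fusion: a weighted average of the per-class probabilities from each stream. The function names (`softmax`, `fuse_scores`, `predict`), the `spatial_weight` parameter, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    Assumption: simple late fusion; the paper's actual strategy may differ.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    return [
        spatial_weight * ps + (1.0 - spatial_weight) * pt
        for ps, pt in zip(p_spatial, p_temporal)
    ]


def predict(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(spatial_logits, temporal_logits, spatial_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical logits for a three-class problem.
spatial = [2.0, 1.0, 0.1]   # e.g. from a 2D residual network
temporal = [0.1, 3.0, 0.2]  # e.g. from a 3D convolution network
fused = fuse_scores(spatial, temporal)
label = predict(spatial, temporal)
```

With equal weights, a confident prediction from one stream can outweigh a weaker, conflicting prediction from the other. In practice the weight would be tuned on validation data.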
