Multi-scale 3D Convolution Fusion Two-Stream Networks for Action Recognition

Song Lifei; Weng Liguo; Wang Lingfeng; Xia Min

doi:10.3724/SP.J.1089.2018.17068

Song Lifei, Weng Liguo, Wang Lingfeng, Xia Min. Multi-scale 3D Convolution Fusion Two-Stream Networks for Action Recognition[J]. Journal of Computer-Aided Design & Computer Graphics, 2018, 30(11): 2074-2083. DOI: 10.3724/SP.J.1089.2018.17068

Abstract

Video-based action recognition is widely used in computer vision. Existing networks cannot effectively combine the spatio-temporal information in video data, and they do not consider the fusion of information across data at different scales. This paper proposes a multi-scale 3D convolution fusion two-stream network that combines the two-stream network with the 3D convolutional neural network. First, the spatial and temporal information of a video is extracted with a 2D residual network and a multi-scale 3D convolution fusion network. Then, the outputs of the two networks are fused, which improves the network's ability to extract spatio-temporal features from videos. Finally, different fusion strategies within the multi-scale 3D convolution fusion network improve its generalization to data at different scales. The model was evaluated on the UCF-101 and HMDB-51 datasets, reaching recognition accuracies of 90.5% and 66.3%, respectively. Compared with other algorithms, the proposed model achieves higher recognition accuracy, which demonstrates its superiority and robustness.
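
The fusion idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so this example assumes late fusion: class probabilities from several input scales are averaged, and the result is combined with the 2D residual network's probabilities by a weighted average. All function names, the `residual_weight` parameter, and the input values are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_scales(per_scale_probs):
    """Average class probabilities predicted at several input scales.

    Assumption: plain averaging stands in for the paper's (unspecified
    here) multi-scale fusion strategies.
    """
    n = len(per_scale_probs)
    return [sum(column) / n for column in zip(*per_scale_probs)]


def fuse_streams(residual_2d_probs, conv3d_probs, residual_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    Assumption: residual_weight is a hypothetical hyperparameter.
    """
    w = residual_weight
    return [w * a + (1.0 - w) * b
            for a, b in zip(residual_2d_probs, conv3d_probs)]


def predict(residual_2d_logits, conv3d_logits_per_scale, residual_weight=0.5):
    """Return (predicted class index, fused class probabilities)."""
    residual_probs = softmax(residual_2d_logits)
    conv3d_probs = average_scales(
        [softmax(logits) for logits in conv3d_logits_per_scale]
    )
    fused = fuse_streams(residual_probs, conv3d_probs, residual_weight)
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused
```

For example, `predict([2.0, 1.0, 0.1], [[0.5, 2.5, 0.1], [0.2, 3.0, 0.3]])` combines one set of 2D-stream scores with 3D-stream scores from two scales. Averaging probabilities rather than raw logits keeps each stream's contribution on a comparable scale, which is one common reason to prefer it in late fusion.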
