Ji Bin, Pan Ye, Jin Xiaogang, Yang Xubo. Spatiotemporal Neural Network for Video-Based Pose Estimation[J]. Journal of Computer-Aided Design & Computer Graphics, 2022, 34(2): 189-197. DOI: 10.3724/SP.J.1089.2022.18878

Spatiotemporal Neural Network for Video-Based Pose Estimation

  • Two-dimensional pose estimation methods often suffer performance degradation when video quality is severely degraded. To mitigate this problem, a novel model, spatiotemporal net (STNet), is proposed. STNet uses convolution modules to extract 2D joint heatmaps from each frame and recurrent convolution modules to encode temporal information between adjacent frames. This decoupled learning of spatiotemporal information improves the temporal coherence and spatial accuracy of the estimated poses and reduces the difficulty of extracting spatiotemporal features. The use of ConvGRU effectively reduces computational cost while maintaining recognition accuracy. The proposed model is compared with existing methods on two benchmarks, Penn Action and Sub-JHMDB. The results show that STNet achieves a better trade-off between prediction performance and computational complexity and has greater practical value.
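To make the recurrent part of the abstract concrete, below is a minimal NumPy sketch of a ConvGRU update, the building block STNet uses to encode temporal information between adjacent frames. All layer sizes, kernel shapes, and the random weight initialization are illustrative assumptions, not the paper's configuration; the per-frame spatial CNN that would produce the input heatmaps is elided.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def conv2d(x, w):
    """'Same'-padded 2D convolution. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for i in range(H):
        for j in range(W):
            # contract the (C_in, k, k) patch against every output filter
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k], axes=3)
    return out

class ConvGRUCell:
    """One ConvGRU step: standard GRU gating, with convolutions in place
    of the fully connected matrix products so spatial structure is kept."""
    def __init__(self, c_in, c_hid, k=3, seed=0):
        rng = np.random.default_rng(seed)
        shape_x = (c_hid, c_in, k, k)   # input-to-hidden kernels
        shape_h = (c_hid, c_hid, k, k)  # hidden-to-hidden kernels
        self.Wz, self.Wr, self.Wh = (rng.normal(0, 0.1, shape_x) for _ in range(3))
        self.Uz, self.Ur, self.Uh = (rng.normal(0, 0.1, shape_h) for _ in range(3))

    def step(self, x, h):
        z = sigmoid(conv2d(x, self.Wz) + conv2d(h, self.Uz))          # update gate
        r = sigmoid(conv2d(x, self.Wr) + conv2d(h, self.Ur))          # reset gate
        h_cand = np.tanh(conv2d(x, self.Wh) + conv2d(r * h, self.Uh))  # candidate
        return (1.0 - z) * h + z * h_cand

# Toy sequence: 4 frames of 13-channel "heatmaps" on an 8x8 grid
# (13 joints as annotated in Penn Action; values are random stand-ins
# for the output of the elided spatial convolution modules).
T, C, H, W = 4, 13, 8, 8
frames = np.random.default_rng(1).random((T, C, H, W))

cell = ConvGRUCell(c_in=C, c_hid=C)
h = np.zeros((C, H, W))
for t in range(T):
    h = cell.step(frames[t], h)  # temporally refined hidden heatmaps
print(h.shape)  # (13, 8, 8)
```

Because the gates are convolutional, the hidden state remains a stack of spatial maps, which is what lets the temporal module refine per-joint heatmaps directly rather than flattened feature vectors.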
