Spatial-Temporal Feature Reinforcement Learning for 3D Human Pose and Shape Estimation
Graphical Abstract
Abstract
To address insufficient spatial-temporal modeling, complex local dependencies, and weak estimation robustness in three-dimensional human pose and shape estimation from single-view video, a spatial-temporal feature reinforcement learning method is proposed. First, a global spatial-temporal feature reinforcement module is constructed to extract static features from the input video sequence; global correlation modeling and global temporal feature fusion are performed on two sub-sequences containing the intermediate frame to obtain the integrated temporal features. Second, a spatial-temporal dual-branch encoder composed of graph convolution and a self-attention mechanism is designed to model the local dependencies of the human body for local spatial-temporal feature reinforcement learning, yielding a refined three-dimensional pose. Finally, a global-local spatial-temporal feature fusion method based on a dual attention mechanism is proposed to fuse the temporal, pose, and shape features and produce the final estimated three-dimensional human body mesh. Experimental results on the Human3.6M dataset show that the PA-MPJPE and MPJPE are 36.0 mm and 49.7 mm respectively, reductions of 0.6 mm and 1.9 mm compared with the comparison method. The proposed method improves the accuracy of three-dimensional human pose and shape estimation and generates precise and smooth three-dimensional human bodies. Furthermore, test results on the 3DPW dataset and Internet videos show that the proposed method retains a degree of robustness under occluded limbs, varying backgrounds, and different scene conditions.
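To make the dual-branch design concrete, the sketch below illustrates (in PyTorch) one possible form of a spatial-temporal dual-branch encoder with attention-based fusion: a graph-convolution branch models local joint dependencies over the skeleton while a self-attention branch models dependencies across frames. This is not the authors' implementation; the layer sizes, joint count, adjacency handling, and gating rule are illustrative assumptions.

```python
# Minimal sketch (assumed design, not the paper's code) of a spatial-temporal
# dual-branch encoder: graph convolution over joints + self-attention over
# frames, fused by per-joint attention weights (a simple "dual attention" gate).
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph-convolution layer over a fixed, normalized skeleton adjacency."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)            # (J, J) adjacency matrix
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):                            # x: (N, J, C)
        return torch.relu(self.adj @ self.proj(x))   # aggregate neighboring joints

class DualBranchEncoder(nn.Module):
    """Spatial (graph) branch + temporal (self-attention) branch with gated fusion."""
    def __init__(self, num_joints=24, feat_dim=256, adj=None):
        super().__init__()
        adj = adj if adj is not None else torch.eye(num_joints)  # placeholder skeleton
        self.spatial = GraphConv(feat_dim, feat_dim, adj)
        self.temporal = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.gate = nn.Linear(2 * feat_dim, 2)       # weights for the two branches

    def forward(self, x):                            # x: (B, T, J, C) per-frame joint features
        B, T, J, C = x.shape
        # Spatial branch: model joint-to-joint dependencies within each frame.
        s = self.spatial(x.reshape(B * T, J, C)).reshape(B, T, J, C)
        # Temporal branch: attend across the T frames for each joint independently.
        t_in = x.permute(0, 2, 1, 3).reshape(B * J, T, C)
        t, _ = self.temporal(t_in, t_in, t_in)
        t = t.reshape(B, J, T, C).permute(0, 2, 1, 3)
        # Dual-attention fusion: softmax-normalized weights per joint and frame.
        w = torch.softmax(self.gate(torch.cat([s, t], dim=-1)), dim=-1)  # (B, T, J, 2)
        return w[..., :1] * s + w[..., 1:] * t       # fused spatial-temporal feature

# Usage: 16-frame clip, 24 joints, 256-dim features per joint (all assumed sizes).
enc = DualBranchEncoder()
fused = enc(torch.randn(2, 16, 24, 256))             # -> (2, 16, 24, 256)
```

The two branches operate on reshaped views of the same tensor, so the fusion gate can weigh, per joint and per frame, how much of the spatial versus temporal evidence to keep; the actual paper may fuse features differently.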