Multi-Scale Spatial-Temporal Feature Fusion For 3D Human Pose Estimation
Abstract
To address the problems of inaccurate feature representation, unsmooth results, and high computational cost in video-based single-person three-dimensional human pose estimation, a multi-scale spatial-temporal feature fusion method is proposed. First, joint, limb, and upper/lower-body tokens with positional embeddings are defined in the spatial domain to represent the spatial multi-scale features of the human body. Second, a spatial multi-scale feature fusion module, built on the self-attention mechanism and a multilayer perceptron, fuses the joint, limb, and upper/lower-body features to obtain an initial pose feature sequence. Finally, temporal multi-scale encoding performs temporal feature fusion to acquire the final pose feature sequence, and temporal decoding generates the refined three-dimensional human pose. Experimental results on the Human3.6M dataset show that the mean per joint position error under Protocol 2 and the joint velocity error are 33.6 mm and 2.4 mm respectively, reductions of 2.3% and 4%. The proposed method improves three-dimensional human pose estimation accuracy and generates precise, smooth results while reducing computational cost. Experimental results on the HumanEva-I dataset further show that the method has a certain degree of generalization ability.
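The spatial multi-scale tokenization described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: joint features are mean-pooled into limb and upper/lower-body tokens, and all scales are fused with a single self-attention step. The joint groupings, feature dimension, and the projection-free attention are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
J, D = 17, 32  # Human3.6M uses 17 joints; the feature dimension D is an assumption

# Illustrative joint groups for limbs and body halves (indices are assumptions).
LIMBS = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9, 10], [11, 12, 13], [14, 15, 16]]
UPPER, LOWER = list(range(7, 17)), list(range(0, 7))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Single-head scaled dot-product self-attention (no learned projections)."""
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[-1])
    return softmax(scores) @ tokens

joints = rng.standard_normal((J, D))                  # per-joint features for one frame
limbs = np.stack([joints[g].mean(axis=0) for g in LIMBS])
body = np.stack([joints[UPPER].mean(axis=0), joints[LOWER].mean(axis=0)])

# Multi-scale token sequence: 17 joint + 5 limb + 2 body tokens.
tokens = np.concatenate([joints, limbs, body])        # shape (24, D)
fused = self_attention(tokens)
print(fused.shape)                                    # (24, 32)
```

In the paper's pipeline this fused sequence would then feed the temporal multi-scale encoder; here the pooling merely shows how coarser-scale tokens can be derived from joint tokens before attention-based fusion.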