A 3D Human Pose Estimation Method Based on Time-frequency Interaction Fusion
Graphical Abstract
Abstract
To address two problems in Transformer-based 3D human pose estimation — the resource waste caused by redundant video frames and the low accuracy obtained under unreliable 2D pose input — a 3D human pose estimation method based on time-frequency interaction fusion is proposed. First, a spatial module is proposed in the form of a frequency-domain-enhanced spatial Transformer, in which a frequency-domain multi-layer perceptron built on the discrete cosine transform (DCT) extracts frequency-domain features. This perceptron reduces the computational complexity of the network while using frequency-domain enhancement to capture the intra-frame spatial dependencies among joints, improving accuracy on noisy input data. Second, a temporal module is proposed in the form of a time-frequency interactive fusion temporal Transformer, which reduces the computational burden of redundant frames through the interactive fusion of time-domain and frequency-domain features; this not only improves efficiency but also better captures complex changes in the sequence, enhancing the robustness of the model. Finally, a deep convolutional regression module is proposed to process the output features of the spatial and temporal modules, achieving an accurate mapping from 2D human pose to 3D human pose. On the Human3.6M dataset, the proposed method is compared with the mainstream 3D human pose estimation methods P-STMO and MHFormer: MFLOPs are reduced by 19% and 61%, respectively, while MPJPE is reduced by 1.1% and 1.7%, achieving a balance between efficiency and accuracy and verifying the feasibility of the method and its design.
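The core idea of the frequency-domain multi-layer perceptron described above can be illustrated with a minimal sketch: features are moved into the frequency domain with a DCT along the joint axis, a lightweight learned per-frequency map is applied there, and an inverse DCT returns them to the joint domain. The function name, tensor shapes, and the simple element-wise affine map below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from scipy.fft import dct, idct

def freq_domain_mlp(x, w, b):
    """Hypothetical frequency-domain MLP sketch.
    x: (frames, joints, channels) per-frame 2D-pose features.
    The DCT along the joint axis moves features to the frequency
    domain, where a cheap per-frequency affine map (w, b) acts,
    then the inverse DCT returns to the joint domain."""
    xf = dct(x, axis=1, norm="ortho")       # joints -> frequency bins
    yf = xf * w + b                          # learned per-frequency map
    return idct(yf, axis=1, norm="ortho")   # back to the joint domain

rng = np.random.default_rng(0)
frames, joints, channels = 9, 17, 32         # Human3.6M uses 17 joints
x = rng.standard_normal((frames, joints, channels))

# Identity weights for the demo: DCT followed by inverse DCT (both
# with norm="ortho") reconstructs the input exactly.
w = np.ones((1, joints, 1))
b = np.zeros((1, joints, 1))
y = freq_domain_mlp(x, w, b)
print(np.allclose(x, y))                     # True
```

Because the per-frequency map is element-wise, its cost grows linearly in the number of frequency bins, which is one way such a design can reduce computation relative to full spatial attention.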