A Hierarchical Graph Convolutional Method for Gesture Recognition Based on Temporal-Frequency Data Fusion
Graphical Abstract
Abstract
Graph Convolutional Neural Networks (GCNs) are adept at extracting non-Euclidean structural features and have become the mainstream approach for skeleton-based gesture recognition. To address the long training and inference times of GCN methods, as well as the need for higher accuracy, a hierarchical graph convolutional gesture recognition method based on temporal-frequency data fusion, named HHTS-Net, is proposed. First, a framework combining spatial attention blocks with graph convolutional modules is proposed to reduce the computational burden. Then, a hierarchical graph convolutional module tailored to the structural characteristics of the palm and finger joints is introduced to improve feature extraction efficiency. Finally, a multi-stream learning scheme that integrates frequency-domain features is proposed to further improve model performance. Experiments on the public DHG and SHREC'17 datasets against state-of-the-art methods demonstrate that HHTS-Net improves accuracy by at least 1.8% and 0.3%, respectively, and increases inference speed by more than 47.6% and 52.9%. Extensive ablation experiments validate the effectiveness of the method. The source code is available at https://github.com/CoderHoooK/HHTSATNet.
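The skeleton-graph convolution underlying methods of this kind can be illustrated with a minimal sketch. The joint topology, feature sizes, and weights below are illustrative placeholders, not the paper's actual architecture: a tiny five-joint hand graph is convolved with a single symmetrically normalized GCN layer.

```python
import numpy as np

# Hypothetical 5-joint mini hand skeleton: wrist (0) linked to four finger-base joints.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
num_joints = 5

# Adjacency with self-loops, then symmetric normalization:
# A_hat = D^{-1/2} (A + I) D^{-1/2}
A = np.eye(num_joints)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
deg = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(deg, deg))

def gcn_layer(X, W, A_hat):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(A_hat @ X @ W, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((num_joints, 3))  # per-joint 3-D coordinates for one frame
W = rng.standard_normal((3, 8))           # projection weights (random stand-in)
H = gcn_layer(X, W, A_hat)
print(H.shape)  # one feature vector per joint: (5, 8)
```

In practice such a spatial layer is stacked with temporal convolutions over the frame axis; the hierarchical and attention components the abstract describes refine this basic aggregation step.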