A Review of Skeleton-Based Human Action Recognition
-
Abstract
Human action recognition plays a vital role in video understanding. In recent years, skeleton-based action recognition approaches have gained widespread attention because they are robust to environmental interference. This paper surveys 102 skeleton-based human action recognition methods and compares their performance on nine public datasets. The methods are grouped by learning paradigm into manual feature methods and deep learning based methods. The manual feature methods are divided by feature descriptor into three categories: geometric, kinetic, and statistical representations. The deep learning based methods are divided by backbone into five subclasses: recurrent neural networks, convolutional neural networks, graph convolutional networks, Transformers, and hybrid networks. Through this analysis, we present the research status of skeleton-based action recognition and conclude that existing methods still have limitations such as poor generalization ability and high computational cost. Finally, this paper discusses future research directions in network structure design, discrimination of similar actions, domain dataset expansion, and multi-person interaction.
-