
    MAO Xinxin, WU Shengxi, XIAN Bolong, GU Xingsheng. Adaptive Graph Convolution and LSTM Action Recognition Based on Skeleton[J]. Journal of East China University of Science and Technology, 2022, 48(6): 816-825. DOI: 10.14135/j.cnki.1006-3080.20210625001

    Adaptive Graph Convolution and LSTM Action Recognition Based on Skeleton


      Abstract: To improve the accuracy of skeleton-based action recognition, this paper proposes a model combining adaptive graph convolution with long short-term memory (AAGC-LSTM). Starting from capturing the spatial-temporal co-occurrence features of human skeleton motion, the model breaks the constraint of using the natural human skeleton as the fixed adjacency matrix of the graph convolution, and combines adaptive graph convolution with an LSTM network to extract spatial-temporal co-occurrence features. To capture the key-joint information of the action recognition task, a spatial attention module is embedded into the model, combining human skeleton information in a dynamic way. Meanwhile, the first-order motion information of skeleton joints and the second-order motion information of skeleton edges are fed into the model as two separate streams, which are then fused to improve recognition accuracy. Experiments show that the proposed model achieves 90.1% and 95.6% accuracy on the NTU RGB+D dataset under the Cross Subject and Cross View protocols, respectively, and 93.6% accuracy on the North Western dataset, verifying its superiority in extracting spatial-temporal skeleton motion features and in action recognition.
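Two ideas from the abstract can be illustrated concretely: (1) an adaptive graph convolution that adds a learned offset to the normalized natural-skeleton adjacency, and (2) the second-order "bone" (edge) features that feed the second stream. The sketch below is a minimal NumPy illustration under assumed shapes, not the authors' AAGC-LSTM implementation; the 3-joint toy skeleton, the `parents` array, the learned offset `B`, and all dimensions are placeholders.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in standard GCNs.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def adaptive_graph_conv(X, A_skeleton, B, W):
    """One adaptive graph-convolution layer for a single frame.

    X          : (num_joints, in_features) joint features
    A_skeleton : (num_joints, num_joints) fixed natural-skeleton adjacency
    B          : (num_joints, num_joints) learned offset that frees the layer
                 from the fixed skeleton topology (trained by backprop in
                 practice; random here)
    W          : (in_features, out_features) feature transform
    """
    A = normalize_adjacency(A_skeleton) + B  # adaptive adjacency
    return A @ X @ W

def bone_features(joints, parents):
    # Second-order (bone/edge) information: vector from each joint's
    # parent to the joint itself; the root bone is the zero vector.
    return np.array([joints[j] - joints[p] for j, p in enumerate(parents)])

# Toy example: a 3-joint chain (e.g. shoulder-elbow-wrist).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)          # 4 input features per joint
B = 0.01 * np.random.randn(3, 3)   # learned offset (random stand-in)
W = np.random.randn(4, 8)          # project to 8 output features
out = adaptive_graph_conv(X, A, B, W)   # shape (3, 8)

# Bone stream input from 3-D joint coordinates (hypothetical parent list).
parents = [0, 0, 1]
joints = np.array([[0., 0., 0.], [0., 1., 0.], [0., 2., 0.]])
bones = bone_features(joints, parents)

# Two-stream fusion: average the per-class scores of the joint and bone
# branches (one common fusion choice; the paper's exact scheme may differ).
joint_scores = np.array([0.7, 0.2, 0.1])
bone_scores = np.array([0.6, 0.3, 0.1])
fused = (joint_scores + bone_scores) / 2
pred = int(np.argmax(fused))
```

In a full model, one such layer per frame would feed the LSTM, and the joint and bone streams would each run through their own AAGC-LSTM branch before score fusion.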

       
