QUICK REVIEW

[Paper Review] Part-based Graph Convolutional Network for Action Recognition

Kalpit Thakkar, P. J. Narayanan|arXiv (Cornell University)|Sep 13, 2018

Human Pose and Action Recognition|1 reference|132 citations

TL;DR

This paper introduces PB-GCN, a part-based graph convolutional network that partitions the human skeleton into body parts, uses geometric and kinematic node features, and achieves state-of-the-art results on NTURGB+D and HDM05 for skeletal action recognition.

ABSTRACT

Human actions comprise the joint motion of articulated body parts, or 'gestures'. The human skeleton is intuitively represented as a sparse graph with joints as nodes and the natural connections between them as edges. Graph convolutional networks have been used to recognize actions from skeletal videos. We introduce a part-based graph convolutional network (PB-GCN) for this task, inspired by Deformable Part-based Models (DPMs). We divide the skeleton graph into four subgraphs with joints shared across them and learn a recognition model using a part-based graph convolutional network. We show that such a model improves recognition performance compared to a model using the entire skeleton graph. Instead of using 3D joint coordinates as node features, we show that using relative coordinates and temporal displacements boosts performance. Our model achieves state-of-the-art performance on two challenging benchmark datasets, NTURGB+D and HDM05, for skeletal action recognition.

Motivation & Objective

Approach skeletal action recognition from a part-based viewpoint, so that the model captures both part-specific and inter-part relations.
Propose PB-GCN that partitions the skeleton graph into subgraphs with shared vertices and learns part-wise convolutions.
Show that using geometric (relative coordinates) and motion (temporal displacements) features improves recognition over 3D joint coordinates.
Demonstrate state-of-the-art performance on NTURGB+D and HDM05 datasets with the proposed framework.

Proposed method

Define a general part-based graph convolutional network (PB-GCN) for graphs with known partition properties.
Partition the skeleton graph into multiple overlapping subgraphs representing body parts (e.g., axial and appendicular components).
Perform spatial convolutions independently on each part, then aggregate using a learned fusion function F_agg across parts.
Extend to spatio-temporal graphs by connecting joints temporally within each part and across frames, followed by temporal convolution.
Use relative coordinates and temporal displacements as node features, concatenated, instead of raw 3D joint coordinates.
Incorporate a learnable edge weight mask and residual connections, following a ResNet-like architecture, with 9 SP-Temporal GCN units.
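
The part-wise convolution and aggregation steps listed above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the five-joint chain, the two overlapping parts, the mean-over-neighbours convolution, and the weighted-sum form of F_agg are all assumptions chosen to keep the example small, and the names part_graph_conv and aggregate are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def matvec(weight: Sequence[Sequence[float]], x: Vector) -> Vector:
    """Multiply a (C_out x C_in) weight matrix by a C_in feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def part_graph_conv(
    features: Dict[int, Vector],
    part_nodes: Sequence[int],
    edges: Sequence[Tuple[int, int]],
    weight: Sequence[Sequence[float]],
) -> Dict[int, Vector]:
    """Spatial convolution restricted to one part (assumed form:
    mean over in-part neighbours plus a self-loop, then a linear map)."""
    nodes = set(part_nodes)
    out: Dict[int, Vector] = {}
    for v in part_nodes:
        nbrs = {v}  # self-loop
        for a, b in edges:
            # Only edges with both endpoints inside the part are used.
            if a == v and b in nodes:
                nbrs.add(b)
            if b == v and a in nodes:
                nbrs.add(a)
        channels = len(features[v])
        mean = [sum(features[u][c] for u in nbrs) / len(nbrs) for c in range(channels)]
        out[v] = matvec(weight, mean)
    return out


def aggregate(
    part_outputs: Sequence[Dict[int, Vector]],
    part_weights: Sequence[float],
) -> Dict[int, Vector]:
    """Assumed F_agg: weighted sum over parts. Joints shared by several
    parts receive one contribution from each part that contains them."""
    fused: Dict[int, Vector] = {}
    for out, alpha in zip(part_outputs, part_weights):
        for v, vec in out.items():
            acc = fused.get(v, [0.0] * len(vec))
            fused[v] = [f + alpha * x for f, x in zip(acc, vec)]
    return fused


# Toy skeleton: a chain of five joints, 0-1-2-3-4 (hypothetical).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
features = {i: [float(i)] for i in range(5)}  # one channel per joint

# Two overlapping parts that share joint 2.
parts = [[0, 1, 2], [2, 3, 4]]
identity = [[1.0]]  # per-part kernels; identity here for readability

part_outputs = [part_graph_conv(features, p, edges, identity) for p in parts]
fused = aggregate(part_outputs, part_weights=[1.0, 1.0])
```

In this toy setup the shared joint 2 is convolved once in each part, and the aggregation step sums the two results. In a trained model the per-part kernels and the fusion weights would be learned rather than fixed.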

Experimental results

Research questions

RQ1: Can partitioning the skeleton graph into meaningful body parts improve action recognition over treating the skeleton as a single graph?
RQ2: Do geometric (relative coordinates) and kinematic (temporal displacements) features improve skeletal action recognition when used with PB-GCN?
RQ3: What is the impact of different part configurations (1, 2, 4, 6 parts) on recognition accuracy?
RQ4: How does PB-GCN compare to state-of-the-art graph-based skeletal action recognition methods on NTURGB+D and HDM05 datasets?

Key findings

PB-GCN with four parts achieves higher accuracy than single-part and other partition schemes on NTURGB+D.
Using both relative coordinates and temporal displacements (D_R || D_T) yields the best performance among tested signals, especially with more parts.
PB-GCN outperforms previous graph-based skeletal action recognition methods on NTURGB+D and HDM05, achieving state-of-the-art results.
Geometric and kinematic cues provide significant gains, with temporal displacements contributing notably to performance.
Shared or separate convolution kernels across parts can be configured; part-based aggregation via F_agg effectively fuses information from multiple parts.
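
As a rough illustration of the D_R || D_T signal mentioned in the findings above, the sketch below builds per-joint features by concatenating relative coordinates with temporal displacements. The choice of joint 0 as the reference joint, the forward difference between consecutive frames, and the zero displacement for the last frame are assumptions made for this example; the paper's exact definitions may differ, and the function names are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]
Frame = List[Joint]


def relative_coordinates(frames: List[Frame], root: int = 0) -> List[List[Tuple[float, ...]]]:
    """D_R (assumed form): each joint's position minus the position of a
    reference joint in the same frame."""
    return [
        [tuple(j[k] - frame[root][k] for k in range(3)) for j in frame]
        for frame in frames
    ]


def temporal_displacements(frames: List[Frame]) -> List[List[Tuple[float, ...]]]:
    """D_T (assumed form): each joint's position in the next frame minus its
    position in the current frame; the last frame gets zero displacement."""
    out = []
    for t, frame in enumerate(frames):
        nxt = frames[t + 1] if t + 1 < len(frames) else frame
        out.append([tuple(n[k] - j[k] for k in range(3)) for j, n in zip(frame, nxt)])
    return out


def node_features(frames: List[Frame], root: int = 0) -> List[List[Tuple[float, ...]]]:
    """Concatenate D_R and D_T per joint, giving 6 channels per node."""
    rel = relative_coordinates(frames, root)
    disp = temporal_displacements(frames)
    return [[r + d for r, d in zip(rf, df)] for rf, df in zip(rel, disp)]


# Two frames of a hypothetical two-joint skeleton.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 2.0, 0.0)],
]
feats = node_features(frames)
# feats[0][1] is (1.0, 0.0, 0.0, 0.0, 2.0, 0.0):
# relative position (1, 0, 0) followed by displacement (0, 2, 0).
```

Relative coordinates remove the dependence on where the subject stands in the scene, while the displacement channels expose per-joint motion directly to the network.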

This review was created by AI and reviewed by human editors.