QUICK REVIEW

[Paper Review] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Yuxin Chen, Ziqi Zhang | arXiv (Cornell University) | Jul 26, 2021

Human Pose and Action Recognition | References: 33 | Citations: 49

One-Line Summary

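CTR-GC dynamically learns channel-specific topologies by refining a shared topology, enabling more flexible feature aggregation for skeleton-based action recognition. Integrated into CTR-GCN, it achieves state-of-the-art results on NTU RGB+D, NTU RGB+D 120, and NW-UCLA.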

ABSTRACT

Graph convolutional networks (GCNs) have been widely used and achieved remarkable results in skeleton-based action recognition. In GCNs, graph topology dominates feature aggregation and therefore is the key to extracting representative features. In this work, we propose a novel Channel-wise Topology Refinement Graph Convolution (CTR-GC) to dynamically learn different topologies and effectively aggregate joint features in different channels for skeleton-based action recognition. The proposed CTR-GC models channel-wise topologies through learning a shared topology as a generic prior for all channels and refining it with channel-specific correlations for each channel. Our refinement method introduces few extra parameters and significantly reduces the difficulty of modeling channel-wise topologies. Furthermore, via reformulating graph convolutions into a unified form, we find that CTR-GC relaxes strict constraints of graph convolutions, leading to stronger representation capability. Combining CTR-GC with temporal modeling modules, we develop a powerful graph convolutional network named CTR-GCN which notably outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.

Motivation and Objectives

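Motivate better topology modeling for skeleton-based action recognition by accounting for how motion features vary across channels.
Propose CTR-GC, which refines a shared topology with channel-specific correlations for each channel.
Show that CTR-GC can be integrated into the CTR-GCN backbone and deliver strong performance.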

Proposed Method

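Model channel-wise topologies by refining a shared adjacency matrix A with channel-specific correlations Q to produce the refined topologies R.
Use a correlation modeling function, M1 or M2, to compute the per-channel relations between joints from dimension-reduced joint features.
Refine A into R via R = A + alpha * Q, then aggregate features channel by channel over the per-channel graphs to produce the output Z.
Unify four styles of graph convolution under a common formulation, and argue that CTR-GC relaxes several of their constraints and so gains representation capability.
Build CTR-GCN from CTR-GC blocks, temporal modeling modules, and residual connections for action recognition on skeleton sequences.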
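
The refinement and channel-wise aggregation steps listed above can be sketched in NumPy for a single frame. This is a minimal illustration, not the authors' implementation. It assumes M1 is a tanh of pairwise differences between two reduced projections of the joint features, which this review does not spell out. The function name `ctr_gc`, the weight names (`W_t`, `W_psi`, `W_phi`, `W_xi`), and all sizes are hypothetical, chosen only for the example, and the weights are random rather than learned.

```python
import numpy as np

def ctr_gc(x, A, W_t, W_psi, W_phi, W_xi, alpha=1.0):
    """Sketch of one CTR-GC step for a single frame.

    x: (N, C) joint features, A: (N, N) shared topology.
    Returns Z with shape (N, C_out) and R with shape (N, N, C_out).
    """
    x_t = x @ W_t                  # feature transformation, (N, C_out)
    psi = x @ W_psi                # reduced joint features, (N, C_r)
    phi = x @ W_phi                # reduced joint features, (N, C_r)
    # Assumed M1: tanh of pairwise differences between joints, (N, N, C_r)
    m = np.tanh(psi[:, None, :] - phi[None, :, :])
    Q = m @ W_xi                   # channel-specific correlations, (N, N, C_out)
    R = A[:, :, None] + alpha * Q  # R = A + alpha * Q, A shared by all channels
    # Channel-wise aggregation: Z[i, c] = sum_j R[i, j, c] * x_t[j, c]
    Z = np.einsum("ijc,jc->ic", R, x_t)
    return Z, R

# Hypothetical sizes: 25 joints, 8 input channels, 16 output channels,
# and 2 reduced channels for the correlation branch.
rng = np.random.default_rng(0)
N, C, C_out, C_r = 25, 8, 16, 2
x = rng.standard_normal((N, C))
A = rng.standard_normal((N, N))
W_t = rng.standard_normal((C, C_out))
W_psi = rng.standard_normal((C, C_r))
W_phi = rng.standard_normal((C, C_r))
W_xi = rng.standard_normal((C_r, C_out))

Z, R = ctr_gc(x, A, W_t, W_psi, W_phi, W_xi, alpha=1.0)
```

With alpha set to 0, every channel falls back to the shared topology A and the step reduces to an ordinary graph convolution over one graph, which makes the role of Q easy to isolate. Whether aggregation uses R or its transpose is a matter of convention in this sketch.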

Figure 1: Channel-wise topology refinement. Lines of different colors correspond to topologies in different channels and the thickness of lines indicates the correlation strength between joints.

Experimental Results

Research Questions

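RQ1: Can channel-wise refinement of a shared topology improve the representation capability of graph convolution in skeleton-based action recognition?
RQ2: Does combining the shared topology (A) with channel-specific correlations (Q) yield a measurable improvement over static or shared topologies?
RQ3: How does CTR-GCN perform against state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets?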

Key Findings

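CTR-GC substantially outperforms other graph convolutions at a comparable parameter count and computational cost.
CTR-GCN outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
Ablations show that performance drops when either the channel-specific correlations (Q) or the shared topology (A) is removed, underscoring the importance of channel-wise topology refinement.
Across different correlation modeling functions (M1, M2) and settings of the reduction rate r, CTR-GC consistently outperforms the baseline, indicating that it is robust and effective.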

Figure 2: Framework of the proposed channel-wise topology refinement graph convolution. The channel-wise topology modeling refines the trainable shared topology with inferred channel-specific correlations. The feature transformation aims at transforming input features into high-level representations.

This review was generated by AI and checked by a human editor.