QUICK REVIEW

[Paper Review] A simple yet effective baseline for non-attributed graph classification

Chen Cai, Yusu Wang|arXiv (Cornell University)|Nov 8, 2018

Advanced Graph Neural Networks | References: 21 | Citations: 42

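ONE-LINE SUMMARY

This paper introduces the Local Degree Profile (LDP), a simple, linear-time graph representation based on local degree distributions. On non-attributed graphs it performs competitively with state-of-the-art graph kernels and graph neural networks. On attributed graphs it is slightly weaker because it ignores attributes, but it remains an effective baseline.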

ABSTRACT

Graphs are complex objects that do not lend themselves easily to typical learning tasks. Recently, a range of approaches based on graph kernels or graph neural networks have been developed for graph classification and for representation learning on graphs in general. As the developed methodologies become more sophisticated, it is important to understand which components of the increasingly complex methods are necessary or most effective. As a first step, we develop a simple yet meaningful graph representation, and explore its effectiveness in graph classification. We test our baseline representation for the graph classification task on a range of graph datasets. Interestingly, this simple representation achieves similar performance as the state-of-the-art graph kernels and graph neural networks for non-attributed graph classification. Its performance on classifying attributed graphs is slightly weaker as it does not incorporate attributes. However, given its simplicity and efficiency, we believe that it still serves as an effective baseline for attributed graph classification. Our graph representation is efficient (linear-time) to compute. We also provide a simple connection with the graph neural networks. Note that these observations are only for the task of graph classification while existing methods are often designed for a broader scope including node embedding and link prediction. The results are also likely biased due to the limited amount of benchmark datasets available. Nevertheless, the good performance of our simple baseline calls for the development of new, more comprehensive benchmark datasets so as to better evaluate and analyze different graph learning methods. Furthermore, given the computational efficiency of our graph summary, we believe that it is a good candidate as a baseline method for future graph classification (or even other graph learning) studies.

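MOTIVATION AND OBJECTIVES

Evaluate whether a simple graph representation built only from local information can achieve high accuracy on non-attributed graph classification.
Compare the LDP baseline with state-of-the-art graph kernels and graph neural networks on standard datasets.
Assess the computational efficiency and scalability of the proposed method as a baseline for graph classification.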

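PROPOSED METHOD

For each node v, compute degree(v) together with four statistics (min, max, mean, standard deviation) of DN(v), the degrees of v's neighbors.
Apply a histogram or an empirical distribution function to each of the five node features, then concatenate the results into a graph-level feature vector.
Train a linear or nonlinear SVM on the concatenated graph features, repeat 10-fold cross-validation 10 times, and report the mean accuracy.
Analyze the computational cost: feature extraction takes O(E) time, and mapping the V values into B bins takes O(V). Kernel-based and neural-network baselines are discussed for comparison.
Discuss the connection to graph neural networks by showing that LDP captures essential elements of GNNs without any training, and consider additional features such as sum(DN(v)) (not used in the final results).
Hyperparameters include the bin size, the normalization strategy, the representation (histogram vs. empirical distribution), the scale (linear vs. logarithmic), and the SVM parameter C and kernel bandwidth.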
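The first two steps above (per-node features, then per-feature histograms) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation. The function names `ldp_node_features`, `histogram`, and `ldp_graph_vector` are hypothetical. The fixed bin range `[lo, hi]`, the use of population standard deviation, and the zero-filling for isolated nodes are assumptions made here for illustration; the review above lists bin size, normalization, and scale as hyperparameters without fixing them.

```python
from statistics import mean, pstdev


def ldp_node_features(adj):
    """Return the five LDP features for every node.

    adj maps each node to a list of its neighbours (undirected, no self-loops).
    Features per node v: degree(v), then min, max, mean, std of DN(v),
    where DN(v) is the list of degrees of v's neighbours.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    feats = {}
    for v, nbrs in adj.items():
        dn = [deg[u] for u in nbrs]
        if dn:
            # Assumption: population standard deviation (pstdev).
            feats[v] = (deg[v], min(dn), max(dn), mean(dn), pstdev(dn))
        else:
            # Assumption: isolated nodes get zeros for all statistics.
            feats[v] = (0, 0, 0, 0.0, 0.0)
    return feats


def histogram(values, n_bins, lo, hi):
    """Count values into n_bins equal-width bins over [lo, hi].

    Values outside the range are clamped into the first or last bin.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        idx = int((x - lo) / width)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    return counts


def ldp_graph_vector(adj, n_bins=4, lo=0.0, hi=8.0):
    """Concatenate one histogram per node feature into a graph-level vector."""
    feats = ldp_node_features(adj)
    vec = []
    for k in range(5):
        column = [f[k] for f in feats.values()]
        vec.extend(histogram(column, n_bins, lo, hi))
    return vec


# Toy example: the path graph a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
vector = ldp_graph_vector(path)
print(len(vector))   # 20 entries: 5 features x 4 bins
print(vector[:4])    # degree histogram: [2, 1, 0, 0]
```

Each adjacency list is visited once, so the feature pass is linear in the number of nodes plus edges, and binning touches each node value once per feature. The resulting vectors would then be passed to a linear or nonlinear SVM as described above; in practice the bin range would more plausibly be chosen from the data than hard-coded as it is here.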

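EXPERIMENTAL RESULTS

Research questions

RQ1: Can a simple, training-free local feature representation compete with complex graph kernels and GNNs on non-attributed graph classification?
RQ2: How does the LDP baseline compare with state-of-the-art methods in accuracy and efficiency on standard non-attributed graph datasets?
RQ3: What are the limits of using only local, non-attributed information for graph classification, and when do global or attribute information become necessary?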

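Key findings

The Local Degree Profile (LDP) baseline performs competitively with state-of-the-art graph kernels and many graph neural networks on non-attributed graph classification tasks.
Even with only a linear SVM (no representation learning), LDP performs well on several datasets, such as Reddit.
Adding extra node or edge features often yields only limited improvement across datasets, which suggests that purely local degree-based features are surprisingly strong for non-attributed graphs. In contrast, heavily labeled datasets (e.g., some chemical graphs) may require more global or attribute information.
LDP is computationally efficient, with linear-time feature extraction, which makes it a suitable strong baseline for future graph classification studies. The results also underline the need for larger, more comprehensive benchmarks.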


This review was generated by AI and checked by a human editor.