QUICK REVIEW

[Paper Review] The Computer Science and Physics of Community Detection: Landscapes, Phase Transitions, and Hardness

Cristopher Moore|arXiv (Cornell University)|Feb 1, 2017

Complex Network Analysis Techniques|References 15|Cited by 59

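ONE-LINE SUMMARY

A survey and analysis that ties the stochastic block model to phase transitions in community detection. It presents the information-theoretic and computational thresholds and introduces belief propagation and the related spectral methods.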

ABSTRACT

Community detection in graphs is the problem of finding groups of vertices which are more densely connected than they are to the rest of the graph. This problem has a long history, but it is undergoing a resurgence of interest due to the need to analyze social and biological networks. While there are many ways to formalize it, one of the most popular is as an inference problem, where there is a "ground truth" community structure built into the graph somehow. The task is then to recover the ground truth knowing only the graph. Recently it was discovered, first heuristically in physics and then rigorously in probability and computer science, that this problem has a phase transition at which it suddenly becomes impossible. Namely, if the graph is too sparse, or the probabilistic process that generates it is too noisy, then no algorithm can find a partition that is correlated with the planted one---or even tell if there are communities, i.e., distinguish the graph from a purely random one with high probability. Above this information-theoretic threshold, there is a second threshold beyond which polynomial-time algorithms are known to succeed; in between, there is a regime in which community detection is possible, but conjectured to require exponential time. For computer scientists, this field offers a wealth of new ideas and open questions, with connections to probability and combinatorics, message-passing algorithms, and random matrix theory. Perhaps more importantly, it provides a window into the cultures of statistical physics and statistical inference, and how those cultures think about distributions of instances, landscapes of solutions, and hardness.

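MOTIVATION AND OBJECTIVES

Present a probabilistic model with planted community structure in order to study the conditions under which recovery becomes possible.
Explore the phase transitions for detection, weak recovery, and exact recovery in sparse graphs.
Connect the inference problem to statistical physics through the posterior distribution and the Hamiltonian.
Discuss algorithmic approaches such as belief propagation, together with their theoretical limits.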
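
The link between the posterior and a Hamiltonian can be written out in a few lines. The following is a sketch under stated assumptions rather than the paper's exact presentation: a uniform prior over labelings, within-group and between-group edge probabilities p_in and p_out, and the symbols A (adjacency matrix), E (edge set), J and J' (coupling constants), which are notation chosen here for illustration.

```latex
% Assumed notation: A is the adjacency matrix, E the edge set,
% \sigma_u \in \{1,\dots,q\} the label of vertex u, prior over \sigma uniform.
\begin{align*}
P(G \mid \sigma)
  &= \prod_{u<v} p_{\sigma_u \sigma_v}^{A_{uv}}
     \left(1 - p_{\sigma_u \sigma_v}\right)^{1 - A_{uv}},
  \qquad
  p_{rs} =
  \begin{cases}
    p_{\mathrm{in}}  & r = s,\\
    p_{\mathrm{out}} & r \neq s,
  \end{cases}
\\[4pt]
P(\sigma \mid G) &\propto P(G \mid \sigma) \propto e^{-H(\sigma)},
\\[4pt]
H(\sigma)
  &= -J \sum_{(u,v) \in E} \delta_{\sigma_u,\sigma_v}
     \;-\; J' \sum_{(u,v) \notin E} \delta_{\sigma_u,\sigma_v},
\\[4pt]
J &= \log\frac{p_{\mathrm{in}}}{p_{\mathrm{out}}},
\qquad
J' = \log\frac{1 - p_{\mathrm{in}}}{1 - p_{\mathrm{out}}}.
\end{align*}
```

With p_in > p_out the coupling J on edges is positive (ferromagnetic), while J' on non-edges is negative and, for sparse graphs, very weak. This is a Potts model at inverse temperature 1, reducing to an Ising model for q = 2.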

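PROPOSED METHOD

Formalize the stochastic block model with q groups and within-group and between-group edge probabilities p_in and p_out.
Map the posterior distribution P(σ|G) to a Boltzmann distribution and relate it to an Ising/Potts energy H(σ).
Define weak recovery, exact recovery, and detection in the constant-degree regime, and identify the thresholds.
Use the cavity method and belief propagation to compute marginals and evaluate the phase transitions.
Derive the linear-stability (Kesten-Stigum) threshold of BP and connect it to the non-backtracking spectral method.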
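
The model and the Kesten-Stigum condition in the list above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper. It assumes the sparse scaling p_in = c_in/n and p_out = c_out/n with q equally likely groups, and the names `sample_sbm`, `kesten_stigum`, `c_in`, and `c_out` are hypothetical. The condition checked is the standard symmetric-case form c·λ² > 1, with average degree c = (c_in + (q-1)·c_out)/q and λ = (c_in - c_out)/(q·c).

```python
import random


def sample_sbm(n, q, c_in, c_out, seed=0):
    """Sample a sparse stochastic block model (illustrative helper).

    Assumes q equally likely groups and edge probabilities
    p_in = c_in / n within a group, p_out = c_out / n between groups.
    Returns (labels, edges) with each edge stored as (u, v), u < v.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(q) for _ in range(n)]
    p_in, p_out = c_in / n, c_out / n
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


def kesten_stigum(q, c_in, c_out):
    """Return (c, lam, snr) for the symmetric q-group model.

    c   : average degree
    lam : second eigenvalue of the label-transition matrix along an edge
    snr : c * lam**2; the Kesten-Stigum condition is snr > 1
    """
    c = (c_in + (q - 1) * c_out) / q
    lam = (c_in - c_out) / (q * c)
    return c, lam, c * lam ** 2


if __name__ == "__main__":
    for c_in, c_out in [(5.0, 1.0), (4.0, 2.0)]:
        c, lam, snr = kesten_stigum(2, c_in, c_out)
        side = "above" if snr > 1 else "below"
        print(f"c_in={c_in}, c_out={c_out}: c={c:.2f}, snr={snr:.3f} ({side} KS)")
```

For q = 2, the pair (c_in, c_out) = (5, 1) gives c = 3 and c·λ² = 4/3, which is above the threshold, while (4, 2) gives the same average degree but c·λ² = 1/3, which is below it. Both graphs look equally dense, and only the first carries enough signal for BP's trivial fixed point to become unstable.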

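RESULTS

Research Questions

RQ1: Can planted community structure be detected in sparse graphs and distinguished from an Erdős-Rényi graph?
RQ2: What are the exact thresholds for detection, weak recovery, and exact recovery in the stochastic block model?
RQ3: Is belief propagation optimal for recovery above the information-theoretic threshold?
RQ4: How do phase transitions arise from the posterior distribution and its marginals in sparse graphs?
RQ5: How do the fixed points of BP and their stability relate to spectral algorithms for community detection?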

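Key Findings

Information-theoretic and computational thresholds divide detection and recovery into three regimes: impossible, possible but hard, and efficiently solvable.
Above the Kesten-Stigum threshold weak recovery is achievable, and BP-based methods are shown to be effective in some cases.
The posterior distribution can be mapped to a Boltzmann distribution, which links community detection to Ising/Potts models and their phase transitions.
On sparse graphs, belief propagation yields marginals that maximize the expected accuracy of the labeling, and the stability of its fixed points predicts detectability.
The non-backtracking spectral method matches the detectability threshold and provides an efficient algorithm in the regime above it.
On sparse, locally treelike graphs BP is asymptotically correct; short loops in real networks can degrade it, but not severely.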
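
The non-backtracking spectral method mentioned in these findings can be illustrated with a small sketch. It is not the paper's code. It assumes two groups, NumPy being available, and a dense matrix that is only practical for small graphs. The helper names (`sample_sbm`, `non_backtracking_matrix`, `spectral_labels`, `accuracy`) are hypothetical. The matrix B is indexed by directed edges, with B[(u→v), (v→x)] = 1 whenever x ≠ u. Vertices are labeled by the sign of the eigenvector belonging to the second-largest real eigenvalue, summed over incoming edges.

```python
import random

import numpy as np


def sample_sbm(n, c_in, c_out, seed=0):
    """Two-group sparse SBM: edge probability c_in/n inside, c_out/n across."""
    rng = random.Random(seed)
    labels = [rng.randrange(2) for _ in range(n)]
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)
             if rng.random() < (c_in if labels[u] == labels[v] else c_out) / n]
    return labels, edges


def non_backtracking_matrix(n, edges):
    """Build B over directed edges: B[(u->v), (v->x)] = 1 if x != u."""
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    directed = [(u, v) for u in range(n) for v in nbrs[u]]
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v), i in index.items():
        for x in nbrs[v]:
            if x != u:  # forbid immediately stepping back to u
                B[i, index[(v, x)]] = 1.0
    return B, directed


def spectral_labels(n, edges):
    """Label vertices by the sign of the second eigenvector of B."""
    B, directed = non_backtracking_matrix(n, edges)
    vals, vecs = np.linalg.eig(B)
    second = np.argsort(-vals.real)[1]  # second-largest real part
    x = vecs[:, second].real
    score = np.zeros(n)
    for (u, v), xi in zip(directed, x):
        score[v] += xi  # aggregate over edges pointing into v
    return [1 if s > 0 else 0 for s in score]


def accuracy(truth, guess):
    """Fraction of agreeing labels, maximized over swapping the two groups."""
    agree = sum(t == g for t, g in zip(truth, guess)) / len(truth)
    return max(agree, 1 - agree)


if __name__ == "__main__":
    labels, edges = sample_sbm(200, 9.0, 1.0, seed=1)
    print("accuracy:", accuracy(labels, spectral_labels(200, edges)))
```

The parameters in the usage example sit well above the Kesten-Stigum threshold (c = 5, c·λ² = 3.2), so the informative eigenvalue, near (c_in - c_out)/2 = 4, should separate clearly from the bulk of radius about √c. Closer to the threshold the gap shrinks and accuracy drops toward chance. A sparse eigensolver would be the natural choice for larger graphs.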


This review was generated by AI and checked by a human editor.