QUICK REVIEW

[Paper Review] Algorithm-Agnostic Explainability for Unsupervised Clustering

Charles A. Ellis, Mohammad S.E. Sendi|arXiv (Cornell University)|May 17, 2021

Explainable Artificial Intelligence (XAI) | References: 44 | Citations: 27

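One-Sentence Summary

This paper proposes two algorithm-agnostic explainability methods, one global (G2PC) and one local (L2PC), for explaining unsupervised clustering across multiple algorithms, and demonstrates them on synthetic data and on high-dimensional fMRI connectivity data.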

ABSTRACT

Supervised machine learning explainability has developed rapidly in recent years. However, clustering explainability has lagged behind. Here, we demonstrate the first adaptation of model-agnostic explainability methods to explain unsupervised clustering. We present two novel "algorithm-agnostic" explainability methods - global permutation percent change (G2PC) and local perturbation percent change (L2PC) - that identify feature importance globally to a clustering algorithm and locally to the clustering of individual samples. The methods are (1) easy to implement and (2) broadly applicable across clustering algorithms, which could make them highly impactful. We demonstrate the utility of the methods for explaining five popular clustering methods on low-dimensional synthetic datasets and on high-dimensional functional network connectivity data extracted from a resting-state functional magnetic resonance imaging dataset of 151 individuals with schizophrenia and 160 controls. Our results are consistent with existing literature while also shedding new light on how changes in brain connectivity may lead to schizophrenia symptoms. We further compare the explanations from our methods to an interpretable classifier and find them to be highly similar. Our proposed methods robustly explain multiple clustering algorithms and could facilitate new insights into many applications. We hope this study will greatly accelerate the development of the field of clustering explainability.

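Motivation and Objectives

Advance clustering explainability by adapting model-agnostic explainability methods to unsupervised clustering.
Provide interpretable feature importance, both globally and locally, for a range of clustering algorithms.
Demonstrate utility on low-dimensional synthetic data and on high-dimensional brain connectivity data derived from fMRI.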

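Proposed Method

Develop global permutation percent change (G2PC), which quantifies feature importance globally for a clustering algorithm.
Develop local perturbation percent change (L2PC), which quantifies feature importance for the clustering of individual samples.
Show that the methods apply to multiple clustering algorithms (algorithm-agnostic).
Validate the explanations on synthetic datasets and on high-dimensional resting-state fMRI connectivity data.
Compare the explanations with those of an interpretable classifier to assess their validity.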
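The two methods can be illustrated with a minimal sketch. This review does not give the exact procedures, so the code assumes that "percent change" means the percentage of cluster assignments that change after a feature is permuted across samples (G2PC) or after one sample's feature is replaced with values drawn from the rest of the dataset (L2PC). The names `assign`, `g2pc`, and `l2pc`, the repeat counts, and the toy data are hypothetical, and the nearest-centroid rule is only a stand-in for any fitted clustering algorithm.

```python
import numpy as np


def g2pc(assign, X, n_repeats=20, seed=0):
    """Global sketch: percent of samples that switch cluster when one feature is permuted."""
    rng = np.random.default_rng(seed)
    base = assign(X)  # cluster labels on the unmodified data
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        changed = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(X[:, j])  # shuffle feature j across samples
            changed.append(np.mean(assign(Xp) != base))
        importance[j] = 100.0 * np.mean(changed)
    return importance


def l2pc(assign, X, i, n_perturb=200, seed=0):
    """Local sketch: percent of perturbations of sample i that change its cluster."""
    rng = np.random.default_rng(seed)
    base = assign(X[i:i + 1])[0]  # cluster label of the unmodified sample
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xi = np.tile(X[i], (n_perturb, 1))  # n_perturb copies of sample i
        Xi[:, j] = rng.choice(X[:, j], size=n_perturb)  # resample feature j from the data
        importance[j] = 100.0 * np.mean(assign(Xi) != base)
    return importance


# Toy data (assumption): feature 0 separates two groups, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([
    np.concatenate([rng.normal(-5, 1, 100), rng.normal(5, 1, 100)]),
    rng.normal(0, 1, 200),
])

# Stand-in for a fitted clustering model: assign each row to the nearest fixed centroid.
centroids = np.array([[-5.0, 0.0], [5.0, 0.0]])


def assign(data):
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


global_imp = g2pc(assign, X)
local_imp = l2pc(assign, X, i=0)
print("G2PC per feature (%):", global_imp)
print("L2PC for sample 0 (%):", local_imp)
```

Because both functions only call `assign`, any clustering algorithm that can label new or modified samples could be plugged in, which is the sense in which the sketch is algorithm-agnostic. In this toy setup feature 0 should receive a much larger percent change than feature 1.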

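Experimental Results

Research Questions

RQ1: Can model-agnostic explainability techniques be adapted to explain unsupervised clustering?
RQ2: Do G2PC and L2PC provide stable and meaningful feature importance, both globally and per sample, across different clustering algorithms?
RQ3: Are the explanations consistent with those of an interpretable classifier and with the existing brain connectivity literature?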

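Key Findings

G2PC and L2PC successfully identify global and local feature importance for several clustering methods.
The methods are easy to implement and broadly applicable across clustering algorithms.
The explanations for the fMRI connectivity data from individuals with schizophrenia and controls are consistent with existing literature and offer new insight into changes in brain connectivity.
The explanations from the proposed methods are highly similar to those of an interpretable classifier.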

This review was generated by AI and checked by a human editor.