QUICK REVIEW

[Paper Review] When Explanations Lie: Why Many Modified BP Attributions Fail

Leon Sixt, Maximilian Granz|arXiv (Cornell University)|Dec 20, 2019

Explainable Artificial Intelligence (XAI)|References: 47|Cited by: 42

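One-Sentence Summary

This paper shows that most modified backpropagation (BP) attribution methods collapse across layers onto a single dominant direction (rank-1), making their explanations largely independent of later-layer parameters, with DeepLIFT as the notable exception; it also introduces the cosine similarity convergence (CSC) metric to diagnose this behavior.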

ABSTRACT

Attribution methods aim to explain a neural network's prediction by highlighting the most relevant image areas. A popular approach is to backpropagate (BP) a custom relevance score using modified rules, rather than the gradient. We analyze an extensive set of modified BP methods: Deep Taylor Decomposition, Layer-wise Relevance Propagation (LRP), Excitation BP, PatternAttribution, DeepLIFT, Deconv, RectGrad, and Guided BP. We find empirically that the explanations of all mentioned methods, except for DeepLIFT, are independent of the parameters of later layers. We provide theoretical insights for this surprising behavior and also analyze why DeepLIFT does not suffer from this limitation. Empirically, we measure how information of later layers is ignored by using our new metric, cosine similarity convergence (CSC). The paper provides a framework to assess the faithfulness of new and existing modified BP methods theoretically and empirically. For code see: https://github.com/berleon/when-explanations-lie

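Motivation and Objectives

Evaluate the faithfulness of modified BP attribution methods on common architectures (VGG-16, ResNet-50) and datasets (CIFAR-10, ImageNet).
Explain why many modified BP rules fail sanity checks and class-sensitivity tests.
Provide theoretical and empirical tools for diagnosing convergence to a rank-1 matrix in backpropagated explanations.
Offer guidance on when modified BP methods can be trusted and how they might be corrected.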

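Proposed Method

Theoretical analysis shows that the z+ rule yields a product of non-negative matrices across layers, which converges to a rank-1 matrix under the paper's assumptions.
Define and use the cosine similarity convergence (CSC) metric to quantify how relevance converges across layers.
Evaluate empirically on networks (VGG-16, ResNet-50) and datasets (CIFAR-10, ImageNet), using random-logit and parameter-randomization sanity checks.
Compare multiple modified BP methods (LRP variants, Deep Taylor Decomposition, PatternAttribution, DeepLIFT, Guided BP, Deconv, RectGrad) and identify DeepLIFT as the exception.
Analyze PatternAttribution and PatternNet via singular value ratios to understand their convergence behavior.
Introduce a DeepLIFT ablation variant that decouples the positive and negative chains to demonstrate the convergence property.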
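
The rank-1 argument above can be illustrated with a small numerical sketch. It is not the authors' code: the random Gaussian matrices, the use of `np.abs` as a stand-in for a z+-style non-negative rule, and the helper names `singular_value_ratio` and `chain_ratios` are all assumptions made for illustration. The sketch multiplies a chain of matrices and tracks the ratio of the first to the second singular value, the same kind of quantity the summary mentions for PatternAttribution and PatternNet.

```python
import numpy as np

def singular_value_ratio(matrix):
    """Return sigma_1 / sigma_2; a large value means the matrix is close to rank-1."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return s[0] / s[1]

def chain_ratios(depth, dim, nonnegative, seed=0):
    """Multiply `depth` random matrices and record sigma_1 / sigma_2 after each step.

    Hypothetical helper: the matrices are random, not trained network weights.
    """
    rng = np.random.default_rng(seed)
    product = np.eye(dim)
    ratios = []
    for _ in range(depth):
        layer = rng.standard_normal((dim, dim))
        if nonnegative:
            # Assumption: absolute values stand in for a non-negative (z+-style) rule.
            layer = np.abs(layer)
        product = layer @ product
        # Rescaling avoids overflow and leaves the singular value ratio unchanged.
        product = product / np.linalg.norm(product)
        ratios.append(singular_value_ratio(product))
    return ratios

nonneg = chain_ratios(depth=10, dim=32, nonnegative=True)
mixed = chain_ratios(depth=10, dim=32, nonnegative=False)
print("non-negative chain:", [f"{r:.3g}" for r in nonneg])
print("mixed-sign chain:  ", [f"{r:.3g}" for r in mixed])
```

In this toy setting the ratio for the non-negative chain should grow quickly with depth, while the mixed-sign chain stays far from rank-1 over the same number of layers. How closely this mirrors a trained VGG-16 or ResNet-50 is not something the sketch can establish.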

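Experimental Results

Research Questions

RQ1: Do modified BP attribution methods produce explanations that depend on the parameters of later layers?
RQ2: Do these methods generally converge across layers to a single dominant direction, reducing class sensitivity?
RQ3: Why does DeepLIFT avoid rank-1 convergence, and can insights from it improve other methods?
RQ4: Can a metric such as CSC reliably diagnose the convergence and faithfulness of attribution methods?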

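Key Findings

Most modified BP methods (all those studied except DeepLIFT) converge to a rank-1 matrix, making their explanations insensitive to later layers.
The z+ rule and related methods produce, under the stated conditions, a chain of non-negative matrices whose product converges to a rank-1 matrix.
CSC effectively tracks how information from later layers is lost along the attribution chain across layers.
DeepLIFT does not follow the same convergence; its rule of separating positive and negative contributions lets it avoid the rank-1 collapse.
Negative relevance is identified as a key ingredient missing from several converging methods, suggesting a potential route to better class sensitivity.
Across multiple architectures, converging methods yield highly similar saliency maps when the last layer's parameters are changed or the logits are replaced with random values.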
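
A toy sketch in the spirit of CSC can make the loss of later-layer information concrete. It is not the paper's implementation: the random matrices replace a real network's modified BP rules, the absolute cosine similarity is a simplification chosen here so that a sign flip still counts as the same direction, and the names `random_layers`, `abs_cosine`, and `csc_curve` are hypothetical. Two unrelated random relevance vectors are pushed through the same chain, and their similarity is recorded after every layer.

```python
import numpy as np

def random_layers(depth, dim, nonnegative, seed=0):
    """Hypothetical stand-in for per-layer propagation matrices (not trained weights)."""
    rng = np.random.default_rng(seed)
    layers = [rng.standard_normal((dim, dim)) for _ in range(depth)]
    return [np.abs(w) for w in layers] if nonnegative else layers

def abs_cosine(a, b):
    """Absolute cosine similarity (simplification: ignores the sign of the direction)."""
    return float(abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def csc_curve(layers, seed=1):
    """Propagate two random relevance vectors and record their similarity per layer."""
    rng = np.random.default_rng(seed)
    dim = layers[0].shape[1]
    r1, r2 = rng.standard_normal(dim), rng.standard_normal(dim)
    curve = [abs_cosine(r1, r2)]
    for w in layers:
        # Both vectors pass through the same layer; only the starting point differs.
        r1, r2 = w @ r1, w @ r2
        # Normalizing keeps the numbers bounded and does not change the cosine.
        r1, r2 = r1 / np.linalg.norm(r1), r2 / np.linalg.norm(r2)
        curve.append(abs_cosine(r1, r2))
    return curve

nonneg_curve = csc_curve(random_layers(10, 32, nonnegative=True))
mixed_curve = csc_curve(random_layers(10, 32, nonnegative=False))
print("non-negative chain:", [f"{c:.4f}" for c in nonneg_curve])
print("mixed-sign chain:  ", [f"{c:.4f}" for c in mixed_curve])
```

If the similarity approaches 1 regardless of the starting vectors, whatever distinguished the two relevance signals at the start of the chain has been lost by the end. This is the behavior the findings above attribute to the converging methods.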

This review was generated by AI and reviewed by a human editor.