QUICK REVIEW

[Paper Review] Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

Laura Rieger, Chandan Singh|arXiv (Cornell University)|Sep 30, 2019

Explainable Artificial Intelligence (XAI) | References: 49 | Cited by: 37

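ONE-SENTENCE SUMMARY

CDEP adds explanation-based regularization to neural networks: it penalizes model explanations so that they align with domain knowledge and rely less on spurious correlations, improving accuracy and fairness across tasks.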

ABSTRACT

For an explanation of a deep learning model to be effective, it must provide both insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stop at the first step, providing practitioners with insight into a model, but no way to act on it. In this paper, we propose contextual decomposition explanation penalization (CDEP), a method which enables practitioners to leverage existing explanation methods in order to increase the predictive accuracy of deep learning models. In particular, when shown that a model has incorrectly assigned importance to some features, CDEP enables practitioners to correct these errors by directly regularizing the provided explanations. Using explanations provided by contextual decomposition (CD) (Murdoch et al., 2018), we demonstrate the ability of our method to increase performance on an array of toy and real datasets.

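Motivation and goals

Motivate the need for explanations that support actionable improvements to a model, not just insight into it.
Introduce Contextual Decomposition Explanation Penalization (CDEP), which injects domain knowledge through explanations.
Show that penalizing explanations reduces reliance on spurious features and improves generalization across datasets.
Show that CDEP is efficient and applies to a range of architectures and tasks.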

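Proposed method

Add an explanation loss term to the training loss that penalizes the deviation of the model's explanation from a user-provided target (expl_X).
Use Contextual Decomposition (CD) to obtain feature importances and interactions (beta(x_S), gamma(x)).
Apply a softmax to the CD scores to obtain probabilities, and compare them with expl_X through an L1 loss in the explanation term.
Generalize CDEP to any differentiable explanation method; give a concrete implementation with lambda as the regularization weight.
Encode domain knowledge as ground-truth explanations, including rules that identify spurious regions or non-salient features.
Highlight the computational advantages of CD-based attribution over gradient-based explanations in memory use and forward/backward-pass efficiency.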
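
The loss described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the CD scores `beta` (contribution of the penalized feature group S) and `gamma` (contribution of everything else) are taken as given inputs rather than computed from a network, their sum is assumed to equal the logits, and the softmax is assumed to be taken over the pair (beta, gamma) for each class. The function names `explanation_penalty` and `cdep_loss` are hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class.
    probs = softmax(logits, axis=1)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels]).mean()


def explanation_penalty(beta, gamma, expl_target):
    # beta, gamma, expl_target: arrays of shape (batch, classes).
    # Assumption: normalize each (beta, gamma) pair with a two-way softmax,
    # giving the share of each class score attributed to feature group S.
    share = softmax(np.stack([beta, gamma], axis=-1), axis=-1)[..., 0]
    # L1 distance to the user-provided target explanation, averaged over the batch.
    return np.abs(share - expl_target).sum(axis=1).mean()


def cdep_loss(beta, gamma, labels, expl_target, lam):
    # Assumption: the CD scores sum to the logits (beta + gamma).
    logits = beta + gamma
    pred_loss = cross_entropy(logits, labels)
    expl_loss = explanation_penalty(beta, gamma, expl_target)
    return pred_loss + lam * expl_loss


# Toy batch: two examples, two classes. A target of zeros encodes the domain
# knowledge that feature group S is spurious and should carry no importance.
beta = np.array([[2.0, -1.0], [0.0, 0.0]])
gamma = np.array([[0.0, 1.0], [1.0, -1.0]])
labels = np.array([0, 1])
expl_target = np.zeros_like(beta)

total = cdep_loss(beta, gamma, labels, expl_target, lam=1.0)
```

With `lam=0` the objective reduces to ordinary cross-entropy; increasing `lam` pushes the share of the score attributed to S toward the target. In actual training the CD scores would be differentiable functions of the network parameters, so an autodiff framework would be needed in place of NumPy.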

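Experimental results

Research questions

RQ1: Can penalizing explanations with CDEP steer learning toward correct, domain-consistent reasons for predictions?
RQ2: Does CDEP reduce reliance on spurious signals and improve generalization under dataset bias and distribution shift?
RQ3: How does CDEP compare with gradient-based attribution penalties on vision, language, and fairness-related tasks?
RQ4: What computational advantages does CD-based explanation penalization have over gradient-based methods?

Main findings

CDEP improves predictive performance while keeping explanations consistent with prior knowledge across datasets.
On the ISIC skin cancer dataset, CDEP reduces reliance on spuriously correlated patches and improves AUC and F1 on both the biased and unbiased test sets.
On ColorMNIST, CDEP shifts the model from color cues to shape cues, achieving higher accuracy than the baselines on the biased data.
On the COMPAS dataset, CDEP reduces the disparity in misclassification rates between racial groups without sacrificing accuracy.
On the SST dataset, when trained on biased text data, CDEP improves accuracy by ignoring the injected spurious signals.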


This review was generated by AI and reviewed by a human editor.