QUICK REVIEW

[Paper Review] Generating Counterfactual Explanations with Natural Language

Lisa Anne Hendricks, Ronghang Hu|arXiv (Cornell University)|Jun 26, 2018

Explainable Artificial Intelligence (XAI) | Cited by 53

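One-sentence summary

This paper proposes a method for generating counterfactual textual explanations for image classification: it identifies evidence that is discriminative for counter-classes, checks whether that evidence is present in the image, and negates it to produce fluent counterfactual text. The method is evaluated on Caltech-UCSD Birds, using automatic metrics that measure phrase error and the effect of adding counterfactual text.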

ABSTRACT

Natural language explanations of deep neural network decisions provide an intuitive way for an AI agent to articulate a reasoning process. Current textual explanations learn to discuss class discriminative features in an image. However, it is also helpful to understand which attributes might change a classification decision if present in an image (e.g., "This is not a Scarlet Tanager because it does not have black wings."). We call such textual explanations counterfactual explanations, and propose an intuitive method to generate counterfactual explanations by inspecting which evidence in an input is missing, but might contribute to a different classification decision if present in the image. To demonstrate our method we consider a fine-grained image classification task in which we take as input an image and a counterfactual class and output text which explains why the image does not belong to a counterfactual class. We then analyze our generated counterfactual explanations both qualitatively and quantitatively using proposed automatic metrics.

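Motivation and goals

Explain why an image does not belong to a counter-class, rather than only why it belongs to its predicted class.
Use semantic, non-image evidence to generate informative counterfactual statements.
Develop an end-to-end pipeline that predicts counterfactual evidence, verifies that it is absent from the image, and generates fluent counterfactual text.
Evaluate the quality and discriminativeness of counterfactual explanations on a fine-grained dataset, and propose metrics for doing so.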

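Proposed method

Predict candidate counterfactual evidence by extracting noun phrases from explanations of the counter-class.
Verify whether the counterfactual evidence is present in the image using two evidence checkers: Counterfactual: Classifier and Counterfactual: Phrase-Critic.
Negate the selected phrases and compose a sentence that contrasts the image with the counter-class (e.g., "This is not a X because...").
Use a rule-based negation system to form the final counterfactual sentence and append it to the base explanation.
Optionally ground phrases in the image with a retrieval and grounding model to inform the phrase-critic score.
Evaluate on the Caltech-UCSD Birds dataset using phrase error and accuracy with counterfactual text.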
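The select, negate, and compose steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold, the article rule, and the score dictionary are all hypothetical, and the noun-phrase extraction step is assumed to have already produced the candidate phrases. The score function stands in for an evidence checker such as the classifier or the phrase-critic, returning a higher value when a phrase appears to be present in the image.

```python
from typing import Callable, List


def select_absent_evidence(
    candidate_phrases: List[str],
    presence_score: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Keep phrases scored as absent from the image, least likely first."""
    scored = [(presence_score(phrase), phrase) for phrase in candidate_phrases]
    absent = [(score, phrase) for score, phrase in scored if score < threshold]
    return [phrase for _, phrase in sorted(absent)]


def negate(phrase: str) -> str:
    """Toy rule-based negation of a single attribute phrase."""
    return f"does not have {phrase}"


def build_counterfactual(counter_class: str, absent_phrases: List[str]) -> str:
    """Compose one counterfactual sentence from the top absent phrase."""
    if not absent_phrases:
        return ""  # nothing to say if no evidence was judged absent
    # Simplistic article choice based on the first letter (an assumption).
    article = "an" if counter_class[0].lower() in "aeiou" else "a"
    return (
        f"This is not {article} {counter_class} "
        f"because it {negate(absent_phrases[0])}."
    )


if __name__ == "__main__":
    # Hypothetical checker scores for phrases taken from counter-class text.
    scores = {"black wings": 0.1, "a red body": 0.9}
    absent = select_absent_evidence(list(scores), lambda p: scores[p])
    print(build_counterfactual("Scarlet Tanager", absent))
```

In this sketch the sentence could then be appended to the base explanation, as the list above describes. A real system would replace the score dictionary with a learned checker.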

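Experimental results

Research questions

RQ1: Do counterfactual explanations improve interpretability by pointing out attributes missing from the image that would change the class decision?
RQ2: How accurately does the model predict and verify counterfactual evidence that is not in the image?
RQ3: Does adding counterfactual text reduce how well the explanation predicts the class, indicating that the added text is itself class discriminative?
RQ4: Which evidence checker (Classifier or Phrase-Critic) better supports robust counterfactual text generation?

Key findings

Both counterfactual models (CF: Classifier and CF: Phrase-Critic) outperform the baseline at reducing phrase error in generated explanations.
For all models, adding counterfactual text lowers sentence-level accuracy, indicating that the text affects the class-discriminative judgment.
The Phrase-Critic model generally outperforms the baseline and the classifier on phrase grounding and has a lower phrase error rate, indicating improved grounding of counterfactual attributes.
Grounding-based methods benefit from external data (such as Visual Genome) and phrase-level grounding, which lets them select counterfactual evidence more effectively.
The baseline remains strong on phrase error, but the proposed counterfactual methods surpass it at reducing incorrect counterfactual mentions.
Qualitative examples show counterfactual explanations such as "This is not a Bobolink because it does not have a yellow nape," which help clarify differences between similar bird species.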

This review was generated by AI and reviewed by a human editor.