QUICK REVIEW

[Paper Review] Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge

Pasquale Minervini, Sebastian Riedel|arXiv (Cornell University)|Aug 26, 2018

Adversarial Robustness in Machine Learning | References: 28 | Cited by: 43

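One-Sentence Summary

The paper generates adversarial NLI examples that violate first-order logic background rules and uses them to adversarially regularise neural NLI models, improving robustness to adversarial inputs and reducing background knowledge violations.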

ABSTRACT

Adversarial examples are inputs to machine learning models designed to cause the model to make a mistake. They are useful for understanding the shortcomings of machine learning models, interpreting their results, and for regularisation. In NLP, however, most example generation strategies produce input text by using known, pre-specified semantic transformations, requiring significant manual effort and in-depth understanding of the problem and domain. In this paper, we investigate the problem of automatically generating adversarial examples that violate a set of given First-Order Logic constraints in Natural Language Inference (NLI). We reduce the problem of identifying such adversarial examples to a combinatorial optimisation problem, by maximising a quantity measuring the degree of violation of such constraints and by using a language model for generating linguistically-plausible examples. Furthermore, we propose a method for adversarially regularising neural NLI models for incorporating background knowledge. Our results show that, while the proposed method does not always improve results on the SNLI and MultiNLI datasets, it significantly and consistently increases the predictive accuracy on adversarially-crafted datasets -- up to a 79.6% relative improvement -- while drastically reducing the number of background knowledge violations. Furthermore, we show that adversarial examples transfer among model architectures, and that the proposed adversarial training procedure improves the robustness of NLI models to adversarial examples.

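Research Motivation and Goals

Motivate and study adversarial examples in Natural Language Inference (NLI), treating them as violations of logical background knowledge.
Develop an optimisation-based method for generating adversarial examples that violate first-order logic rules.
Propose an adversarial training scheme that uses these adversarial examples to regularise NLI models.
Evaluate robustness gains and background knowledge violations on SNLI and MultiNLI.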

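Proposed Method

Represent entailment, contradiction, and neutral as binary predicates so that background knowledge can be encoded as first-order logic rules R1–R5.
Define an inconsistency loss J_I that measures how strongly a substitution set S violates a rule (for example, by comparing p(con|s1,s2) with p(con|s2,s1)), using the Gödel t-norm for conjunctions.
Constrain the generated adversarial examples with a language model to keep perplexity low and the sentences linguistically plausible.
Optimise over substitution sets S to maximise J_I(S) subject to log p_L(S) ≤ τ, which yields the adversarial sentences.
For adversarial regularisation, jointly minimise the data loss J_D and the maximised inconsistency loss λ max_S J_I(S;Θ) during training (Eq. (6)).
Use an iterative procedure (Algorithm 1) that alternates between generating adversarial substitutions and updating the model parameters.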
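The steps above can be illustrated with a minimal Python sketch for the symmetry-of-contradiction case only. Several things here are assumptions rather than details taken from the paper: the hinge form max(0, body − head) for the inconsistency loss, the brute-force search over a small candidate list, and the hypothetical names `p_con` (a model's contradiction probability) and `is_plausible` (a stand-in for the language-model constraint). It is a toy illustration, not the authors' implementation.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Pair = Tuple[str, str]  # (s1, s2): one candidate substitution for the rule's variables


def godel_and(probs: Sequence[float]) -> float:
    """Gödel t-norm: a conjunction scores as the minimum of its atoms."""
    return min(probs)


def inconsistency_loss(body_probs: Sequence[float], head_prob: float) -> float:
    """Assumed hinge form: positive only when the rule body is believed
    more strongly than the rule head."""
    return max(0.0, godel_and(body_probs) - head_prob)


def symmetry_violation(p_con: Callable[[str, str], float], s1: str, s2: str) -> float:
    """Rule con(s1, s2) => con(s2, s1): compare p(con|s1,s2) with p(con|s2,s1)."""
    return inconsistency_loss([p_con(s1, s2)], p_con(s2, s1))


def most_violating(
    candidates: Iterable[Pair],
    p_con: Callable[[str, str], float],
    is_plausible: Callable[[Pair], bool],
) -> Optional[Pair]:
    """Brute-force stand-in for the constrained search: keep candidates that
    pass the (hypothetical) language-model check, return the worst violator."""
    feasible = [c for c in candidates if is_plausible(c)]
    if not feasible:
        return None
    return max(feasible, key=lambda c: symmetry_violation(p_con, c[0], c[1]))


def regularised_loss(data_loss: float, violations: Sequence[float], lam: float) -> float:
    """J_D + lambda * max_S J_I(S), evaluated on already-computed violation scores."""
    return data_loss + lam * max(violations)
```

In a training loop, the search and the parameter update would alternate: find the most violating substitutions for the current parameters, then take a gradient step on the regularised loss. A real implementation would compute these quantities with differentiable tensors rather than Python floats.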

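Experimental Results

Research Questions

RQ1: Can adversarial examples be generated that significantly violate the logical background rules specified for NLI?
RQ2: Does adversarial regularisation improve robustness to adversarial inputs while maintaining or improving standard NLI accuracy?
RQ3: Do adversarial examples transfer across different NLI architectures?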

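Key Findings

Adversarial regularisation yields up to a 79.6% relative accuracy improvement on adversarially crafted datasets.
After regularisation, all evaluated models show fewer background knowledge violations, although accuracy on SNLI/MultiNLI does not always improve.
Adversarial examples transfer across model architectures, indicating a cross-model robustness effect.
Regularised models are more robust to adversarial inputs, and their rule violation rates on the training data (e.g. R2, symmetry) drop markedly.
Without regularisation, models often violate logical background knowledge and pick up dataset artifacts, which limits genuine understanding of entailment.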
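As a rough illustration of how a symmetry violation rate like the one mentioned above could be measured, the sketch below counts sentence pairs that a model labels as contradiction in one direction but not in the other. The counting criterion, the choice of denominator (all pairs), and the hypothetical `predict` callable are assumptions made for this example; this summary does not specify the paper's exact metric.

```python
from typing import Callable, Iterable, Tuple

Pair = Tuple[str, str]


def symmetry_violation_rate(
    pairs: Iterable[Pair],
    predict: Callable[[str, str], str],
) -> float:
    """Fraction of pairs where the model predicts 'contradiction' for (s1, s2)
    but a different label for (s2, s1). Returns 0.0 for an empty input."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    violations = sum(
        1
        for s1, s2 in pairs
        if predict(s1, s2) == "contradiction" and predict(s2, s1) != "contradiction"
    )
    return violations / len(pairs)
```

Comparing this rate for a baseline model and a regularised model on the same pairs gives a simple before/after check of whether the symmetry rule is respected more often.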

This review was generated by AI and reviewed by human editors.