QUICK REVIEW

[Paper Review] Natural Language Adversarial Defense through Synonym Encoding

Xiaosen Wang, Hao Jin|arXiv (Cornell University)|Sep 15, 2019

Adversarial Robustness in Machine Learning | References: 43 | Cited by: 33

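One-Sentence Summary

SEM inserts a synonym encoder in front of the model that maps each cluster of synonyms to a unique encoding, then trains on the original data. This defends against synonym-substitution adversarial attacks without changing the model architecture, improving robustness while preserving accuracy on benign data.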

ABSTRACT

In the area of natural language processing, deep learning models are recently known to be vulnerable to various types of adversarial perturbations, but relatively few works are done on the defense side. Especially, there exists few effective defense method against the successful synonym substitution based attacks that preserve the syntactic structure and semantic information of the original text while fooling the deep learning models. We contribute in this direction and propose a novel adversarial defense method called Synonym Encoding Method (SEM). Specifically, SEM inserts an encoder before the input layer of the target model to map each cluster of synonyms to a unique encoding and trains the model to eliminate possible adversarial perturbations without modifying the network architecture or adding extra data. Extensive experiments demonstrate that SEM can effectively defend the current synonym substitution based attacks and block the transferability of adversarial examples. SEM is also easy and efficient to scale to large models and big datasets.

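Motivation and Goals

Advance NLP models that are robust to adversaries using synonym substitution.
Propose a defense that leaves the model architecture unchanged and avoids extra data or large-scale retraining.
Develop an encoder that, placed before the input layer, maps each group of synonyms to a shared encoding.
Demonstrate scalability to large models across multiple datasets and architectures.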

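Proposed Method

Build an encoder E that clusters synonyms in the embedding space and maps each cluster to a common encoding.
Insert E before the model's input layer without changing the architecture, and train on the standard data.
Cluster synonyms by Euclidean distance using Syn(w, delta, k), the k nearest neighbors of word w within distance delta; choose k and delta experimentally.
Implement the encoder on GloVe vectors post-processed with counter-fitting, which enforces synonym constraints.
Tune the hyperparameters delta and k to balance robustness against benign accuracy (delta ≈ 0.5, k ≈ 10).
Evaluate SEM on three datasets with CNN, LSTM, Bi-LSTM, and BERT against three synonym-substitution attacks (GSA, PWWS, GA).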
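
The clustering step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the greedy assignment rule, the strict `<` distance test, the helper names (`syn`, `build_encoder`, `encode`), and the two-dimensional toy vectors are all hypothetical. A real setup would load counter-fitted GloVe vectors instead.

```python
import numpy as np

def syn(word, vectors, delta, k):
    """Up to k nearest neighbours of `word` within Euclidean distance delta,
    sorted from closest to farthest (assumed reading of Syn(w, delta, k))."""
    close = sorted(
        (float(np.linalg.norm(vectors[word] - vec)), other)
        for other, vec in vectors.items()
        if other != word
    )
    return [other for dist, other in close if dist < delta][:k]

def build_encoder(words_by_freq, vectors, delta=0.5, k=10):
    """Greedily map every word to a cluster representative.
    Words are visited from most to least frequent (assumed order)."""
    encoding = {}
    for word in words_by_freq:
        if word in encoding:
            continue
        neighbours = syn(word, vectors, delta, k)
        encoded = [n for n in neighbours if n in encoding]
        # Reuse the code of the closest already-encoded synonym, else start a new cluster.
        encoding[word] = encoding[encoded[0]] if encoded else word
        for n in neighbours:
            encoding.setdefault(n, encoding[word])
    return encoding

def encode(tokens, encoding):
    """Replace each token by its cluster representative before it reaches the model."""
    return [encoding.get(t, t) for t in tokens]

# Hypothetical toy embeddings, chosen only to make the clusters obvious.
toy_vectors = {
    "good": np.array([0.0, 0.0]),
    "great": np.array([0.3, 0.0]),
    "fine": np.array([0.0, 0.4]),
    "bad": np.array([5.0, 5.0]),
    "awful": np.array([5.2, 5.1]),
    "movie": np.array([-4.0, 3.0]),
}
toy_freq_order = ["movie", "good", "bad", "great", "fine", "awful"]

encoder = build_encoder(toy_freq_order, toy_vectors)
print(encode("great movie".split(), encoder))
print(encode("good movie".split(), encoder))
```

Both sentences encode to the same token sequence, so a model trained behind this encoder receives identical input for a text and its synonym-substituted variant.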

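Experimental Results

Research Questions

RQ1: Can a synonym-based front-end encoder improve robustness to synonym-substitution attacks without modifying the model or requiring extra data?
RQ2: How does SEM perform across architectures (CNN, RNN, BERT) and datasets under common synonym-based perturbations?
RQ3: How do the synonym-encoding hyperparameters (delta, k) and the traversal order affect robustness and benign accuracy?
RQ4: Does SEM affect the transferability of adversarial examples between models?
RQ5: How does SEM compare with adversarial training and IBP (interval bound propagation) in benign accuracy and robustness to attack?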

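Key Findings

SEM keeps benign accuracy close to that of normal training and is more robust than IBP, with a smaller accuracy trade-off.
Under GSA, PWWS, and GA attacks, SEM substantially improves the robustness of CNN, LSTM, Bi-LSTM, and BERT on IMDB, AG's News, and Yahoo! Answers.
SEM markedly reduces the transferability of adversarial examples: when the adversary crafts adversarial examples on other models, the SEM-defended model retains higher accuracy.
The hyperparameter analysis indicates that delta ≈ 0.5 and k ≈ 10 give a favorable trade-off between robustness and benign accuracy.
Traversing words in order of frequency improves robustness; high-frequency words contribute more to defense performance.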
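
To make the traversal-order finding concrete, the sketch below shows how the visiting order decides which word becomes a cluster's shared code. It is an illustration only: the greedy rule, the function name `assign_codes`, and the hand-written neighbour table are assumptions, and it shows only how the order changes the codes, not the robustness effect reported in the paper.

```python
def assign_codes(order, neighbours):
    """Greedy code assignment: the first word visited in a cluster becomes its code."""
    codes = {}
    for word in order:
        if word in codes:
            continue
        seen = [n for n in neighbours.get(word, []) if n in codes]
        # Join an existing cluster if a neighbour already has a code.
        codes[word] = codes[seen[0]] if seen else word
        for n in neighbours.get(word, []):
            codes.setdefault(n, codes[word])
    return codes

# Hypothetical synonym table; a real one would come from Syn(w, delta, k).
toy_neighbours = {
    "good": ["great", "superb"],
    "great": ["good", "superb"],
    "superb": ["great", "good"],
}
by_frequency = ["good", "great", "superb"]  # assumed most to least frequent
reverse_order = list(reversed(by_frequency))

freq_codes = assign_codes(by_frequency, toy_neighbours)
reverse_codes = assign_codes(reverse_order, toy_neighbours)
print(freq_codes)
print(reverse_codes)
```

With frequency order, every word in the cluster is encoded as the most frequent member ("good"); with the reversed order, the rarest member ("superb") becomes the code instead.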

This review was generated by AI and checked by a human editor.