QUICK REVIEW

[Paper Review] Adversarial Filters of Dataset Biases

Ronan Le Bras, Swabha Swayamdipta|arXiv (Cornell University)|Feb 10, 2020

Adversarial Robustness in Machine Learning | Cited by 125

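One-Sentence Summary

This paper analyzes AFLite, a model-based iterative filtering method that removes dataset biases to reduce spurious correlations; training on the filtered data improves out-of-distribution generalization, while in-distribution performance on major benchmarks drops substantially.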

ABSTRACT

Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite, which adversarially filters such dataset biases, as a means to mitigate the prevalent overestimation of machine performance. We provide a theoretical understanding for AFLite, by situating it in the generalized framework for optimum bias reduction. We present extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks. Finally, filtering results in a large drop in model performance (e.g., from 92% to 62% for SNLI), while human performance still remains high. Our work thus shows that such filtered datasets can pose new research challenges for robust generalization by serving as upgraded benchmarks.

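Motivation and Goals

Raise the problem that dataset biases lead to overestimating model performance.
Provide a theoretical framework for optimal bias reduction, along with a practical approximation of it.
Empirically validate AFLite on NLP and vision tasks.
Show how filtering out biases affects in-distribution and out-of-distribution performance.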

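Proposed Method

Formalize a representation-bias objective, and define AFOpt as the ideal but computationally intractable bias-reduction objective.
Introduce AFLite, a scalable approximation that uses a predictability score p(i) to iteratively remove highly predictable instances.
Compute p(i) from the out-of-sample predictions of models trained on random partitions of the data.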
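Use a greedy slicing procedure that removes up to k instances with the highest p(i) in each iteration, until the dataset reaches the target size n or early stopping is triggered by the threshold τ.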
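Use precomputed feature representations Φ(X) and a model family M to estimate predictability.
Demonstrate the method on NLP and vision benchmarks, including SNLI, MultiNLI, QNLI, and ImageNet.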
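
The filtering loop described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `aflite` and `train_nearest_centroid` are hypothetical, a nearest-centroid classifier stands in for the model family M, the parameters m (number of random partitions) and t (training-set size) and all default values are illustrative, and treating τ as a minimum predictability score for removal is this sketch's reading of the early-stopping criterion.

```python
import random
from collections import defaultdict


def train_nearest_centroid(features, labels):
    """Hypothetical stand-in for the model family M: one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        prev = sums.get(y, [0.0] * len(x))
        sums[y] = [s + v for s, v in zip(prev, x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        # Return the label whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def aflite(features, labels, n, m=20, t=None, k=5, tau=0.75, seed=0):
    """Return the indices of the instances kept after filtering.

    features: precomputed representation vectors (the role of Φ(X))
    n: target size, m: random partitions per iteration (assumed name),
    t: training-set size (assumed name; defaults to half the current data),
    k: slice size, tau: minimum predictability score for removal (assumed reading).
    """
    rng = random.Random(seed)
    kept = list(range(len(features)))
    while len(kept) > n:
        train_size = max(1, t if t is not None else len(kept) // 2)
        correct, total = defaultdict(int), defaultdict(int)
        for _ in range(m):
            shuffled = kept[:]
            rng.shuffle(shuffled)
            train, held_out = shuffled[:train_size], shuffled[train_size:]
            predict = train_nearest_centroid(
                [features[i] for i in train], [labels[i] for i in train]
            )
            # Record out-of-sample predictions only for held-out instances.
            for i in held_out:
                total[i] += 1
                correct[i] += int(predict(features[i]) == labels[i])
        # p(i): fraction of out-of-sample predictions that match the gold label.
        scores = {i: correct[i] / total[i] for i in kept if total[i] > 0}
        candidates = sorted(
            (i for i in scores if scores[i] >= tau),
            key=lambda i: scores[i],
            reverse=True,
        )
        budget = min(k, len(kept) - n)  # never shrink below the target size
        removed = set(candidates[:budget])
        kept = [i for i in kept if i not in removed]
        if len(removed) < k:  # early stop: too few predictable instances remain
            break
    return kept
```

On a toy dataset where most instances are separable by a single shortcut feature, a loop like this would be expected to discard the shortcut-predictable instances first and retain the ones the simple model keeps getting wrong. A real setup would replace the nearest-centroid stand-in with whichever simple classifiers are chosen as the model family.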

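Experimental Results

Research Questions

RQ1: Can AFLite reliably remove dataset biases beyond explicitly known spurious artifacts?
RQ2: Do models trained on AFLite-filtered data generalize better to out-of-distribution tasks?
RQ3: How does AFLite affect in-distribution benchmark performance on NLP and vision datasets?
RQ4: Is AFLite robust across different feature representations and model families?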

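Key Findings

AFLite reduces measurable dataset biases, making the benchmarks more challenging for models while human performance remains relatively high.
Models trained on AFLite-filtered data generalize better to out-of-distribution tasks such as HANS, NLI Diagnostics, Stress Tests, and Adversarial NLI.
On SNLI, AFLite filtering sharply reduces in-distribution model accuracy (e.g., from 92% to 62%), while human performance remains high.
In NLP, AFLite lowers in-distribution accuracy for the RoBERTa, BERT, and ESIM+GloVe baselines, indicating that the removed instances were biased and therefore easy.
In vision, training on AFLite-filtered ImageNet data yields an absolute gain of about 2% on adversarial out-of-distribution sets, despite a large in-distribution drop on the standard validation set.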

This review was generated by AI and checked by human editors.