QUICK REVIEW

[Paper Review] Overfitting in adversarially robust deep learning

Leslie Rice, Eric Wong | arXiv (Cornell University) | Feb 26, 2020
Adversarial Robustness in Machine Learning | 72 references | 46 citations
TL;DR

The paper shows that robust overfitting is prevalent in adversarial training across multiple datasets and threat models, and that early stopping often matches or exceeds state-of-the-art adversarial training methods. Regularization and data augmentation offer limited improvements when training to convergence.

ABSTRACT

It is common practice in deep learning to use overparameterized networks and train for as long as possible; there are numerous studies that show, both theoretically and empirically, that such practices surprisingly do not unduly harm the generalization performance of the classifier. In this paper, we empirically study this phenomenon in the setting of adversarially trained deep networks, which are trained to minimize the loss under worst-case adversarial perturbations. We find that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training across multiple datasets (SVHN, CIFAR-10, CIFAR-100, and ImageNet) and perturbation models ($\ell_\infty$ and $\ell_2$). Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping. We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting. Finally, we study several classical and modern deep learning remedies for overfitting, including regularization and data augmentation, and find that no approach in isolation improves significantly upon the gains achieved by early stopping. All code for reproducing the experiments as well as pretrained model weights and training logs can be found at https://github.com/locuslab/robust_overfitting.
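The "loss under worst-case adversarial perturbations" in the abstract is usually written as a min-max objective. The form below is the standard robust optimization formulation, given as a reading aid rather than as a quotation from the paper; the symbols $f_\theta$ (the network), $\ell$ (the classification loss), $\delta$ (the perturbation), and $\epsilon$ (the perturbation budget) are notation assumed here.

```latex
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n}
  \max_{\|\delta\|_p \le \epsilon}
  \ell\bigl(f_\theta(x_i + \delta),\, y_i\bigr),
\qquad p \in \{2, \infty\}
```

The inner maximization is what makes training "adversarial"; the two values of $p$ correspond to the $\ell_2$ and $\ell_\infty$ perturbation models named in the abstract.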

Motivation & Objective

  • Demonstrate that overfitting occurs in adversarially trained networks and harms robust performance.
  • Characterize how learning rate schedules and model complexity affect robust overfitting.
  • Evaluate classical and modern remedies (regularization, data augmentation, semi-supervised learning) for mitigating robust overfitting.
  • Show that early stopping can match or outperform recent adversarial training improvements.

Proposed method

  • Empirical training of adversarially robust models across SVHN, CIFAR-10/100, and ImageNet.
  • Analysis of robust test error versus training progress under different learning rate schedules.
  • Comparison of vanilla PGD, TRADES, and other algorithms.
  • Ablation studies on regularization, data augmentation, and semi-supervised learning.
  • Use of hold-out validation for early stopping and verification of its effect on robust performance.
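The hold-out early stopping step in the list above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes robust error on a held-out validation set has already been recorded once per epoch, the function names are hypothetical, and the error values are made up for the example.

```python
from typing import List, Tuple


def select_early_stopping_epoch(val_robust_errors: List[float]) -> Tuple[int, float]:
    """Return (epoch index, error) for the lowest hold-out robust error."""
    if not val_robust_errors:
        raise ValueError("need at least one epoch of validation results")
    # min() returns the earliest epoch when several epochs tie.
    best_epoch = min(range(len(val_robust_errors)),
                     key=val_robust_errors.__getitem__)
    return best_epoch, val_robust_errors[best_epoch]


def robust_overfitting_gap(val_robust_errors: List[float]) -> float:
    """Final-epoch robust error minus the best robust error seen in training."""
    _, best_error = select_early_stopping_epoch(val_robust_errors)
    return val_robust_errors[-1] - best_error


# Hypothetical per-epoch robust validation errors (illustrative, not from the paper).
errors = [0.62, 0.55, 0.50, 0.47, 0.49, 0.52, 0.54]
best_epoch, best_error = select_early_stopping_epoch(errors)
gap = robust_overfitting_gap(errors)
print(f"keep checkpoint from epoch {best_epoch}, robust error {best_error:.2f}")
print(f"robust overfitting gap at the final epoch: {gap:.2f}")
```

In practice the checkpoint saved at the selected epoch would then be evaluated once on the test set, so that the test data plays no part in choosing when to stop.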

Experimental results

Research questions

  • RQ1: Does overfitting occur in adversarially trained networks, and how does it affect robust test performance?
  • RQ2: What learning rate schedules and model complexities influence robust overfitting?
  • RQ3: Can regularization, data augmentation, or semi-supervised methods mitigate robust overfitting, and how do they compare to early stopping?
  • RQ4: Is early stopping sufficient to match or exceed the robustness gains of newer adversarial training techniques?

Key findings

  • Robust overfitting is a dominant phenomenon in adversarial training: robust test error increases when training continues after the learning rate decay.
  • Early stopping can match or exceed state-of-the-art adversarial training gains; vanilla PGD with early stopping can reach robust performance comparable to TRADES on CIFAR-10.
  • Smoother learning rate schedules do not prevent robust overfitting; among the schedules compared, discrete piecewise decay reaches the best robust test performance at its best checkpoint during training.
  • Explicit regularization and standard data augmentation offer limited improvement over early stopping when models are trained to convergence; semi-supervised augmentation can help when combined with early stopping.
  • Increasing model capacity improves robust test performance despite robust overfitting, indicating double descent and robust overfitting are distinct phenomena.
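To make the contrast between a discrete piecewise decay and a smoother schedule concrete, here is a small sketch of both. The base learning rate, milestone epochs, decay factor, and epoch count are illustrative assumptions and are not taken from this review; the function names are hypothetical.

```python
import math
from typing import Sequence


def piecewise_decay_lr(epoch: int, base_lr: float = 0.1,
                       milestones: Sequence[int] = (100, 150),
                       gamma: float = 0.1) -> float:
    """Multiply base_lr by gamma once for every milestone already reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


def cosine_lr(epoch: int, total_epochs: int = 200, base_lr: float = 0.1) -> float:
    """Smooth alternative: half-cosine anneal from base_lr toward zero."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


# Print both schedules at a few epochs around the assumed milestones.
for epoch in (0, 99, 100, 150, 199):
    print(epoch, piecewise_decay_lr(epoch), round(cosine_lr(epoch), 5))
```

The piecewise schedule drops abruptly at each milestone, which is the point after which the findings above report robust test error starting to rise; the cosine schedule has no such discontinuity, yet according to the findings it does not avoid the overfitting.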


This review was created by AI and reviewed by human editors.