QUICK REVIEW

[Paper Digest] Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

Pu Zhao, Pin‐Yu Chen | arXiv (Cornell University) | Apr 30, 2020

Adversarial Robustness in Machine Learning | References: 39 | Cited by: 33

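One-Sentence Summary

The paper uses mode connectivity between trained models to repair backdoored or error-injected DNNs with limited clean data, and it analyzes the robustness-loss barrier along paths between models, relating robustness to the eigenvalues of the input Hessian.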

ABSTRACT

Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. When network models are tampered with backdoor or error-injection attacks, our results demonstrate that the path connection learned using limited amount of bonafide data can effectively mitigate adversarial effects while maintaining the original accuracy on clean data. Therefore, mode connectivity provides users with the power to repair backdoored or error-injected models. We also use mode connectivity to investigate the loss landscapes of regular and robust models against evasion attacks. Experiments show that there exists a barrier in adversarial robustness loss on the path connecting regular and adversarially-trained models. A high correlation is observed between the adversarial robustness loss and the largest eigenvalue of the input Hessian matrix, for which theoretical justifications are provided. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.

Research Motivation and Goals

Motivate the use of mode connectivity to study adversarial robustness in DNNs.
Develop a path-connection method to repair tampered models using limited bonafide data.
Investigate robustness loss landscapes on paths between regular and adversarially-trained models.
Provide theoretical and empirical links between robustness loss and the largest eigenvalue of the input Hessian.
Demonstrate effectiveness across architectures (VGG/ResNet) and datasets (CIFAR-10/SVHN).

Proposed Method

Define a high-accuracy path between two weight configurations w1 and w2 as a parametric curve φ_θ(t), with t in [0,1], whose endpoints are φ_θ(0) = w1 and φ_θ(1) = w2.
Find the path parameters θ by minimizing the expected loss over t in [0,1], using a simple Bezier or polygonal-chain parameterization of the curve.
Train paths using limited bonafide data to repair backdoored or error-injected models and compare to baselines.
Analyze input gradients and Hessians to explain path effectiveness and robustness correlations.
Examine standard vs robust loss landscapes along paths between regular and adversarially-trained models.
Evaluate robustness via PGD attacks and measure correlation with the largest input-Hessian eigenvalue.
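
The path construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a quadratic Bezier curve and a two-segment polygonal chain with a single trainable bend θ, and it replaces a network's training loss with a hypothetical two-dimensional toy loss (`ring_loss`) whose minima lie on the unit circle. All function and variable names are illustrative.

```python
import random

def bezier_point(w1, theta, w2, t):
    """Quadratic Bezier curve; w1 and w2 are fixed endpoints, theta is the bend."""
    a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    return [a * x + b * m + c * y for x, m, y in zip(w1, theta, w2)]

def polychain_point(w1, theta, w2, t):
    """Polygonal chain made of two line segments: w1 -> theta -> w2."""
    if t <= 0.5:
        s = 2 * t
        return [(1 - s) * x + s * m for x, m in zip(w1, theta)]
    s = 2 * t - 1
    return [(1 - s) * m + s * y for m, y in zip(theta, w2)]

def expected_path_loss(loss_fn, w1, theta, w2, curve=bezier_point,
                       n_samples=1000, seed=0):
    """Monte Carlo estimate of the loss averaged over t drawn from [0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += loss_fn(curve(w1, theta, w2, rng.random()))
    return total / n_samples

def ring_loss(w):
    """Hypothetical toy loss (an assumption): zero on the unit circle."""
    return (sum(x * x for x in w) - 1.0) ** 2

# Two "trained models" sitting at opposite low-loss points of the toy loss.
w1, w2 = [-1.0, 0.0], [1.0, 0.0]
theta_straight = [0.0, 0.0]  # curve degenerates to the straight segment
theta_bent = [0.0, 2.0]      # curve bends around the high-loss centre

straight = expected_path_loss(ring_loss, w1, theta_straight, w2)
bent = expected_path_loss(ring_loss, w1, theta_bent, w2)
print(f"straight path loss: {straight:.4f}, bent path loss: {bent:.4f}")
```

In this toy setting the straight segment crosses the high-loss region at the origin, while the bent curve stays near the low-loss ring, so its average loss is lower. In an actual repair setting, θ would have the same shape as the network weights and would be trained by gradient descent on the limited clean data rather than chosen by hand.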

Experimental Results

Research Questions

RQ1: Can mode connectivity recover accuracy and reduce backdoor vulnerability using limited clean data?
RQ2: Can path-based connections repair error-injected models and suppress injected faults?
RQ3: What does the robustness loss landscape look like along paths between regular and adversarially-trained models?
RQ4: Is there a quantitative relation between robustness loss and the largest eigenvalue of the input Hessian on the path?
RQ5: How do these phenomena generalize across architectures and datasets?

Key Findings

Path connections with limited bonafide data restore clean accuracy while greatly reducing backdoor attack success rates.
Path-based repair outperforms fine-tuning, training from scratch, pruning, and random perturbations in mitigating backdoors and errors.
Robustness loss exhibits a barrier on paths between regular and adversarially-trained models, supporting the no-free-lunch intuition in adversarial robustness.
A strong correlation is observed between robustness loss and the largest eigenvalue of the input Hessian along the path, as indicated by high Pearson correlation coefficient (PCC) values.
Experiments on CIFAR-10 (VGG) and SVHN (ResNet) show consistent behavior across architectures and data, including resilience to adaptive path-aware attacks.
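
The two quantities behind the correlation finding above can be computed along these lines. This is a sketch under stated assumptions: the digest does not say how the eigenvalue is obtained, so the example assumes power iteration with finite-difference Hessian-vector products, and it uses a hypothetical toy input gradient (`toy_input_grad`) in place of a network's input gradient. Power iteration returns the eigenvalue of largest magnitude, which matches the largest eigenvalue only when that eigenvalue is positive and dominant.

```python
import math

def hessian_vector_product(grad_fn, x, v, eps=1e-4):
    """Central finite difference: H v ~ (g(x + eps*v) - g(x - eps*v)) / (2*eps)."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]

def largest_input_hessian_eigenvalue(grad_fn, x, n_iters=100):
    """Power iteration on the Hessian of the loss with respect to the input x."""
    v = [1.0 / math.sqrt(len(x))] * len(x)  # unit-norm starting vector
    lam = 0.0
    for _ in range(n_iters):
        hv = hessian_vector_product(grad_fn, x, v)
        norm = math.sqrt(sum(h * h for h in hv))
        if norm == 0.0:
            return 0.0
        lam = sum(h * vi for h, vi in zip(hv, v))  # Rayleigh quotient v^T H v
        v = [h / norm for h in hv]
    return lam

def pearson(xs, ys):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (std_x * std_y)

def toy_input_grad(x):
    """Hypothetical input gradient of 0.5*(3*x0^2 + x1^2); its Hessian is diag(3, 1)."""
    return [3.0 * x[0], 1.0 * x[1]]

lam_max = largest_input_hessian_eigenvalue(toy_input_grad, [0.5, -0.2])
print(f"estimated largest eigenvalue: {lam_max:.6f}")
```

To reproduce the kind of analysis described in the finding, one would evaluate the robustness loss and `largest_input_hessian_eigenvalue` at a grid of t values along the path and pass the two resulting lists to `pearson`. With a real network, the input gradient would typically come from automatic differentiation rather than a closed-form function.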

This digest was generated by AI and reviewed by human editors.