[Paper Review] Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
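This paper uses mode connectivity between trained models to repair backdoored or error-injected DNNs with limited clean data, analyzes the robustness-loss barrier along paths between models, and presents a link between robustness and the eigenvalues of the input Hessian.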
Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. When network models are tampered with backdoor or error-injection attacks, our results demonstrate that the path connection learned using limited amount of bonafide data can effectively mitigate adversarial effects while maintaining the original accuracy on clean data. Therefore, mode connectivity provides users with the power to repair backdoored or error-injected models. We also use mode connectivity to investigate the loss landscapes of regular and robust models against evasion attacks. Experiments show that there exists a barrier in adversarial robustness loss on the path connecting regular and adversarially-trained models. A high correlation is observed between the adversarial robustness loss and the largest eigenvalue of the input Hessian matrix, for which theoretical justifications are provided. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
Research Motivation and Goals
- Motivate the use of mode connectivity to study adversarial robustness in DNNs.
- Develop a path-connection method to repair tampered models using limited bonafide data.
- Investigate robustness loss landscapes on paths between regular and adversarially-trained models.
- Provide theoretical and empirical links between robustness loss and the largest eigenvalue of the input Hessian.
- Demonstrate effectiveness across architectures (VGG/ResNet) and datasets (CIFAR-10/SVHN).
Proposed Method
- Define a high-accuracy path between two weight configurations w1 and w2 via a parametric curve φθ(t) with t in [0,1].
- Find θ by minimizing the expected loss over points sampled uniformly along the path (t ~ U(0,1)), with φθ parameterized as a Bezier curve or a polygonal chain.
- Train paths using limited bonafide data to repair backdoored or error-injected models and compare to baselines.
- Analyze input gradients and Hessians to explain path effectiveness and robustness correlations.
- Examine standard vs robust loss landscapes along paths between regular and adversarially-trained models.
- Evaluate robustness via PGD attacks and measure correlation with the largest input-Hessian eigenvalue.
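The path construction in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single bend point θ, a toy two-dimensional loss whose minima lie on the unit circle stands in for a DNN loss, and the names `bezier_point`, `polychain_point`, `expected_path_loss`, and `toy_loss` are hypothetical. It only evaluates the path objective for a fixed θ; in practice θ would be optimized with gradient descent on that objective.

```python
import random

def bezier_point(w1, theta, w2, t):
    """Quadratic Bezier curve: (1-t)^2 * w1 + 2t(1-t) * theta + t^2 * w2."""
    a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    return [a * x + b * y + c * z for x, y, z in zip(w1, theta, w2)]

def polychain_point(w1, theta, w2, t):
    """Polygonal chain with one bend at theta (w1 -> theta -> w2)."""
    if t <= 0.5:
        s = 2 * t
        return [(1 - s) * x + s * y for x, y in zip(w1, theta)]
    s = 2 * t - 1
    return [(1 - s) * y + s * z for y, z in zip(theta, w2)]

def expected_path_loss(loss_fn, w1, theta, w2, curve=bezier_point,
                       n_samples=1000, seed=0):
    """Monte Carlo estimate of E_{t ~ U(0,1)} loss(curve(t))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        t = rng.random()
        total += loss_fn(curve(w1, theta, w2, t))
    return total / n_samples

def toy_loss(w):
    """Toy stand-in for a DNN loss (assumption): zero on the unit circle."""
    return (sum(x * x for x in w) - 1.0) ** 2

# Two "trained models" (both have zero toy loss).
w1, w2 = [1.0, 0.0], [-1.0, 0.0]

# theta at the midpoint gives a straight segment that crosses a high-loss region;
# a bent path through theta = [0, 2] stays close to the low-loss circle.
straight = expected_path_loss(toy_loss, w1, [0.0, 0.0], w2)
bent = expected_path_loss(toy_loss, w1, [0.0, 2.0], w2)
print(f"straight path loss: {straight:.4f}, bent path loss: {bent:.4f}")
```

The endpoints are fixed by construction (t = 0 gives w1 and t = 1 gives w2), so only θ is free, which is what keeps the number of trainable parameters comparable to a single model per bend.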
Experimental Results
Research Questions
- RQ1: Can mode connectivity recover accuracy and reduce backdoor vulnerability using limited clean data?
- RQ2: Can path-based connections repair error-injected models and suppress injected faults?
- RQ3: What does the robustness loss landscape look like along paths between regular and adversarially-trained models?
- RQ4: Is there a quantitative relation between robustness loss and the largest eigenvalue of the input Hessian on the path?
- RQ5: How do these phenomena generalize across architectures and datasets?
Key Results
- Path connections with limited bonafide data restore clean accuracy while greatly reducing backdoor attack success rates.
- Path-based repair outperforms fine-tuning, training from scratch, pruning, and random perturbations in mitigating backdoors and errors.
- Robustness loss exhibits a barrier on paths between regular and adversarially-trained models, consistent with the "no free lunch" intuition in adversarial robustness.
- A strong correlation is observed between robustness loss and the largest eigenvalue of the input Hessian on the path (high PCC values).
- Experiments on CIFAR-10 (VGG) and SVHN (ResNet) show consistent behavior across architectures and data, including resilience to adaptive path-aware attacks.
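The correlation result above involves two quantities that are easy to illustrate: the largest eigenvalue of the Hessian of the loss with respect to the input, and the Pearson correlation coefficient (PCC). The sketch below is an assumption-laden illustration rather than the paper's procedure: it estimates the top eigenvalue by power iteration with finite-difference Hessian-vector products, uses a toy quadratic loss with a known Hessian instead of a DNN, and the names `hvp`, `top_eigenvalue`, `pearson`, and `toy_grad` are hypothetical. Power iteration converges to the eigenvalue of largest magnitude, which coincides with the largest eigenvalue only when that one dominates in absolute value.

```python
import math

def hvp(grad_fn, x, v, eps=1e-4):
    """Hessian-vector product via central differences of the input gradient."""
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    gp, gm = grad_fn(xp), grad_fn(xm)
    return [(a - b) / (2 * eps) for a, b in zip(gp, gm)]

def top_eigenvalue(grad_fn, x, n_iters=100):
    """Power iteration on the input Hessian at x; returns the Rayleigh quotient."""
    v = [1.0 / math.sqrt(len(x))] * len(x)  # unit-norm start vector
    lam = 0.0
    for _ in range(n_iters):
        hv = hvp(grad_fn, x, v)
        norm = math.sqrt(sum(h * h for h in hv))
        if norm == 0.0:
            return 0.0
        lam = sum(a * b for a, b in zip(v, hv))  # v^T H v with ||v|| = 1
        v = [h / norm for h in hv]
    return lam

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy loss f(x) = 0.5 * x^T A x with A = [[3, 1], [1, 2]] (assumption).
# Its Hessian is A, whose largest eigenvalue is (5 + sqrt(5)) / 2.
def toy_grad(x):
    return [3.0 * x[0] + 1.0 * x[1], 1.0 * x[0] + 2.0 * x[1]]

lam_max = top_eigenvalue(toy_grad, [0.3, -0.7])
print(f"estimated largest eigenvalue: {lam_max:.6f}")
```

To mirror the analysis described above, one would compute a robustness loss and a top eigenvalue at each sampled t along the path and pass the two resulting sequences to `pearson`. For a real network, the finite-difference product would typically be replaced by an automatic-differentiation Hessian-vector product.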
This review was generated by AI and reviewed by a human editor.