QUICK REVIEW

[Paper Review] Optimizing Mode Connectivity via Neuron Alignment

N. Joseph Tatro, Pin‐Yu Chen|arXiv (Cornell University)|Jan 1, 2020
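Adversarial Robustness in Machine Learning | Cited by 2
One-Sentence Summary

This paper proposes neuron alignment, a method that improves mode connectivity in the loss landscapes of deep neural networks by accounting for weight permutation symmetry. By aligning the distributions of intermediate activations between models, the method finds lower-loss planar curves and significantly reduces the robust loss barrier between adversarially robust models, improving generalization and robustness.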

ABSTRACT

The loss landscapes of deep neural networks are not well understood due to their high nonconvexity. Empirically, the local minima of these loss functions can be connected by a learned curve in model space, along which the loss remains nearly constant; a feature known as mode connectivity. Yet, current curve finding algorithms do not consider the influence of symmetry in the loss surface created by model weight permutations. We propose a more general framework to investigate the effect of symmetry on landscape connectivity by accounting for the weight permutations of the networks being connected. To approximate the optimal permutation, we introduce an inexpensive heuristic referred to as neuron alignment. Neuron alignment promotes similarity between the distribution of intermediate activations of models along the curve. We provide theoretical analysis establishing the benefit of alignment to mode connectivity based on this simple heuristic. We empirically verify that the permutation given by alignment is locally optimal via a proximal alternating minimization scheme. Empirically, optimizing the weight permutation is critical for efficiently learning a simple, planar, low-loss curve between networks that successfully generalizes. Our alignment method can significantly alleviate the recently identified robust loss barrier on the path connecting two adversarial robust models and find more robust and accurate models on the path.
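
The curve-finding problem described in the abstract can be written out roughly as follows. This is only a sketch: it assumes a quadratic Bezier parametrization of the curve, which is a common choice in the mode connectivity literature but is not stated in this summary. The symbols $\theta_1$, $\theta_2$ (the two trained networks), $\phi$ (the learned control point), $P$ (a function-preserving weight permutation from the set $\Pi$), and $\mathcal{L}$ (the training loss) are introduced here for illustration.

```latex
% Curve from \theta_1 to the permuted second model P\theta_2
r_\phi(t) = (1-t)^2\,\theta_1 + 2t(1-t)\,\phi + t^2\,P\theta_2,
\qquad t \in [0,1]

% Joint objective over the curve parameters and the permutation
\min_{\phi,\; P \in \Pi} \;
\mathbb{E}_{t \sim U[0,1]} \, \mathcal{L}\bigl(r_\phi(t)\bigr)
```

Fixing $P$ to the identity recovers curve finding without any treatment of permutation symmetry; neuron alignment is the heuristic used to choose $P$.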

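Motivation and Objectives

  • To address the limited understanding of the high-dimensional, nonconvex loss landscapes of deep neural networks.
  • To study how weight permutation symmetry affects mode connectivity in model space.
  • To develop a method that improves curve finding between models by aligning neuron activations along the path.
  • To reduce the robust loss barrier between adversarially trained models, enabling more stable and accurate interpolation.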

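Proposed Method

  • Neuron alignment is proposed as a heuristic for approximating the optimal weight permutation between two deep neural networks.
  • Aligning the intermediate activation distributions of the two models promotes structural similarity along the interpolation path.
  • A proximal alternating minimization scheme is used to verify empirically that the permutation given by alignment is locally optimal.
  • Planar, low-loss curves are built in model space by optimizing the permutation of network weights so that activations are aligned.
  • The method accounts for weight permutation symmetry in the loss landscape, which improves connectivity.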
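
The alignment step can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes a one-hidden-layer ReLU network without biases, uses the correlation between hidden activations as the similarity measure, and solves the resulting matching problem with SciPy's `linear_sum_assignment`. All names (`align_hidden_units`, `W1_a`, `shuffle`, and so on) are hypothetical, and model B is a synthetic shuffled copy of model A rather than an independently trained network.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def hidden(W1, X):
    """ReLU activations of the hidden layer."""
    return np.maximum(X @ W1, 0.0)


def forward(W1, W2, X):
    """Output of a one-hidden-layer MLP without biases."""
    return hidden(W1, X) @ W2


def align_hidden_units(acts_a, acts_b):
    """Match units of model B to units of model A by activation correlation.

    Returns `perm`, where unit perm[i] of B is paired with unit i of A.
    """
    n = acts_a.shape[1]
    # Cross-correlation block: rows index units of A, columns units of B.
    corr = np.corrcoef(acts_a, acts_b, rowvar=False)[:n, n:]
    corr = np.nan_to_num(corr)  # constant (e.g. dead) units give NaN
    # Assignment that maximizes the total correlation of matched pairs.
    _, perm = linear_sum_assignment(corr, maximize=True)
    return perm


rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 5, 8, 3
X = rng.normal(size=(256, d_in))

# Model A, and a toy model B: A's hidden units shuffled, plus small noise.
W1_a = rng.normal(size=(d_in, d_hidden))
W2_a = rng.normal(size=(d_hidden, d_out))
shuffle = rng.permutation(d_hidden)
W1_b = W1_a[:, shuffle] + 0.01 * rng.normal(size=(d_in, d_hidden))
W2_b = W2_a[shuffle, :] + 0.01 * rng.normal(size=(d_hidden, d_out))

perm = align_hidden_units(hidden(W1_a, X), hidden(W1_b, X))

# Permute B's hidden units: columns of the first layer, rows of the second.
W1_b_aligned = W1_b[:, perm]
W2_b_aligned = W2_b[perm, :]

# The permutation leaves B's function unchanged.
print(np.allclose(forward(W1_b, W2_b, X), forward(W1_b_aligned, W2_b_aligned, X)))
# In this toy setup it is also expected to bring B's weights closer to A's.
print(np.linalg.norm(W1_a - W1_b), np.linalg.norm(W1_a - W1_b_aligned))
```

For deeper networks the same matching would be applied layer by layer, with each layer's permutation also applied to the input side of the next layer so that the network's function is preserved.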

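Experimental Results

Research Questions

  • RQ1: How does weight permutation symmetry in deep neural networks affect the connectivity of the loss landscape?
  • RQ2: Does aligning the intermediate activation distributions of two models yield better, lower-loss interpolation paths?
  • RQ3: Can neuron alignment reduce the robust loss barrier between two adversarially robust models?
  • RQ4: Is the permutation obtained by neuron alignment locally optimal for minimizing the loss along the path?
  • RQ5: Do aligned paths generalize better, producing more robust and accurate models than standard interpolation?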

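Key Findings

  • Neuron alignment significantly reduces the robust loss barrier between two adversarially robust models, giving smoother, lower-loss interpolation.
  • A proximal alternating minimization scheme verifies empirically that the permutation obtained by neuron alignment is locally optimal.
  • The method finds planar, low-loss curves that generalize better than those from standard interpolation.
  • Aligning activation distributions improves both robustness and accuracy along the interpolation path.
  • The results indicate that accounting for weight permutation symmetry is critical for effective mode connectivity in deep learning.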
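
To make the notion of a loss barrier concrete, the sketch below measures it on a toy problem. Several assumptions are made that go beyond this summary: the path is a straight line rather than a learned curve, the loss is plain mean squared error rather than a robust (adversarial) loss, and the barrier is defined as the peak loss on the path minus the worse of the two endpoint losses. The helper names (`loss_barrier`, `model_a`, `shift`) are hypothetical, and the permutation is known by construction instead of being estimated.

```python
import numpy as np


def forward(params, X):
    """One-hidden-layer ReLU MLP without biases."""
    W1, W2 = params
    return np.maximum(X @ W1, 0.0) @ W2


def mse(params, X, Y):
    return float(np.mean((forward(params, X) - Y) ** 2))


def loss_barrier(params_a, params_b, X, Y, num_points=21):
    """Peak loss on the straight line between two models, minus the worse endpoint loss."""
    ts = np.linspace(0.0, 1.0, num_points)
    losses = [
        mse([(1 - t) * pa + t * pb for pa, pb in zip(params_a, params_b)], X, Y)
        for t in ts
    ]
    return max(losses) - max(losses[0], losses[-1])


rng = np.random.default_rng(0)
X = rng.normal(size=(128, 4))
model_a = [rng.normal(size=(4, 6)), rng.normal(size=(6, 2))]
Y = forward(model_a, X)  # model A defines the targets, so its loss is 0

# Model B: the same function as A, with hidden units cyclically shifted.
shift = np.roll(np.arange(6), 1)
model_b = [model_a[0][:, shift], model_a[1][shift, :]]

# Undo the shift (known here; in practice it would have to be estimated).
undo = np.argsort(shift)
model_b_aligned = [model_b[0][:, undo], model_b[1][undo, :]]

barrier_unaligned = loss_barrier(model_a, model_b, X, Y)
barrier_aligned = loss_barrier(model_a, model_b_aligned, X, Y)
print(barrier_unaligned, barrier_aligned)
```

Both endpoints compute the same function in each case, so any barrier on the unaligned path comes purely from the mismatch in hidden-unit ordering, which is the effect that alignment is meant to remove.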


This review was generated by AI and reviewed by human editors.