QUICK REVIEW

[Paper Review] Deeper Insights into Weight Sharing in Neural Architecture Search

Yuge Zhang, Zejun Lin|arXiv (Cornell University)|Jan 6, 2020

Advanced Neural Network Applications | References: 30 | Cited by: 35

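One-Sentence Summary

This paper empirically analyzes weight-sharing neural architecture search (NAS), showing that child-model rankings are highly unstable and have high variance, and that partial weight sharing can stabilize the rankings and improve performance.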

ABSTRACT

With the success of deep neural networks, Neural Architecture Search (NAS) as a way of automatic model design has attracted wide attention. As training every child model from scratch is very time-consuming, recent works leverage weight-sharing to speed up the model evaluation procedure. These approaches greatly reduce computation by maintaining a single copy of weights on the super-net and share the weights among every child model. However, weight-sharing has no theoretical guarantee and its impact has not been well studied before. In this paper, we conduct comprehensive experiments to reveal the impact of weight-sharing: (1) The best-performing models from different runs or even from consecutive epochs within the same run have significant variance; (2) Even with high variance, we can extract valuable information from training the super-net with shared weights; (3) The interference between child models is a main factor that induces high variance; (4) Properly reducing the degree of weight sharing could effectively reduce variance and improve performance.

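Research Motivation and Objectives

Evaluate how weight sharing affects the accuracy and stability of NAS across multiple runs and across training epochs.
Quantify the variance of, and the interference among, child models in a shared-weight super-net.
Identify the mechanisms that drive the instability and explore strategies to reduce their impact.
Propose and evaluate partial weight-sharing schemes to improve NAS performance.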

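Proposed Method

Construct a down-scaled NAS search space containing 64 possible child models, so that a comparison against ground truth is feasible.
Train a single super-net with shared weights and evaluate all child models on the validation set.
Compare shared-weight performance with the ground-truth performance obtained by training each child model independently.
Use Kendall’s tau to measure ranking stability (S-Tau, GT-Tau), and Top-n-Rank to evaluate the ability to find top models.
Study the source of variance by analyzing the interference between child models across mini-batches during super-net training.
Run experiments on partial weight sharing, including group sharing and prefix sharing, to reduce variance and study the effect on rankings.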
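The ranking metrics named above can be illustrated with a minimal sketch. It assumes that S-Tau is Kendall's tau between the rankings produced by two super-net instances, that GT-Tau is Kendall's tau between a super-net ranking and the ground-truth ranking, and that Top-n-Rank is the best ground-truth rank found among the n child models the super-net ranks highest. These readings are assumptions, since this summary does not define the metrics. The function names and all accuracy values are hypothetical and made up for illustration.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equally long score lists.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def top_n_rank(supernet_acc, ground_truth_acc, n):
    """Best ground-truth rank (1 = best) among the n child models
    that the super-net ranks highest (assumed reading of Top-n-Rank)."""
    models = range(len(supernet_acc))
    by_supernet = sorted(models, key=lambda m: supernet_acc[m], reverse=True)
    by_truth = sorted(models, key=lambda m: ground_truth_acc[m], reverse=True)
    gt_rank = {model: rank for rank, model in enumerate(by_truth, start=1)}
    return min(gt_rank[m] for m in by_supernet[:n])


# Made-up accuracies for five hypothetical child models.
ground_truth = [0.90, 0.85, 0.80, 0.75, 0.70]  # each child trained on its own
run_a = [0.60, 0.65, 0.55, 0.50, 0.45]         # evaluated with shared weights, run A
run_b = [0.58, 0.52, 0.61, 0.47, 0.49]         # evaluated with shared weights, run B

gt_tau = kendall_tau(run_a, ground_truth)  # agreement with ground truth
s_tau = kendall_tau(run_a, run_b)          # agreement between two runs
tnr = top_n_rank(run_a, ground_truth, n=1)
```

In this toy data, run A agrees with the ground truth on 9 of 10 pairs, while the two runs agree with each other on only 6 of 10, which is the kind of gap between runs that a stability metric is meant to expose.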

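Experimental Results

Research Questions

RQ1: How stable is the ranking of child models under weight sharing across multiple runs or training epochs?
RQ2: Compared with retraining from scratch, to what extent do the super-net's shared weights help select high-performing child models?
RQ3: Under weight sharing, what are the main sources of variance and interference among child models?
RQ4: Can reducing the degree of weight sharing (partial sharing) improve stability and bring rankings closer to ground-truth performance?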

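Key Findings

Under weight sharing, the ranking of child models is highly unstable across multiple runs and across training epochs.
Shared-weight training approximates but does not match the ground-truth ranking, and the variation is substantial.
Interference between jointly trained child models is the main cause of ranking instability.
Partial weight-sharing strategies (grouping, similarity-based grouping, prefix sharing) reduce variance and bring rankings closer to the ground truth, but come with different computational trade-offs.
Fine-tuning child models from a super-net snapshot significantly improves ranking quality, even with limited additional training.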
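To make the notion of "degree of weight sharing" in the findings above more concrete, the sketch below counts how many distinct weight copies exist under different sharing schemes. Everything in it is an assumption for illustration: the search space (3 layers with 4 candidate operations each, giving 64 child models), the operation names, the grouping rule, and the interpretation of prefix sharing (a layer's weights are shared only by child models that agree on every choice up to and including that layer). The paper's actual search space and sharing rules may differ.

```python
from itertools import product

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # hypothetical op names
NUM_LAYERS = 3  # hypothetical: 4 ops ** 3 layers = 64 child models

# Each child model is a tuple of op indices, one per layer.
child_models = list(product(range(len(OPS)), repeat=NUM_LAYERS))


def weight_key(child, layer, scheme, group_of=None):
    """Identify which weight copy a child's layer uses under a sharing scheme."""
    if scheme == "full":    # one copy per (layer, op), shared by all children
        return (layer, child[layer])
    if scheme == "group":   # one copy per (group, layer, op)
        return (group_of[child], layer, child[layer])
    if scheme == "prefix":  # shared only among children with the same prefix
        return (layer, child[: layer + 1])
    if scheme == "none":    # every child trains its own weights
        return (layer, child)
    raise ValueError(f"unknown scheme: {scheme}")


def count_weight_copies(scheme, group_of=None):
    """Number of distinct weight copies needed to cover all child models."""
    keys = {
        weight_key(child, layer, scheme, group_of)
        for child in child_models
        for layer in range(NUM_LAYERS)
    }
    return len(keys)


# Arbitrary illustrative grouping into 4 groups (not the paper's rule).
group_of = {child: sum(child) % 4 for child in child_models}

copies = {
    scheme: count_weight_copies(scheme, group_of)
    for scheme in ("full", "group", "prefix", "none")
}
```

In this toy setting, fewer copies mean more child models update the same weights, which is where interference can arise. More copies reduce that overlap at the cost of more parameters to train, in line with the computational trade-offs noted in the findings.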


This review was generated by AI and reviewed by a human editor.