QUICK REVIEW

[Paper Review] NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search

Arber Zela, Julien Siems|arXiv (Cornell University)|Jan 28, 2020

Advanced Neural Network Applications | References: 29 | Cited by: 61

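One-Sentence Summary

Introduces NAS-Bench-1Shot1, a benchmarking framework that reuses NAS-Bench-101 to evaluate one-shot NAS methods cheaply, enabling analysis of search trajectories, hyperparameter sensitivity, and comparison with blackbox optimizers.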

ABSTRACT

One-shot neural architecture search (NAS) has played a crucial role in making NAS methods computationally feasible in practice. Nevertheless, there is still a lack of understanding on how these weight-sharing algorithms exactly work due to the many factors controlling the dynamics of the process. In order to allow a scientific study of these components, we introduce a general framework for one-shot NAS that can be instantiated to many recently-introduced variants and introduce a general benchmarking framework that draws on the recent large-scale tabular benchmark NAS-Bench-101 for cheap anytime evaluations of one-shot NAS methods. To showcase the framework, we compare several state-of-the-art one-shot NAS methods, examine how sensitive they are to their hyperparameters and how they can be improved by tuning their hyperparameters, and compare their performance to that of blackbox optimizers for NAS-Bench-101.

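Motivation and Objectives

Provide a general one-shot NAS framework that can be instantiated to many recent variants.
Enable cheap, anytime evaluation of one-shot NAS methods by reusing the computation already stored in NAS-Bench-101.
Compare state-of-the-art one-shot NAS methods, and assess their hyperparameter sensitivity and how much tuning can improve them.
Provide a unified codebase for reproducing and fairly comparing one-shot NAS components against discrete NAS optimizers.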

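Proposed Method

Establish a mapping between the NAS-Bench-101 search space representation and the one-shot NAS representation, so that the discrete architectures found by one-shot methods can be queried in the benchmark.
Construct three NAS-Bench-1Shot1 search spaces of varying complexity from NAS-Bench-101 (search spaces 1–3), the largest of which contains 363,648 architectures.
Provide a general framework, in a single codebase, for implementing and evaluating one-shot NAS methods (such as DARTS, GDAS, PC-DARTS, ENAS, and Random WS).
Track the architecture weights during the search phase and query NAS-Bench-101 for test/validation error, so that the full search trajectory can be analyzed without retraining each architecture.
Examine the correlation between the one-shot model's ranking of architectures and their true NAS-Bench-101 performance.
Assess robustness to hyperparameters and demonstrate tunability through hyperparameter optimization (BOHB).
Release an open-source implementation to support reproducibility and fair benchmarking.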
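The tracking step can be illustrated with a minimal sketch. It assumes a simplified discretization (an argmax over each node's operation weights, ignoring edge selection) and uses a toy dictionary in place of the real NAS-Bench-101 table. The names `OPS`, `TOY_TABLE`, `SNAPSHOTS`, `discretize`, and `anytime_trajectory` are hypothetical and do not come from the paper's codebase, and the error values are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate operations per node (illustrative names only).
OPS = ["conv3x3", "conv1x1", "maxpool3x3"]

Architecture = Tuple[str, ...]


def discretize(op_weights: List[List[float]]) -> Architecture:
    """Pick the highest-weighted operation for every node (argmax)."""
    return tuple(
        OPS[max(range(len(weights)), key=weights.__getitem__)]
        for weights in op_weights
    )


def anytime_trajectory(
    snapshots: List[List[List[float]]],
    table: Dict[Architecture, float],
) -> List[float]:
    """Look up the tabulated test error of the discretized architecture
    at every recorded epoch, so nothing has to be retrained."""
    return [table[discretize(weights)] for weights in snapshots]


# Toy stand-in for the tabular benchmark: architecture -> test error.
# The numbers are invented for this example.
TOY_TABLE: Dict[Architecture, float] = {
    ("conv3x3", "conv3x3"): 0.060,
    ("conv1x1", "conv3x3"): 0.068,
    ("conv3x3", "maxpool3x3"): 0.075,
}

# Architecture weights recorded after three search epochs (two nodes each).
SNAPSHOTS = [
    [[0.34, 0.33, 0.33], [0.30, 0.30, 0.40]],
    [[0.30, 0.45, 0.25], [0.50, 0.20, 0.30]],
    [[0.60, 0.30, 0.10], [0.70, 0.10, 0.20]],
]

trajectory = anytime_trajectory(SNAPSHOTS, TOY_TABLE)
```

Because every lookup is a dictionary access, the full trajectory costs almost nothing compared with the search itself, which is the point of reusing a tabular benchmark.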

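Experimental Results

Research Questions

RQ1: How do different one-shot NAS methods perform in an anytime evaluation framework, when the architectures they select are tracked against NAS-Bench-101 evaluations?
RQ2: How well does one-shot validation performance correlate with true NAS-Bench-101 test performance across the different search spaces?
RQ3: How sensitive are one-shot NAS methods to their hyperparameters, and can tuning improve their performance relative to discrete NAS optimizers?
RQ4: Can a unified framework compare the various one-shot NAS variants fairly on a single codebase, free of confounding factors?
RQ5: To what extent can hyperparameter optimization reduce overfitting and improve architecture quality across search spaces?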
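RQ2 comes down to a rank correlation between two lists of scores for the same architectures. A minimal sketch follows, assuming Spearman's rank correlation as the statistic (a common choice; this summary does not say which statistic the paper uses). The function names and the sample error values are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented errors for four architectures: one-shot validation vs. tabulated test.
one_shot_val_error = [0.12, 0.10, 0.15, 0.11]
benchmark_test_error = [0.070, 0.065, 0.062, 0.080]

rho = spearman(one_shot_val_error, benchmark_test_error)
```

A value of `rho` near 1 would mean the one-shot model orders architectures the same way the benchmark does; a value near 0 means the one-shot ranking carries little information about true performance.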

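Key Findings

GDAS delivers the best anytime performance on the benchmark, but its temperature annealing can cause it to converge prematurely to a suboptimal local minimum.
DARTS and PC-DARTS show decreasing one-shot validation error, but this does not always coincide with decreasing NAS-Bench-101 test error.
For DARTS, PC-DARTS, GDAS, and Random WS, there is almost no correlation between the one-shot validation ranking and NAS-Bench-101 test performance; ENAS shows some correlation in certain search spaces.
Hyperparameter tuning with BOHB improves results significantly: the best configurations outperform the default settings and sometimes even the discrete NAS optimizers.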
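Tuning reduced search time (for example, the tuned DARTS example went from 45 GPU days to 1 day on 16 GPUs) and yielded configurations that are robust across the search spaces.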
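Random WS and ENAS generally perform poorly, owing to the weak correlation between one-shot estimates and true architecture performance.
The framework enables efficient, fair comparison and reproduction of results, and highlights the potential of hyperparameter-tuned one-shot NAS.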
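The premature-convergence finding for GDAS is easier to picture with a small sketch of how a temperature sharpens a softmax over operation weights. This is a simplification: it omits the Gumbel noise that GDAS adds when sampling, and the linear schedule and its endpoint values (`tau_start`, `tau_end`) are assumptions chosen for illustration rather than settings taken from the paper.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], tau: float) -> List[float]:
    """Softmax of logits / tau; a smaller tau yields a sharper distribution."""
    scaled = [value / tau for value in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def linear_anneal(step: int, total_steps: int,
                  tau_start: float = 10.0, tau_end: float = 0.1) -> float:
    """Linearly interpolate the temperature from tau_start to tau_end."""
    fraction = step / max(total_steps - 1, 1)
    return tau_start + fraction * (tau_end - tau_start)


# Hypothetical architecture logits for three candidate operations.
logits = [1.0, 0.5, 0.0]

early = softmax_with_temperature(logits, linear_anneal(0, 5))  # tau = 10
late = softmax_with_temperature(logits, linear_anneal(4, 5))   # tau close to 0.1
```

At the high starting temperature the three operations receive nearly equal probability, while at the low final temperature almost all of the mass sits on whichever operation currently has the largest logit. Once the distribution is that sharp, the other operations are rarely explored, which is one way a search can lock in a suboptimal choice.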


This review was generated by AI and reviewed by a human editor.