QUICK REVIEW

[Paper Review] ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

Yibo Yang, Hongyang Li|arXiv (Cornell University)|Oct 13, 2020

Advanced Neural Network Applications | References: 60 | Cited by: 36

ONE-SENTENCE SUMMARY

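ISTA-NAS formulates NAS as a sparse coding problem: it runs the differentiable search in a compressed space and uses ISTA to recover a sparse architecture. It comes in a two-stage and a one-stage variant, and improves both search efficiency and the consistency between search and evaluation.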

ABSTRACT

Neural architecture search (NAS) aims to produce the optimal sparse solution from a high-dimensional space spanned by all candidate connections. Current gradient-based NAS methods commonly ignore the constraint of sparsity in the search phase, but project the optimized solution onto a sparse one by post-processing. As a result, the dense super-net for search is inefficient to train and has a gap with the projected architecture for evaluation. In this paper, we formulate neural architecture search as a sparse coding problem. We perform the differentiable search on a compressed lower-dimensional space that has the same validation loss as the original sparse solution space, and recover an architecture by solving the sparse coding problem. The differentiable search and architecture recovery are optimized in an alternate manner. By doing so, our network for search at each update satisfies the sparsity constraint and is efficient to train. In order to also eliminate the depth and width gap between the network in search and the target-net in evaluation, we further propose a method to search and evaluate in one stage under the target-net settings. When training finishes, architecture variables are absorbed into network weights. Thus we get the searched architecture and optimized parameters in a single run. In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search. Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
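
The absorption step mentioned at the end of the abstract can be illustrated with a small numerical check. This is a minimal sketch, not the authors' code: it assumes the architecture variable `z` acts as a scalar multiplier on the output of an inference-mode batch normalization layer, and the helper name `batchnorm_inference` and all values are hypothetical. Under that assumption, scaling the layer output by `z` should be equivalent to scaling its affine parameters by `z`.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization applied over the last (channel) axis."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Toy tensors; every value here is an illustrative assumption.
rng = np.random.default_rng(0)
channels = 4
x = rng.standard_normal((8, channels))
gamma = rng.standard_normal(channels)
beta = rng.standard_normal(channels)
mean = rng.standard_normal(channels)
var = rng.random(channels) + 0.5   # keep running variances positive
z = 0.37                           # hypothetical coefficient of one kept connection

# Option 1: keep z as a separate multiplier on the layer output.
scaled_output = z * batchnorm_inference(x, gamma, beta, mean, var)

# Option 2: fold z into the affine parameters, so no separate variable remains.
absorbed_output = batchnorm_inference(x, z * gamma, z * beta, mean, var)

outputs_match = np.allclose(scaled_output, absorbed_output)
```

If the two outputs agree, the trained network can be kept as an ordinary model with no architecture variables left over, which is consistent with obtaining the architecture and its parameters in a single run.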

MOTIVATION AND OBJECTIVES

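Narrow the gap between the dense super-net used during NAS search and the sparse target-net used for evaluation.
Formulate NAS as a sparse coding problem so that sparsity is enforced during the search.
Develop a differentiable search in a compressed space, with the architecture recovered by ISTA.
Introduce a one-stage ISTA-NAS that unifies search and evaluation under the target-net settings.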

PROPOSED METHOD

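NAS is formulated as a sparse coding problem in which each intermediate node selects a sparse set of connections.
The differentiable search runs in a compressed space defined by a measurement matrix A, and ISTA recovers the sparse architecture z.
A RIP-based (restricted isometry property) argument establishes the equivalence between searching in the compressed space and searching over architectures in the original space.
Two-stage ISTA-NAS alternates between recovering z with ISTA and updating the network weights and architecture parameters on the resulting sparse subgraph.
One-stage ISTA-NAS completes search and evaluation in a single run by absorbing the final architecture parameters into the BatchNorm parameters and network weights.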
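
The recovery step in the method list above can be sketched with a generic ISTA solver. The example is a minimal illustration rather than the authors' implementation: it assumes the standard LASSO objective 0.5 * ||A z - b||^2 + lam * ||z||_1, a random Gaussian measurement matrix, and toy sizes. The function names `soft_threshold` and `ista`, and the values of `n`, `m`, `k`, `lam`, and `n_iter`, are all hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, b, lam=0.01, n_iter=500):
    """Approximately minimize 0.5 * ||A z - b||_2^2 + lam * ||z||_1 with ISTA."""
    # Step size 1/L, where L (squared spectral norm of A) is the Lipschitz
    # constant of the gradient of the quadratic term.
    L = np.linalg.norm(A, 2) ** 2
    z = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ z - b)                   # gradient of the quadratic term
        z = soft_threshold(z - grad / L, lam / L)  # gradient step, then shrinkage
    return z

# Toy problem; sizes and values are illustrative assumptions.
rng = np.random.default_rng(0)
n, m, k = 14, 8, 2                 # candidates, compressed dimension, non-zeros
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
z_true = np.zeros(n)
z_true[[3, 9]] = [1.0, 0.7]        # sparse "architecture" with two active entries
b = A @ z_true                     # compressed representation

z_hat = ista(A, b)
top_k = np.sort(np.argsort(-np.abs(z_hat))[:k])  # indices of the k largest entries
```

Keeping the `k` largest-magnitude entries of `z_hat` gives a candidate set of connections. On a toy problem like this the selected indices would typically match the true support, but that is not guaranteed for every random `A`.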

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

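RQ1: Can NAS be formulated as a sparse coding problem so that sparsity is enforced during the search?
RQ2: Does an architecture found by differentiable search in a compressed space and recovered with ISTA agree with evaluation better than one obtained by post-hoc sparsification?
RQ3: Can one-stage ISTA-NAS, run under the target-net settings, eliminate the gap between search and evaluation?
RQ4: Compared with state-of-the-art gradient-based NAS methods, what efficiency and performance gains does ISTA-NAS deliver on CIFAR-10 and ImageNet?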

KEY FINDINGS

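Two-stage ISTA-NAS reaches a 2.54% test error on CIFAR-10 at a search cost of 0.05 GPU-day.
One-stage ISTA-NAS unifies search and evaluation in a single run and reaches a 2.36% test error on CIFAR-10.
On ImageNet, one-stage ISTA-NAS directly reaches 24.0% top-1 and 7.1% top-5 error, with a gradient-based search cost below that of many two-stage methods.
On CIFAR-10, two-stage ISTA-NAS cuts the search cost to 0.03–0.05 GPU-day depending on batch size, and the correlation between search and evaluation improves.
On CIFAR-10, the one-stage method shows the best Kendall tau correlation among the compared methods.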
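
For readers unfamiliar with the metric in the last finding, the sketch below shows one way a Kendall tau score between search-time and evaluation-time accuracies can be computed. It is an illustration only: the function `kendall_tau` implements the tie-free tau-a variant, and the accuracy lists are made-up numbers, not results from the paper.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs. Assumes no ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1    # both lists order this pair the same way
        elif sign < 0:
            discordant += 1    # the two lists disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical accuracies (%) of five architectures; NOT taken from the paper.
search_acc = [88.1, 89.4, 90.2, 90.9, 91.5]   # measured during search
eval_acc = [96.9, 97.2, 97.1, 97.4, 97.6]     # measured after full evaluation

tau = kendall_tau(search_acc, eval_acc)       # 9 concordant, 1 discordant pair
```

A value near 1 means the ranking seen during search carries over to evaluation, while a value near 0 means the search-time ranking says little about final performance.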

This review was generated by AI and checked by a human editor.