QUICK REVIEW

[Paper Digest] Neural Architecture Search with Reinforce and Masked Attention Autoregressive Density Estimators.

Chepuri Shri Krishna, Ashish Gupta|arXiv (Cornell University)|Jun 1, 2020

Machine Learning and Data Classification | Cited by 2

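One-Sentence Summary

This paper proposes a reinforcement-learning approach to neural architecture search (NAS) that uses a masked-attention autoregressive model as the policy network, enabling more efficient search on NASBench-101. By training an ensemble of policy networks with shared parameters, each conditioned on a different autoregressive factorization order, the method outperforms most algorithms in the literature, including earlier policy-gradient methods and random search.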

ABSTRACT

Neural Architecture Search has become a focus of the Machine Learning community. Techniques span Bayesian optimization with Gaussian priors, evolutionary learning, reinforcement learning based on policy gradient, Q-learning, and Monte-Carlo tree search. In this paper, we present a reinforcement learning algorithm based on policy gradient that uses an attention-based autoregressive model to design the policy network. We demonstrate how performance can be further improved by training an ensemble of policy networks with shared parameters, each network conditioned on a different autoregressive factorization order. On the NASBench-101 search space, it outperforms most algorithms in the literature, including random search. In particular, it outperforms RL methods based on policy gradients that use alternate architectures to specify the policy network, underscoring the importance of using masked attention in this setting. We have adhered to guidelines listed in Lindauer & Hutter (2019) while designing experiments and reporting results.

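Motivation and Objectives

Improve the performance of reinforcement-learning-based neural architecture search (NAS) by using a more expressive policy network.
Address the limitations of earlier policy-gradient NAS methods whose policies are modeled with non-autoregressive or weakly structured architectures.
Investigate how the autoregressive factorization order affects policy performance in NAS.
Show that masked attention improves the policy's generalization and the efficiency of the search.
Validate the method on a standardized benchmark, following the guidelines of Lindauer & Hutter (2019).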

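Proposed Method

NAS is cast as policy-gradient reinforcement learning, in which the policy network selects architecture operations step by step.
The policy is modeled by an autoregressive model with masked self-attention, so that each choice attends to the choices already made and architectures are generated sequentially.
An ensemble of policy networks is built, each conditioned on a different autoregressive factorization order over the architecture search space.
Parameters are shared across ensemble members to improve sample efficiency and reduce overfitting.
The policy is optimized with the REINFORCE algorithm, using the performance of trained models recorded in NASBench-101 as the reward signal.
Masked attention enforces causality, which guarantees valid autoregressive generation of architecture components.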
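
To make the role of the mask and of the factorization order concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions and not the authors' implementation: the function names, the toy dimensions, the single attention head, and the choice to build queries from slot embeddings (`pos`) while keys and values come from embeddings of the chosen operations (`x`) are all hypothetical. The point it shows is that one set of shared weights can serve several generation orders, with only the mask changing.

```python
import numpy as np

def order_mask(order):
    """mask[i, j] is True when slot j is decided strictly before slot i."""
    n = len(order)
    rank = np.empty(n, dtype=int)
    rank[np.asarray(order)] = np.arange(n)  # rank[p] = step at which slot p is decided
    return rank[None, :] < rank[:, None]

def masked_attention(pos, x, wq, wk, wv, mask):
    """Single-head attention in which slot i only sees slots decided before it.

    Assumption: queries come from slot embeddings `pos`, keys and values from
    the embeddings `x` of the operations already chosen, so the context for
    slot i never depends on the choice at slot i itself.
    """
    q, k, v = pos @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    scores = np.where(mask, scores, -1e9)          # hide slots not yet decided
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    weights *= mask.any(axis=1, keepdims=True)     # first slot has no context
    return weights @ v

rng = np.random.default_rng(0)
n_slots, dim = 5, 4
pos = rng.normal(size=(n_slots, dim))              # hypothetical slot embeddings
x = rng.normal(size=(n_slots, dim))                # hypothetical choice embeddings
wq, wk, wv = (rng.normal(size=(dim, dim)) for _ in range(3))

# Three ensemble members: the weights are shared, only the order (mask) differs.
orders = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0], [2, 0, 4, 1, 3]]
contexts = [masked_attention(pos, x, wq, wk, wv, order_mask(o)) for o in orders]
```

In a full policy, each row of the returned context would feed a softmax over candidate operations for that slot. Changing the embedding of the slot decided last leaves every context row unchanged, which is the causality property the mask is there to enforce.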

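Experimental Results

Research Questions

RQ1: Does an attention-based autoregressive model improve the representational capacity of the policy in RL-based NAS, compared with standard feedforward or RNN-based policies?
RQ2: Does conditioning the policy on multiple autoregressive factorization orders improve search performance and robustness?
RQ3: How does the method compare with random search and other RL-based NAS methods on the NASBench-101 benchmark?
RQ4: To what extent does sharing parameters across the ensemble of policies improve sample efficiency and generalization?
RQ5: Is masked attention essential for effective autoregressive modeling in the context of architecture search?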

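Key Findings

On the NASBench-101 search space, the proposed method outperforms random search, which points to the benefit of a structured policy network.
Among RL-based NAS algorithms on NASBench-101, the method achieves state-of-the-art performance, outperforming other policy-gradient methods that use different policy architectures.
An ensemble of policies based on different autoregressive factorization orders improves performance, indicating that the factorization order materially affects search quality.
Adding masked attention to the policy network models sequential architecture choices better than baselines without attention.
Sharing parameters across ensemble members improves training stability and sample efficiency without loss of performance.
The method follows best practices for NAS evaluation, and results are reported according to the standardized guidelines of Lindauer & Hutter (2019).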
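
The policy-gradient search behind these findings can be sketched with a toy REINFORCE loop. Everything in the sketch is an assumption made for illustration: `toy_reward` is a hypothetical stand-in for a benchmark accuracy lookup, the policy is an independent categorical distribution per slot rather than the masked-attention model, and the moving-average baseline and hyperparameters are arbitrary choices that the digest does not specify.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def toy_reward(arch):
    """Hypothetical reward: fraction of slots that picked operation 2."""
    return float(np.mean(arch == 2))

def reinforce(n_slots=5, n_ops=3, steps=300, batch=16, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.zeros((n_slots, n_ops))   # simplified per-slot policy parameters
    baseline = 0.0                        # moving-average baseline (assumption)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a batch of architectures, one operation per slot.
        archs = np.array([[rng.choice(n_ops, p=probs[s]) for s in range(n_slots)]
                          for _ in range(batch)])
        rewards = np.array([toy_reward(a) for a in archs])
        grad = np.zeros_like(logits)
        for arch, advantage in zip(archs, rewards - baseline):
            one_hot = np.eye(n_ops)[arch]
            # Gradient of log pi(arch) w.r.t. the logits is one_hot - probs.
            grad += advantage * (one_hot - probs)
        logits += lr * grad / batch       # gradient ascent on expected reward
        baseline = 0.9 * baseline + 0.1 * rewards.mean()
    return softmax(logits)

final_probs = reinforce()
print(final_probs.round(2))
```

With this toy reward the policy should shift probability mass toward operation 2 in every slot. Replacing `toy_reward` with a lookup into a tabular benchmark and the per-slot logits with an autoregressive network gives the general shape of the search loop the digest describes.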

This digest was generated by AI and reviewed by a human editor.