Research Motivation and Objectives

Motivate the study of neural architecture search (NAS) as a logical next step in automating architecture engineering for deep learning.
Categorize NAS methods by search space, search strategy, and performance estimation strategy.
Survey historical and current approaches (reinforcement learning (RL), evolution, Bayesian optimization (BO), gradient-based methods) and techniques for making performance estimation more efficient.

Proposed Method

Review and categorize NAS literature along three dimensions: search space, search strategy, and performance estimation.
Describe common search spaces (chain-structured, multi-branch with skip connections, and cell-based spaces) and discuss their implications for optimization.
Summarize diverse search strategies (reinforcement learning, neuro-evolution, Bayesian optimization, gradient-based methods, random/hill-climbing) and their pros/cons.
Explain performance estimation strategies that reduce computational cost, including low-fidelity proxies, learning curve extrapolation, weight inheritance via network morphisms, and one-shot models, and discuss the biases each introduces.
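The three dimensions above can be made concrete with a minimal sketch. It assumes a toy chain-structured search space, plain random search as the search strategy, and a made-up scoring function standing in for training and validation. The names OPERATIONS, WIDTHS, MAX_DEPTH, sample_architecture, estimate_performance, and random_search are hypothetical and do not come from the survey; the scoring rule is illustrative only.

```python
import random

# Hypothetical chain-structured search space: each layer picks one operation
# and one width. These choices are illustrative, not taken from the survey.
OPERATIONS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]
WIDTHS = [16, 32, 64]
MAX_DEPTH = 6


def sample_architecture(rng):
    """Search space: draw a chain of (operation, width) layers."""
    depth = rng.randint(1, MAX_DEPTH)
    return [(rng.choice(OPERATIONS), rng.choice(WIDTHS)) for _ in range(depth)]


def estimate_performance(arch):
    """Performance estimation: toy stand-in for training and validating.

    A real estimator would train the network (fully, or with a cheaper proxy)
    and return validation accuracy. This made-up score rewards convolutions
    and penalizes total width so the sketch runs without any ML library.
    """
    conv_layers = sum(1 for op, _ in arch if op.startswith("conv"))
    total_width = sum(width for _, width in arch)
    return conv_layers - 0.01 * total_width


def random_search(num_trials, seed=0):
    """Search strategy: sample independently and keep the best candidate."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = estimate_performance(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


best_arch, best_score = random_search(num_trials=200)
print(best_arch, best_score)
```

Swapping random_search for an evolutionary or RL-based loop, or estimate_performance for a low-fidelity proxy, changes one dimension while leaving the other two untouched, which is the separation the categorization relies on.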

RQ1: What are the main components and design choices in NAS (search space, search strategy, performance estimation) and how do they interact?
RQ2: How do different NAS approaches (RL, evolution, BO, gradient-based) compare in terms of efficiency and final performance?
RQ3: What techniques exist to speed up NAS performance estimation, and what biases do they introduce?
RQ4: What future directions could broaden NAS beyond image classification and into multi-task, multi-objective, or robust architectures?

Architectures found by NAS have outperformed manually designed ones on tasks such as image classification, object detection, and semantic segmentation.
Cell-based search spaces reduce search space size and improve transferability across datasets, enabling faster search and reuse of learned designs.
One-shot and weight-sharing NAS cut the computational cost of search to a few GPU days, but weight sharing can bias performance estimates, so how well one-shot rankings correlate with stand-alone performance requires careful analysis.
Gradient-based NAS (continuous relaxations) enables efficient optimization but depends on relaxation quality and may require discretization post-optimization.
RL and evolution reach competitive final performance but differ in anytime performance and in the size of the architectures they find; both generally outperform random search in the reported studies.
Lower-fidelity performance estimates speed up search but can distort the relative ranking of architectures when the gap between the cheap approximation and the full evaluation is large.
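One way to check whether a cheap estimate preserves architecture ranking is to compute a rank correlation between low-fidelity and full-fidelity scores. The sketch below uses Kendall's tau, chosen here for illustration rather than prescribed by the survey. All accuracy values are made up, and kendall_tau is a hypothetical helper that assumes no tied scores.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall rank correlation (tau-a); assumes no tied scores."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1   # both estimates order the pair the same way
        elif sign < 0:
            discordant += 1   # the two estimates disagree on the pair
    num_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / num_pairs


# Made-up validation accuracies for five candidate architectures.
full_fidelity = [0.91, 0.93, 0.90, 0.95, 0.92]  # e.g. full training
small_gap = [0.80, 0.83, 0.79, 0.85, 0.82]      # cheap proxy, ordering kept
large_gap = [0.62, 0.58, 0.61, 0.60, 0.64]      # cheap proxy, ordering scrambled

tau_small = kendall_tau(full_fidelity, small_gap)
tau_large = kendall_tau(full_fidelity, large_gap)

# Index of the architecture each estimate would select.
best_full = max(range(len(full_fidelity)), key=full_fidelity.__getitem__)
best_large = max(range(len(large_gap)), key=large_gap.__getitem__)
print(tau_small, tau_large, best_full, best_large)
```

The absolute accuracy of a proxy matters less than its ordering: small_gap underestimates every architecture yet ranks them identically to full training, while large_gap would lead the search to select a different architecture.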