QUICK REVIEW

[Paper Review] Learning to Learn without Gradient Descent by Gradient Descent

Yutian Chen, Matthew W. Hoffman | arXiv (Cornell University) | Nov 11, 2016

Higher Education Learning Practices | Cited by 162

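One-Sentence Summary

This paper trains recurrent neural network optimizers on synthetic functions to obtain fast, transferable black-box optimization. Across a range of settings, including hyper-parameter tuning and control tasks, the learned optimizers match Bayesian optimization methods and in some cases outperform them.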

ABSTRACT

We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade-off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.

Motivation and Objectives

Motivate fast, general-purpose black-box optimization beyond Bayesian methods.
Develop meta-learned optimizers that learn exploration-exploitation trade-offs.
Demonstrate transfer of learned optimizers to derivative-free problems across domains.
Show computational gains over standard BO packages within the training horizon.

Proposed Method

Model a black-box optimizer as an RNN with shared parameters that updates its hidden state and proposes the next query point.
Train the RNN by backpropagating through time using a loss that sums objective values over a finite horizon (L_sum).
Experiment with losses that encourage exploration, such as expected improvement (EI) and observed improvement (OI).
Sample the training functions from Gaussian process priors, so that the training objective is differentiable with respect to the query points.
Extend the framework to parallel evaluations by augmenting inputs with a feedback flag and simulating out-of-order completions.
Compare learned optimizers to Spearmint, TPE, and SMAC, and evaluate on transfer tasks including GP bandits, control, and hyper-parameter tuning.
Use differentiable architectures (DNC and LSTM) for the optimizer and assess their speed at test time.
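
The query loop and the summed loss described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: a vanilla tanh cell stands in for the LSTM or DNC, the weights are random rather than trained (training by backpropagation through time would require an autodiff library), the toy quadratic objective replaces GP samples, and every name (rnn_step, init_params, unroll, summed_loss, observed_improvement_loss) is hypothetical.

```python
import math
import random


def init_params(hidden, dim, rng):
    """Random weights for a vanilla RNN cell (hypothetical stand-in for LSTM/DNC)."""
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]
    bias = [rng.gauss(0.0, 0.3) for _ in range(hidden)]
    # (hidden-to-hidden, input-to-hidden, bias, hidden-to-query)
    return mat(hidden, hidden), mat(hidden, dim + 1), bias, mat(dim, hidden)


def rnn_step(params, h, x_prev, y_prev):
    """Update the hidden state from the last (query, observation) and propose a query."""
    w_h, w_in, bias, w_out = params
    inp = x_prev + [y_prev]
    new_h = []
    for j in range(len(h)):
        s = bias[j]
        s += sum(w_h[j][k] * h[k] for k in range(len(h)))
        s += sum(w_in[j][k] * inp[k] for k in range(len(inp)))
        new_h.append(math.tanh(s))
    x_next = [sum(w_out[d][j] * new_h[j] for j in range(len(new_h)))
              for d in range(len(x_prev))]
    return new_h, x_next


def unroll(params, f, dim, horizon, hidden):
    """Run the optimizer for `horizon` steps on the black-box function f."""
    h = [0.0] * hidden
    x, y = [0.0] * dim, 0.0  # dummy initial input (an assumption of this sketch)
    xs, ys = [], []
    for _ in range(horizon):
        h, x = rnn_step(params, h, x, y)
        y = f(x)  # only function values are observed, never gradients of f
        xs.append(x)
        ys.append(y)
    return xs, ys


def summed_loss(ys):
    """L_sum for one trajectory: the sum of observed objective values."""
    return sum(ys)


def observed_improvement_loss(ys):
    """OI-style loss for one trajectory: sum of min(y_t - best_so_far, 0).

    The first query has no earlier observation, so this sketch starts at the second.
    """
    best, total = ys[0], 0.0
    for y in ys[1:]:
        total += min(y - best, 0.0)
        best = min(best, y)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    params = init_params(hidden=8, dim=2, rng=rng)
    xs, ys = unroll(params, lambda x: sum(v * v for v in x), dim=2, horizon=10, hidden=8)
    print("L_sum:", summed_loss(ys), "OI:", observed_improvement_loss(ys))
```

In a trainable version, the same unrolled computation would be written in an autodiff framework so that the loss can be differentiated with respect to the cell weights; the structure of the loop stays the same.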

Experimental Results

Research Questions

RQ1: Can a learned RNN-based optimizer, trained on simple synthetic functions, effectively optimize a wide range of black-box functions?
RQ2: Do learned optimizers transfer to derivative-free optimization domains beyond their training distribution?
RQ3: How do different meta-learning losses (sum, EI, OI) influence exploration-exploitation balance and performance?
RQ4: What are the computational advantages of learned optimizers relative to established Bayesian optimization packages?
RQ5: Can parallel evaluation be integrated into the learned optimization framework without performance loss?
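
For reference, the three losses named in RQ3 can be written out as below. This is a sketch in assumed notation, which this review does not itself define: \theta denotes the optimizer's parameters, T the training horizon, x_t the query proposed at step t, y_t the corresponding observation, and f a function drawn from the training distribution. Minimization of f is assumed.

```latex
\begin{align}
L_{\text{sum}}(\theta) &= \mathbb{E}_{f,\,y_{1:T-1}}\Big[\sum_{t=1}^{T} f(x_t)\Big] \\
L_{\text{EI}}(\theta)  &= -\,\mathbb{E}_{f,\,y_{1:T-1}}\Big[\sum_{t=1}^{T} \mathrm{EI}\big(x_t \mid y_{1:t-1}\big)\Big] \\
L_{\text{OI}}(\theta)  &= \mathbb{E}_{f,\,y_{1:T-1}}\Big[\sum_{t=1}^{T} \min\Big\{ f(x_t) - \min_{i<t} f(x_i),\; 0 \Big\}\Big]
\end{align}
```

L_sum rewards low function values at every step. L_EI rewards queries with high expected improvement given the observations so far. L_OI only rewards a query when it improves on the best value seen so far, which leaves the optimizer free to explore at steps where no improvement is made.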

Key Findings

Spearmint	TPE	SMAC	DNC	LSTM
1239	16.3	16.3	0.1	0.02
1238	16.2	16.2	0.1	0.02
1524	19.3	19.3	0.1	0.02
2768	20.8	20.8	0.1	0.02

Learned RNN optimizers transfer to GP bandits, control objectives, global optimization benchmarks, and ML hyper-parameter tuning.
DNC-based optimizers trained with the EI or OI loss outperform DNCs trained with the summed-observation loss (L_sum) and are competitive with, and often faster than, Spearmint, SMAC, and TPE within a 100-step horizon.
At test time, the learned optimizers are orders of magnitude faster than traditional BO methods (runtime improvements on the order of 10^4× in the reported cases).
In higher input dimensions, the learned optimizers outperform the baseline BO methods within the training horizon.
Parallel proposal schemes maintain performance while offering substantial speedups in hyper-parameter tuning scenarios.
The approach achieves competitive results on standard benchmarks and simple control problems, often matching engineered optimizers.
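
The "orders of magnitude faster" finding can be checked against the table above with a few lines of arithmetic. The sketch below assumes that each table row reports the runtimes of the five methods on one benchmark, all in the same unit; the source gives neither row labels nor units, so this reading is an assumption. The ratios themselves are unit-free. The names runtimes, speedups, and max_speedup are hypothetical.

```python
# Values copied from the table above; one list entry per table row.
runtimes = {
    "Spearmint": [1239, 1238, 1524, 2768],
    "TPE": [16.3, 16.2, 19.3, 20.8],
    "SMAC": [16.3, 16.2, 19.3, 20.8],
    "DNC": [0.1, 0.1, 0.1, 0.1],
    "LSTM": [0.02, 0.02, 0.02, 0.02],
}


def speedups(baseline, learned):
    """Row-by-row ratio of baseline runtime to learned-optimizer runtime."""
    return [b / l for b, l in zip(runtimes[baseline], runtimes[learned])]


def max_speedup(baseline, learned):
    """Largest ratio over all table rows."""
    return max(speedups(baseline, learned))


if __name__ == "__main__":
    for baseline in ("Spearmint", "TPE", "SMAC"):
        for learned in ("DNC", "LSTM"):
            ratios = speedups(baseline, learned)
            print(f"{baseline} / {learned}: min {min(ratios):.0f}x, max {max(ratios):.0f}x")
```

Under this reading, the Spearmint-to-DNC ratio falls between roughly 1.2 × 10^4 and 2.8 × 10^4, which matches the order of magnitude quoted in the findings, while the ratios for TPE and SMAC are in the hundreds against the DNC and around a thousand against the LSTM.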

This review was generated by AI and reviewed by a human editor.