QUICK REVIEW

[Paper Review] Understanding and correcting pathologies in the training of learned optimizers

Luke Metz, Niru Maheswaranathan | arXiv (Cornell University) | Oct 24, 2018

Advanced Neural Network Applications | 47 citations

TL;DR

The paper introduces a variational outer objective with a combined reparameterization and evolutionary strategies gradient estimator to stabilize training of learned optimizers, enabling faster wall-clock optimization of CNNs than hand-designed methods in targeted task distributions.

ABSTRACT

Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially for specific problems. However, learned optimizers are notoriously difficult to train and have yet to demonstrate wall-clock speedups over hand-designed optimizers, and thus are rarely used in practice. Typically, learned optimizers are trained by truncated backpropagation through an unrolled optimization process resulting in gradients that are either strongly biased (for short truncations) or have exploding norm (for long truncations). In this work we propose a training scheme which overcomes both of these difficulties, by dynamically weighting two unbiased gradient estimators for a variational loss on optimizer performance, allowing us to train neural networks to perform optimization of a specific task faster than tuned first-order methods. We demonstrate these results on problems where our learned optimizer trains convolutional networks faster in wall-clock time compared to tuned first-order methods and with an improvement in test loss.

Motivation & Objective

Motivate learning optimization algorithms tailored to specific tasks rather than general hand-designed optimizers.
Address training pathologies in learned optimizers, including gradient bias from truncated backpropagation and exploding gradients.
Propose a stable outer-objective based on a smoothed loss and two unbiased gradient estimators.
Demonstrate that a learned optimizer can train convolutional networks faster in wall-clock time and improve test loss on target tasks.

Proposed method

Define inner- and outer-loop optimization for learning optimizers.
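Analyze the two failure modes of truncated backpropagation through time (TBPTT): strongly biased gradients for short truncations and exploding gradient norms for long unrolls.
Introduce a variational (smoothed) outer objective $\mathcal{L}(\theta) = \mathbb{E}_{\tilde{\theta} \sim \mathcal{N}(\theta, \sigma^2 I)}\left[L(\tilde{\theta})\right]$, where $L$ is the original outer loss and $\theta$ are the learned optimizer's parameters.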
Develop two unbiased gradient estimators: g_rp (reparameterization) and g_es (evolutionary strategies).
Merge gradients using inverse-variance weighting to stabilize updates (g_merged).
Apply antithetic sampling to reduce gradient variance.
Use a curriculum of increasing inner unrolls to improve stability and performance.
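
The estimator steps above can be illustrated with a minimal sketch on a scalar toy problem. Everything here is an assumption made for illustration rather than the paper's setup: the oscillatory `loss`, the constants `FREQ` and `AMP`, and the helper names are hypothetical, and the real method works on optimizer parameters with gradients obtained by unrolling inner training. The sketch draws antithetic pairs theta + sigma*eps and theta - sigma*eps, forms a reparameterization sample and an evolutionary strategies sample from each pair, and merges the two means with weights proportional to the inverse of their empirical sample variances.

```python
import math
import random
from statistics import fmean, variance

# Hypothetical toy outer loss: a quadratic plus a high-frequency ripple.
FREQ, AMP = 40.0, 0.5


def loss(x):
    return x * x + AMP * math.sin(FREQ * x)


def grad_loss(x):
    return 2.0 * x + AMP * FREQ * math.cos(FREQ * x)


def smoothed_grad_exact(theta, sigma):
    # Closed-form gradient of E[loss(theta + sigma * eps)], eps ~ N(0, 1),
    # available only because the toy loss is simple.
    damping = math.exp(-0.5 * (FREQ * sigma) ** 2)
    return 2.0 * theta + AMP * FREQ * math.cos(FREQ * theta) * damping


def estimate_gradients(theta, sigma, n_pairs, rng):
    """Return per-pair samples of g_rp and g_es using antithetic sampling."""
    rp, es = [], []
    for _ in range(n_pairs):
        eps = rng.gauss(0.0, 1.0)
        plus, minus = theta + sigma * eps, theta - sigma * eps
        # Reparameterization: average the loss gradient at both perturbed points.
        rp.append(0.5 * (grad_loss(plus) + grad_loss(minus)))
        # Evolutionary strategies: only loss values are used, no gradients.
        es.append((loss(plus) - loss(minus)) * eps / (2.0 * sigma))
    return rp, es


def merge_inverse_variance(rp, es, tiny=1e-12):
    """Combine the two mean estimates, weighting each by 1 / sample variance."""
    w_rp = 1.0 / (variance(rp) + tiny)
    w_es = 1.0 / (variance(es) + tiny)
    return (w_rp * fmean(rp) + w_es * fmean(es)) / (w_rp + w_es)


rng = random.Random(0)
theta, sigma = 1.0, 0.1
rp, es = estimate_gradients(theta, sigma, 20000, rng)
g_merged = merge_inverse_variance(rp, es)
print("g_rp    :", fmean(rp))
print("g_es    :", fmean(es))
print("g_merged:", g_merged)
print("exact   :", smoothed_grad_exact(theta, sigma))
```

On this toy loss the ripple makes the reparameterization samples the noisier of the two, so the merge leans toward the evolutionary strategies estimate. For a smoother loss the weighting would shift the other way, which is the point of letting the empirical variances set the weights instead of fixing them by hand.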

Experimental results

Research questions

RQ1: Can a variational outer objective with combined gradient estimators stabilize training of learned optimizers under long unrolls?
RQ2: Do learned optimizers trained with this approach outperform tuned hand-designed optimizers in wall-clock time on CNN inner tasks?
RQ3: Does outer-training against validation loss improve generalization to unseen task distributions?
RQ4: What is the impact of gradient estimator choice and unroll schedule on optimizer performance?

Key findings

A combined gradient estimator with a variational outer objective enables longer unrolls without exploding gradients.
Learned optimizers trained with this method outperform tuned hand-designed optimizers such as SGD+Momentum, RMSProp, and Adam in wall-clock time on the target CNN tasks.
Optimizers trained against the validation objective achieve faster convergence and lower test loss on held-out tasks.
The learned optimizer generalizes to out-of-distribution tasks such as MNIST with different architectures and input sizes.
Ablation studies show the gradient estimator and increasing unroll curriculum are critical for performance.

This review was created by AI and reviewed by human editors.