QUICK REVIEW

[Paper Review] Maximizing acquisition functions for Bayesian optimization

James T. Wilson, Frank Hutter | arXiv (Cornell University) | May 25, 2018

Advanced Multi-Objective Optimization Algorithms | 67 citations

TL;DR

The paper analyzes how to maximize acquisition functions efficiently in Bayesian optimization. It shows that Monte Carlo estimates of acquisition functions are amenable to gradient-based optimization, and that a family of myopic maximal acquisitions can be maximized greedily with near-optimal guarantees.

ABSTRACT

Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes' decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose properties not only facilitate but justify use of greedy approaches for their maximization.

Motivation & Objective

Clarify how acquisition functions estimated via Monte Carlo can be differentiated and optimized.
Show that a common family of acquisition functions is submodular and amenable to greedy maximization with near-optimal guarantees.
Provide practical methods and empirical evidence improving BO performance in parallel and high-dimensional settings.
Explore extensions to discrete events and continuous-to-discrete relaxations for differentiability.

Proposed method

Demonstrate that MC-acquisition functions are differentiable via the reparameterization trick and sample path derivatives.
Provide a reparameterized Gaussian integral formulation for acquisitions, enabling gradient-based optimization.
Show that common MM (myopic maximal) acquisition functions are submodular, guaranteeing near-optimal greedy maximization.
Present an incremental (marginal) view of MM acquisitions, linking EI, PI, SR, and UCB to an EI-like marginal gain.
Offer continuous relaxations to handle discrete events within gradient-based optimization.
Extend UCB to a differentiable MC form suitable for parallel optimization.
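
The reparameterization described above can be illustrated with a minimal NumPy sketch. It assumes the GP posterior at the q query points is already summarized by a mean vector `mu` and a covariance matrix `cov`; the function names (`draw_base_samples`, `mc_qei`, `mc_qucb`) are hypothetical and not taken from the paper's code. The gradient is shown only with respect to `mu`; propagating it back to the query locations through the GP is omitted. The UCB form follows the reparameterized integrand max(mu + sqrt(beta*pi/2)|gamma|) with gamma ~ N(0, cov), as this sketch reads the paper.

```python
import math
import numpy as np


def draw_base_samples(n_samples, q, seed=0):
    """Fixed standard-normal base samples z ~ N(0, I), shape (n_samples, q)."""
    return np.random.default_rng(seed).standard_normal((n_samples, q))


def mc_qei(mu, cov, best, z):
    """Reparameterized MC estimate of q-EI and its sample-path gradient w.r.t. mu.

    y = mu + L z with L L^T = cov, so each draw is a deterministic function of (mu, L).
    """
    L = np.linalg.cholesky(cov)
    y = mu + z @ L.T                       # (n_samples, q) posterior draws
    top = y.argmax(axis=1)                 # index of the largest outcome per draw
    improvement = np.maximum(y.max(axis=1) - best, 0.0)
    # d/d mu_j of max(0, max_k y_k - best) is 1 when y_j is the improving maximum.
    active = improvement > 0.0
    grad_mu = np.zeros_like(mu, dtype=float)
    np.add.at(grad_mu, top[active], 1.0)
    return improvement.mean(), grad_mu / len(z)


def mc_qucb(mu, cov, beta, z):
    """Reparameterized MC estimate of q-UCB: E[max_j(mu_j + sqrt(beta*pi/2)|gamma_j|)]."""
    L = np.linalg.cholesky(cov)
    gamma = z @ L.T                        # zero-mean draws, gamma ~ N(0, cov)
    return (mu + math.sqrt(beta * math.pi / 2.0) * np.abs(gamma)).max(axis=1).mean()
```

For q = 1 these estimates should approach the closed forms: EI equals sigma * (u * Phi(u) + phi(u)) with u = (mu - best) / sigma, its derivative with respect to mu equals Phi(u), and UCB equals mu + sqrt(beta) * sigma. Holding `z` fixed across calls makes the estimate a deterministic function of its inputs, which is what lets a standard gradient-based optimizer be applied.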

Experimental results

Research questions

RQ1: Can Monte Carlo estimated acquisition functions be differentiated and optimized efficiently via gradient methods?
RQ2: Are myopic maximal acquisition functions submodular, and do greedy methods yield near-optimal query sets in parallel Bayesian optimization?
RQ3: How can discrete event criteria be embedded in differentiable acquisition optimization through continuous relaxations?
RQ4: What practical gains in BO performance arise from gradient-based MC optimization and incremental MM formulations in parallel and high-dimensional settings?
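
To make RQ3 concrete, here is a small sketch of one possible continuous relaxation, using Probability of Improvement as the discrete event. The hard indicator 1[max_j y_j > best] has zero gradient almost everywhere, so the sketch swaps it for a sigmoid with a temperature parameter. The names `mc_qpi` and `tau` are hypothetical, and the posterior draws `y` are assumed to be supplied by the caller.

```python
import numpy as np


def sigmoid(x):
    """Numerically stable logistic function."""
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def mc_qpi(y, best, tau=None):
    """MC estimate of parallel PI from posterior draws y of shape (n_samples, q).

    tau=None uses the hard indicator 1[max_j y_j > best] (zero gradient a.e.);
    tau > 0 swaps it for sigmoid((max_j y_j - best) / tau), which is smooth in y.
    """
    gap = y.max(axis=1) - best
    if tau is None:
        return (gap > 0.0).mean()
    return sigmoid(gap / tau).mean()
```

As `tau` shrinks toward zero the relaxed estimate approaches the hard one, at the cost of gradients that concentrate on draws lying close to the threshold, so the temperature trades bias against gradient signal.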

Key findings

Gradients of MC acquisition functions can be estimated without bias under mild conditions, which makes gradient-based optimization applicable.
A reparameterization enables gradient-based optimization of q-EI and related acquisitions in parallel settings.
MM acquisition functions (EI, PI, SR, UCB) are shown to be submodular, ensuring near-optimal greedy maximization with guarantees.
Incremental marginal gains provide computational advantages and better scalability with increasing joint acquisition dimension.
Continuous relaxations allow differentiability for discrete events, enabling gradient-based optimization.
Empirical results show gains in synthetic and black-box tasks across varying dimensionality and parallelism.
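
The greedy, incremental view in the findings above can be sketched as follows. For simplicity the sketch assumes a finite candidate set whose joint posterior draws `y` are already available (one column per candidate); that restriction is an assumption of this example, since the paper's setting is continuous. The function names are hypothetical. Note how the marginal gain of each candidate takes an EI-like form, with the per-draw running maximum playing the role of the incumbent.

```python
import numpy as np


def qei_of_subset(y, subset, best):
    """MC q-EI of the candidates in `subset`, from joint draws y (n_samples, n_candidates)."""
    subset = list(subset)
    if not subset:
        return 0.0
    return np.maximum(y[:, subset].max(axis=1) - best, 0.0).mean()


def greedy_select(y, best, q):
    """Pick q candidates one at a time, each maximizing the marginal gain in MC q-EI."""
    chosen = []
    running_max = np.full(y.shape[0], float(best))   # incumbent per draw
    for _ in range(q):
        # Marginal gain of each candidate given the points already chosen.
        gains = np.maximum(y - running_max[:, None], 0.0).mean(axis=0)
        if chosen:
            gains[chosen] = -np.inf                  # do not pick a point twice
        j = int(gains.argmax())
        chosen.append(j)
        running_max = np.maximum(running_max, y[:, j])
    return chosen
```

Because the sample-average objective here is monotone and submodular in the chosen set, the classical greedy bound applies: the greedy value is at least (1 - 1/e) times the best achievable value over all subsets of size q. Each greedy step costs one pass over the candidates instead of a search over all q-subsets.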

This review was created by AI and reviewed by human editors.