QUICK REVIEW

[Paper Review] Language-Based Bayesian Optimization Research Assistant (BORA)

A. Cisse, Xenophon Evangelopoulos | ArXiv.org | Jan 27, 2025

AI-based Problem Solving and Planning | Cited by 3

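One-Sentence Summary

BORA combines Bayesian optimization with a large language model to inject domain knowledge, provide real-time commentary, and adaptively steer the search in high-dimensional, costly experiments; it outperforms baselines on synthetic and real-world tasks.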

ABSTRACT

Many important scientific problems involve multivariate optimization coupled with slow and laborious experimental measurements. These complex, high-dimensional searches can be defined by non-convex optimization landscapes that resemble needle-in-a-haystack surfaces, leading to entrapment in local minima. Contextualizing optimizers with human domain knowledge is a powerful approach to guide searches to localized fruitful regions. However, this approach is susceptible to human confirmation bias and it is also challenging for domain experts to keep track of the rapidly expanding scientific literature. Here, we propose the use of Large Language Models (LLMs) for contextualizing Bayesian optimization (BO) via a hybrid optimization framework that intelligently and economically blends stochastic inference with domain knowledge-based insights from the LLM, which is used to suggest new, better-performing areas of the search space for exploration. Our method fosters user engagement by offering real-time commentary on the optimization progress, explaining the reasoning behind the search strategies. We validate the effectiveness of our approach on synthetic benchmarks with up to 15 independent variables and demonstrate the ability of LLMs to reason in four real-world experimental tasks where context-aware suggestions boost optimization performance substantially.

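Research Motivation and Objectives

Address slow initialization and entrapment in local minima in high-dimensional black-box optimization.
Use an LLM to bring domain knowledge into exploration, avoiding over-reliance on human input.
Devise adaptive strategies that regulate LLM involvement and balance exploration against exploitation.
Engage users during optimization through interpretable LLM commentary and hypothesis generation.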

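Proposed Method

Couples a Gaussian-process-based Bayesian optimization learner with an LLM to form a hybrid optimization framework.
Repurposes the LLM to produce structured, JSON-style comments containing insights and hypotheses about the current optimization state.
Uses an Experiment Card to initialize the LLM with the problem background and objective, from which it generates initial hypotheses and samples.
Implements an adaptive heuristic policy that switches among vanilla BO, LLM-guided sampling, and LLM-guided selection, based on GP uncertainty and plateau/dynamics indicators.
Introduces a trust mechanism that tracks the effectiveness of LLM interventions through a rolling trust score and adjusts the plateau duration m accordingly.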
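
The switching policy and trust mechanism can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names (PolicyState, choose_action), the thresholds, the window length, and the specific rule "high trust shortens m, low trust lengthens m" are all assumptions made for illustration. Only the three modes, the use of GP uncertainty and a plateau indicator, and the existence of a rolling trust score tied to m come from the summary above.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    VANILLA_BO = "vanilla_bo"
    LLM_SAMPLING = "llm_guided_sampling"
    LLM_SELECTION = "llm_guided_selection"


@dataclass
class PolicyState:
    # All defaults below are hypothetical values, not taken from the paper.
    m: int = 5          # plateau duration: stalled iterations tolerated before the LLM steps in
    m_min: int = 2
    m_max: int = 10
    window: int = 4     # number of recent LLM interventions used for the rolling trust score
    outcomes: deque = field(default_factory=deque)

    def trust(self) -> float:
        """Fraction of recent LLM interventions that improved the best value."""
        if not self.outcomes:
            return 0.5  # neutral prior before any intervention (assumption)
        return sum(self.outcomes) / len(self.outcomes)

    def record_intervention(self, improved: bool) -> None:
        """Update the rolling trust score, then adapt the plateau duration m."""
        self.outcomes.append(1.0 if improved else 0.0)
        while len(self.outcomes) > self.window:
            self.outcomes.popleft()
        if self.trust() >= 0.5:
            self.m = max(self.m_min, self.m - 1)  # trusted LLM: intervene sooner
        else:
            self.m = min(self.m_max, self.m + 1)  # distrusted LLM: wait longer


def choose_action(state: PolicyState, steps_since_improvement: int,
                  mean_gp_std: float, high_uncertainty: float = 0.5) -> Action:
    """Pick the next mode from a plateau indicator and the GP's uncertainty."""
    if steps_since_improvement < state.m:
        return Action.VANILLA_BO       # still making progress
    if mean_gp_std > high_uncertainty:
        return Action.LLM_SAMPLING     # surrogate is unsure: let the LLM propose points
    return Action.LLM_SELECTION        # surrogate is confident: let the LLM pick among BO candidates
```

In this sketch, choose_action would be called once per iteration, and record_intervention only after an iteration in which the LLM was involved.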

Figure 1: BORA framework. Icons from Flaticon (2025).

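Experimental Results

Research Questions

RQ1: Does BORA improve exploration efficiency and convergence speed over static-prior or vanilla BO baselines on synthetic and real-world tasks?
RQ2: Do LLM-generated hypotheses and comments significantly speed up the discovery of high-objective regions in high-dimensional spaces?
RQ3: How does the adaptive policy regulate LLM involvement to balance cost, performance, and stability during optimization?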

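Key Findings

BORA significantly outperforms baselines on high-dimensional problems (10D Levy, 15D Ackley) and on several real-world tasks.
LLM-guided initial sampling helps identify promising regions, while dynamic hypotheses improve exploration and mitigate stagnation.
In the hydrogen production experiment, cumulative regret is 47% lower than with ColaBO, with faster convergence in a complex search space.
Dynamic BO/LLM collaboration with adaptive trust and plateau adjustment delivers sustained performance gains beyond what static expert priors achieve.
Ablation studies show that the hybrid approach outperforms both an LLM-only approach and static knowledge priors.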
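
To make the cumulative-regret comparison concrete, here is a small Python sketch of how such a relative reduction can be computed. It assumes the standard definition for a maximization task (the running sum of the gap between the known optimum and each observed value); the paper may use a variant. The function names and the toy numbers are illustrative only and are not results from the paper.

```python
from itertools import accumulate


def cumulative_regret(observed, optimum):
    """Running sum of (optimum - observed value) for a maximization task."""
    return list(accumulate(optimum - y for y in observed))


def relative_reduction(method_regret, baseline_regret):
    """Fraction by which the method's final regret is below the baseline's."""
    return 1.0 - method_regret / baseline_regret


# Toy values, invented purely to show the arithmetic.
optimum = 10.0
method_run = [4.0, 7.0, 9.0, 10.0]     # objective values observed per iteration
baseline_run = [2.0, 4.0, 6.0, 8.0]

method_curve = cumulative_regret(method_run, optimum)      # [6.0, 9.0, 10.0, 10.0]
baseline_curve = cumulative_regret(baseline_run, optimum)  # [8.0, 14.0, 18.0, 20.0]
reduction = relative_reduction(method_curve[-1], baseline_curve[-1])  # 0.5
```

In the toy case the method's final cumulative regret is half the baseline's, which would be reported as a 50% reduction.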

Figure 2: The LLM agent commenting and refining its hypotheses on the Sugar Beet Production experiment (complete comment in the SM). This experiment is detailed in Section 4.1.


This review was generated by AI and checked by a human editor.