QUICK REVIEW

[Paper Digest] InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models

Lichang Chen, Jiuhai Chen|arXiv (Cornell University)|Jun 5, 2023

Topic Modeling | Cited by 11

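One-Sentence Summary

InstructZero optimizes a low-dimensional soft prompt for an open-source LLM so that it generates instructions for a black-box LLM, using Bayesian optimization to improve zero-shot task performance without backpropagating through the API model.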

ABSTRACT

Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM. On each iteration of the proposed method, which we call InstructZero, a soft prompt is converted into an instruction using the open-source LLM, which is then submitted to the black-box LLM for zero-shot evaluation, and the performance is sent to Bayesian optimization to produce new soft prompts improving the zero-shot performance. We evaluate InstructZero on different combinations of open-source LLMs and APIs including Vicuna and ChatGPT. Our results show that InstructZero outperforms SOTA auto-instruction methods across a variety of downstream tasks. Our code and data are publicly available at https://github.com/Lichang-Chen/InstructZero.

Motivation and Objectives

Automate instruction search to improve zero-shot performance for black-box LLMs.
Reduce combinatorial instruction optimization to low-dimensional continuous optimization.
Leverage in-context learning of open-source LLMs to generate task-specific instructions.
Align latent soft-prompt kernels with instruction similarities to enhance optimization.

Proposed Method

Transform the discrete instruction search into continuous optimization by learning a soft prompt p for an open-source LLM that generates a task instruction v.
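Keep the search tractable by optimizing a low-dimensional vector p of dimension d and mapping it through a fixed random projection into the d'-dimensional soft-prompt embedding space (d much smaller than d').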
Formulate the objective as a black-box function H(p) measuring zero-shot performance after applying v to the black-box LLM f, and optimize H(p) via Bayesian optimization.
Introduce an instruction-coupled kernel that aligns the latent-space prompt similarities with instruction similarities, ensuring BO explores instruction-relevant regions.
Use Gaussian Process priors and Expected Improvement as the BO framework to update posteriors and select next prompts.
Iterate until convergence to produce the best instruction v* for the target task.
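
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical stand-in for H(p) (in the real method, the projected soft prompt is fed to the open-source LLM, which writes an instruction that is then scored on the black-box LLM), a plain RBF kernel is swapped in for the instruction-coupled kernel, and candidates are drawn by random sampling rather than by a dedicated acquisition optimizer. All function names and default sizes are illustrative.

```python
import math
import numpy as np


def rbf_kernel(X, Y, length_scale=1.0):
    """Squared-exponential kernel between the rows of X and the rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def gp_posterior(X_train, y_train, X_cand, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at X_cand."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_cand)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    # Prior variance of the RBF kernel is 1, minus the explained part.
    var = 1.0 - np.einsum("ij,ij->j", K_s, np.linalg.solve(K, K_s))
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Closed-form Expected Improvement for a maximization problem."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def toy_score(soft_prompt):
    """Hypothetical stand-in for H(p): returns a score in (0, 1].

    In InstructZero this step would generate an instruction with the
    open-source LLM and evaluate it zero-shot on the black-box LLM.
    """
    return float(np.exp(-np.mean((soft_prompt - 0.5) ** 2)))


def optimize(d=4, d_prime=64, n_init=5, n_iter=20, n_cand=256, seed=0):
    rng = np.random.default_rng(seed)
    # Fixed random projection from the d-dim search space to d' dims.
    A = rng.normal(size=(d, d_prime)) / math.sqrt(d)
    P = rng.uniform(-1.0, 1.0, size=(n_init, d))
    y = np.array([toy_score(p @ A) for p in P])
    for _ in range(n_iter):
        cand = rng.uniform(-1.0, 1.0, size=(n_cand, d))
        y_centered = y - y.mean()  # the GP assumes a zero-mean prior
        mean, std = gp_posterior(P, y_centered, cand)
        ei = expected_improvement(mean, std, y_centered.max())
        p_next = cand[np.argmax(ei)]
        P = np.vstack([P, p_next])
        y = np.append(y, toy_score(p_next @ A))
    best = int(np.argmax(y))
    return P[best], y[best], y


if __name__ == "__main__":
    best_p, best_score, scores = optimize()
    print("best low-dimensional prompt:", best_p)
    print("best score:", best_score)
```

Only the low-dimensional vector is ever modeled by the Gaussian process; the projection matrix stays fixed, so each new evaluation costs one call to the scoring function and no gradients are needed.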

Experimental Results

Research Questions

RQ1: How can instruction optimization be effectively performed for black-box LLMs without gradient access?
RQ2: Can a soft prompt in a latent space, coupled with an open-source LLM, generate high-quality instructions for black-box models?
RQ3: Does an instruction-coupled kernel improve Bayesian optimization efficiency by aligning latent and instruction spaces?
RQ4: Is InstructZero able to outperform state-of-the-art auto-instruction methods across multiple tasks?
RQ5: What is the impact of using smaller open-source models to optimize instructions for larger API LLMs?

Key Findings

InstructZero significantly outperforms baselines (APE and Uniform) on a broad set of tasks.
ChatGPT’s zero-shot performance improves when guided by InstructZero-generated instructions, achieving SOTA on 32/32 tasks from BIG-Bench in the reported setting.
The method can match or exceed results obtained with larger models by optimizing instructions via a smaller open-source LLM.
Ablation shows that optimizing the soft prompt yields substantial gains over manual prompts or using exemplars alone.
Visualization indicates progressive improvement of instructions and effective exploration-exploitation in the latent space across iterations.

This digest was generated by AI and reviewed by human editors.