QUICK REVIEW

[Paper Review] InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models

Lichang Chen, Jiuhai Chen|arXiv (Cornell University)|Jun 5, 2023
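Topic Modeling | Cited by 11
One-Sentence Summary

InstructZero optimizes a low-dimensional soft prompt for an open-source LLM so that it generates instructions for a black-box LLM, and uses Bayesian optimization to improve zero-shot task performance without backpropagating through the API model.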

ABSTRACT

Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM. On each iteration of the proposed method, which we call InstructZero, a soft prompt is converted into an instruction using the open-source LLM, which is then submitted to the black-box LLM for zero-shot evaluation, and the performance is sent to Bayesian optimization to produce new soft prompts improving the zero-shot performance. We evaluate InstructZero on different combinations of open-source LLMs and APIs including Vicuna and ChatGPT. Our results show that InstructZero outperforms SOTA auto-instruction methods across a variety of downstream tasks. Our code and data are publicly available at https://github.com/Lichang-Chen/InstructZero.

Motivation and Objectives

  • Automate instruction search to improve zero-shot performance for black-box LLMs.
  • Reduce combinatorial instruction optimization to low-dimensional continuous optimization.
  • Leverage in-context learning of open-source LLMs to generate task-specific instructions.
  • Align latent soft-prompt kernels with instruction similarities to enhance optimization.

Proposed Method

  • Transform the discrete instruction search into continuous optimization by learning a soft prompt p for an open-source LLM that generates a task instruction v.
  • Apply a random projection to reduce the soft-prompt dimension from d' to d for tractable optimization.
  • Formulate the objective as a black-box function H(p) measuring zero-shot performance after applying v to the black-box LLM f, and optimize H(p) via Bayesian optimization.
  • Introduce an instruction-coupled kernel that aligns the latent-space prompt similarities with instruction similarities, ensuring BO explores instruction-relevant regions.
  • Use Gaussian Process priors and Expected Improvement as the BO framework to update posteriors and select next prompts.
  • Iterate until convergence to produce the best instruction v* for the target task.
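
The loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_black_box_score` is a hypothetical stand-in for H(p) (in the real method the projected soft prompt would be fed to the open-source LLM, and the generated instruction v would be scored on the black-box LLM), a plain RBF kernel is swapped in for the instruction-coupled kernel, and all function names, dimensions, and hyperparameters are invented for the example. The sketch searches over a d-dimensional vector and maps it to d' dimensions through a fixed random matrix.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and std of a unit-variance GP with a constant mean."""
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_tq.T @ alpha + y_mean
    v = np.linalg.solve(chol, k_tq)
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def expected_improvement(mean, std, best):
    """Closed-form Expected Improvement for maximization."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def toy_black_box_score(p, projection, target):
    """Hypothetical stand-in for H(p); no LLM is called here."""
    full_prompt = projection @ p  # soft prompt lifted from d to d' dimensions
    return float(np.exp(-np.mean((full_prompt - target) ** 2)))

def optimize_soft_prompt(d_low=5, d_high=64, n_init=5, n_iter=20,
                         n_candidates=256, seed=0):
    rng = np.random.default_rng(seed)
    # Fixed random projection from the d-dim search space to d' dims.
    projection = rng.normal(size=(d_high, d_low)) / math.sqrt(d_low)
    target = projection @ rng.uniform(-1, 1, size=d_low)  # toy optimum

    prompts = rng.uniform(-1, 1, size=(n_init, d_low))
    scores = np.array([toy_black_box_score(p, projection, target)
                       for p in prompts])
    for _ in range(n_iter):
        # Score random candidates with EI under the current GP posterior.
        candidates = rng.uniform(-1, 1, size=(n_candidates, d_low))
        mean, std = gp_posterior(prompts, scores, candidates)
        ei = expected_improvement(mean, std, scores.max())
        next_p = candidates[np.argmax(ei)]
        next_score = toy_black_box_score(next_p, projection, target)
        prompts = np.vstack([prompts, next_p])
        scores = np.append(scores, next_score)

    best = int(np.argmax(scores))
    return prompts[best], scores[best], scores
```

Calling `optimize_soft_prompt()` returns the best low-dimensional prompt found, its toy score, and the full score history. Replacing `toy_black_box_score` with a function that actually generates and evaluates an instruction is where the cost of the real method would lie, since every call then involves queries to both models.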

Experimental Results

Research Questions

  • RQ1: How can instruction optimization be effectively performed for black-box LLMs without gradient access?
  • RQ2: Can a soft prompt in a latent space, coupled with an open-source LLM, generate high-quality instructions for black-box models?
  • RQ3: Does an instruction-coupled kernel improve Bayesian optimization efficiency by aligning latent and instruction spaces?
  • RQ4: Is InstructZero able to outperform state-of-the-art auto-instruction methods across multiple tasks?
  • RQ5: What is the impact of using smaller open-source models to optimize instructions for larger API LLMs?

Key Findings

  • InstructZero significantly outperforms baselines (APE and Uniform) on a broad set of tasks.
  • ChatGPT’s zero-shot performance improves when guided by InstructZero-generated instructions, achieving SOTA on 32/32 tasks from BIG-Bench in the reported setting.
  • The method can match or exceed results obtained with larger models by optimizing instructions via a smaller open-source LLM.
  • Ablation shows that optimizing the soft prompt yields substantial gains over manual prompts or using exemplars alone.
  • Visualization indicates progressive improvement of instructions and effective exploration-exploitation in the latent space across iterations.


This review was generated by AI and reviewed by human editors.