QUICK REVIEW

[Paper Review] EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Qingyan Guo, Rui Wang|arXiv (Cornell University)|Sep 15, 2023

Topic Modeling | Citations: 30

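One-Line Summary

EvoPrompt uses evolutionary algorithms, with LLMs imitating the EA operators, to automatically generate discrete prompts and improve prompt quality without access to gradients or model parameters.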

ABSTRACT

Large Language Models (LLMs) excel in various tasks, but they rely on carefully crafted prompts that often demand substantial human effort. To automate this process, in this paper, we propose a novel framework for discrete prompt optimization, called EvoPrompt, which borrows the idea of evolutionary algorithms (EAs) as they exhibit good performance and fast convergence. To enable EAs to work on discrete prompts, which are natural language expressions that need to be coherent and human-readable, we connect LLMs with EAs. This approach allows us to simultaneously leverage the powerful language processing capabilities of LLMs and the efficient optimization performance of EAs. Specifically, abstaining from any gradients or parameters, EvoPrompt starts from a population of prompts and iteratively generates new prompts with LLMs based on the evolutionary operators, improving the population based on the development set. We optimize prompts for both closed- and open-source LLMs including GPT-3.5 and Alpaca, on 31 datasets covering language understanding, generation tasks, as well as BIG-Bench Hard (BBH) tasks. EvoPrompt significantly outperforms human-engineered prompts and existing methods for automatic prompt generation (e.g., up to 25% on BBH). Furthermore, EvoPrompt demonstrates that connecting LLMs with EAs creates synergies, which could inspire further research on the combination of LLMs and conventional algorithms.

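Motivation and Objectives

Motivate the need for automatic, discrete prompt design that reduces human effort and improves effectiveness across models.
Propose a framework that connects evolutionary algorithms with LLMs to generate coherent, human-readable prompts.
Demonstrate the approach's effectiveness on diverse NLP tasks with both open-source and closed-source LLMs.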

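Proposed Method

Formulate discrete prompt optimization as an evolutionary process over a population of prompts.
Use LLMs to implement the EA operators (crossover and mutation) that generate new prompts.
Evaluate candidates on the development set and keep the top-scoring prompts.
Instantiate EvoPrompt with two EAs: GA (genetic algorithm) and DE (differential evolution).
Give the LLM instructions for carrying out each EA step while keeping the prompts coherent.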
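
The steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `evolve_prompts`, `toy_llm_variation`, and `toy_dev_score` are hypothetical names, the LLM call is replaced by a deterministic word-level crossover, the development-set score is a simple keyword count, and parents are sampled uniformly, a selection rule this review does not specify.

```python
import random
from typing import Callable, List, Tuple

ScoredPrompt = Tuple[float, str]
KEYWORDS = ("classify", "sentiment", "positive", "negative")


def evolve_prompts(
    population: List[str],
    dev_score: Callable[[str], float],
    llm_variation: Callable[[str, str], str],
    iterations: int = 5,
    seed: int = 0,
) -> List[ScoredPrompt]:
    """Evolve a prompt population; returns (score, prompt) pairs, best first."""
    rng = random.Random(seed)
    size = len(population)
    scored = [(dev_score(p), p) for p in population]
    for _step in range(iterations):
        children: List[ScoredPrompt] = []
        for _slot in range(size):
            # Assumption: parents are drawn uniformly at random.
            pair = rng.sample(scored, 2)
            child = llm_variation(pair[0][1], pair[1][1])
            children.append((dev_score(child), child))
        # Keep only the top-scoring prompts among parents and children.
        merged = sorted(scored + children, key=lambda sp: sp[0], reverse=True)
        scored = merged[:size]
    return sorted(scored, key=lambda sp: sp[0], reverse=True)


def toy_llm_variation(parent_a: str, parent_b: str) -> str:
    """Hypothetical stand-in for the LLM: first half of A joined to second half of B."""
    words_a, words_b = parent_a.split(), parent_b.split()
    return " ".join(words_a[: len(words_a) // 2] + words_b[len(words_b) // 2:])


def toy_dev_score(prompt: str) -> float:
    """Hypothetical stand-in for development-set accuracy: fraction of keywords present."""
    text = prompt.lower()
    return sum(k in text for k in KEYWORDS) / len(KEYWORDS)


initial_prompts = [
    "Classify the review as good or bad",
    "Decide whether the sentiment is positive or negative",
    "Label the text",
]
final_population = evolve_prompts(
    initial_prompts, toy_dev_score, toy_llm_variation, iterations=3
)
best_score, best_prompt = final_population[0]
print(f"{best_score:.2f}  {best_prompt}")
```

Because survivors are chosen from parents and children together, the best score in the population can never decrease between iterations. In a real setting, `llm_variation` would wrap a call to the LLM with instructions describing the crossover and mutation steps, and `dev_score` would run the prompt on held-out development examples.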

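Experimental Results

Research Questions

RQ1: Can LLMs implement evolutionary operators to generate discrete prompts without access to gradients or parameters?
RQ2: Do EvoPrompt-generated prompts outperform hand-crafted prompts and earlier automatic methods across tasks and models?
RQ3: Which EA variant (GA vs. DE) yields better results under different prompt baselines and datasets?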

Key Findings

Method	SST-2	CR	MR	SST-5	AG’s News	TREC	Subj	Avg.
MI (Zhang et al., 2023b)	93.68	91.40	88.75	42.90	70.63	50.60	49.75	71.07
NI (Mishra et al., 2022c)	92.86	90.90	89.60	48.64	48.89	55.00	52.55	68.21
PromptSource (Bach et al., 2022)	93.03	-	-	-	45.43	36.20	-	-
APE (Zhou et al., 2022)	94.01	90.50	90.90	46.97	71.18	59.60	63.25	73.77
EvoPrompt (GA)	94.84	91.20	90.40	49.37	73.42	63.80	67.90	75.85
EvoPrompt (DE)	94.84	91.35	90.15	48.19	73.33	64.40	77.60	77.12

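EvoPrompt with GA and DE outperforms manual prompts and APE on language understanding tasks with Alpaca-7b and GPT-3.5.
In sentiment classification, EvoPrompt (GA) is slightly ahead of EvoPrompt (DE) on some datasets (e.g., 49.37 vs. 48.19 on SST-5); in subjectivity classification, EvoPrompt (DE) clearly outperforms EvoPrompt (GA) (77.60 vs. 67.90 on Subj).
In language generation, both EvoPrompt (GA) and EvoPrompt (DE) improve ROUGE on summarization and SARI on simplification, with DE often yielding the larger gains.
DE generally helps escape local optima and is especially advantageous when the initial prompts are weak or diverse.
The approach requires no access to LLM parameters or gradients and produces human-readable prompts.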


This review was generated by AI and checked by a human editor.