QUICK REVIEW

[Paper Review] EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Qingyan Guo, Rui Wang | arXiv (Cornell University) | 2023-09-15

Topic Modeling | Citations: 30

One-Line Summary

EvoPrompt has an LLM imitate EA operators so that discrete prompts can be generated automatically and their quality improved, without access to gradients or model parameters.

ABSTRACT

Large Language Models (LLMs) excel in various tasks, but they rely on carefully crafted prompts that often demand substantial human effort. To automate this process, in this paper, we propose a novel framework for discrete prompt optimization, called EvoPrompt, which borrows the idea of evolutionary algorithms (EAs) as they exhibit good performance and fast convergence. To enable EAs to work on discrete prompts, which are natural language expressions that need to be coherent and human-readable, we connect LLMs with EAs. This approach allows us to simultaneously leverage the powerful language processing capabilities of LLMs and the efficient optimization performance of EAs. Specifically, abstaining from any gradients or parameters, EvoPrompt starts from a population of prompts and iteratively generates new prompts with LLMs based on the evolutionary operators, improving the population based on the development set. We optimize prompts for both closed- and open-source LLMs including GPT-3.5 and Alpaca, on 31 datasets covering language understanding, generation tasks, as well as BIG-Bench Hard (BBH) tasks. EvoPrompt significantly outperforms human-engineered prompts and existing methods for automatic prompt generation (e.g., up to 25% on BBH). Furthermore, EvoPrompt demonstrates that connecting LLMs with EAs creates synergies, which could inspire further research on the combination of LLMs and conventional algorithms.

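Motivation and Objectives

Argues for automatic, discrete prompt design that reduces human effort and improves effectiveness across models.
Proposes a framework that combines evolutionary algorithms with LLMs to generate coherent, readable prompts.
Demonstrates the effectiveness of the approach on open-source and closed-source LLMs across a range of NLP tasks.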

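Proposed Method

Frames discrete prompt optimization as an evolutionary process over a population of prompts.
Uses the LLM to carry out the EA operators (crossover and mutation) that generate new prompts.
Evaluates candidates on the development set and retains the best-performing prompts.
Instantiates EvoPrompt with two EAs: a genetic algorithm (GA) and differential evolution (DE).
Provides instructions that guide the LLM through each EA step while preserving prompt coherence.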
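
The GA-style loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_llm_evolve` is a hypothetical stand-in for the LLM call that performs crossover and mutation, `toy_dev_score` is a hypothetical stand-in for accuracy on the development set, and roulette-wheel parent selection is an assumption that the review does not state.

```python
import random
from typing import Callable, List, Tuple

Scored = Tuple[str, float]


def toy_llm_evolve(parent_a: str, parent_b: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM asked to cross over and mutate two prompts."""
    words_a, words_b = parent_a.split(), parent_b.split()
    # Crossover: head of parent A joined with a tail of parent B.
    cut_a = rng.randint(1, len(words_a))
    cut_b = rng.randint(0, len(words_b) - 1)
    child = words_a[:cut_a] + words_b[cut_b:]
    # Mutation: overwrite one word with a word from a tiny toy vocabulary.
    vocab = ["classify", "sentiment", "positive", "negative", "review", "carefully"]
    child[rng.randrange(len(child))] = rng.choice(vocab)
    return " ".join(child)


def toy_dev_score(prompt: str) -> float:
    """Hypothetical dev-set score: fraction of task keywords the prompt mentions."""
    keywords = {"classify", "sentiment", "positive", "negative"}
    return len(set(prompt.lower().split()) & keywords) / len(keywords)


def evoprompt_ga(
    initial_prompts: List[str],
    score_fn: Callable[[str], float],
    evolve_fn: Callable[[str, str, random.Random], str],
    iterations: int = 5,
    seed: int = 0,
) -> List[Scored]:
    rng = random.Random(seed)
    population: List[Scored] = [(p, score_fn(p)) for p in initial_prompts]
    size = len(population)
    for _ in range(iterations):
        # Roulette-wheel selection; the epsilon keeps zero-score prompts selectable.
        weights = [score + 1e-6 for _, score in population]
        children: List[Scored] = []
        for _ in range(size):
            parent_a, parent_b = rng.choices(population, weights=weights, k=2)
            child = evolve_fn(parent_a[0], parent_b[0], rng)
            children.append((child, score_fn(child)))
        # Merge parents and children, then keep the top `size` prompts.
        merged = sorted(population + children, key=lambda item: item[1], reverse=True)
        population = merged[:size]
    return population


if __name__ == "__main__":
    seeds = [
        "Classify the review",
        "Tell me the sentiment of this text",
        "Is it good or bad",
    ]
    for prompt, score in evoprompt_ga(seeds, toy_dev_score, toy_llm_evolve):
        print(f"{score:.2f}  {prompt}")
```

Because parents and children are merged before truncation, the best score in the population never decreases between iterations. In a real setting, both stand-in functions would be replaced by calls to an LLM and to a task-specific evaluator.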

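Experimental Results

Research Questions

RQ1: Can an LLM implement evolutionary operators to generate discrete prompts without access to gradients or parameters?
RQ2: Do EvoPrompt-generated prompts outperform human-designed prompts and previous automatic methods across tasks and models?
RQ3: Which EA variant (GA or DE) gives better results for different initial prompts or datasets?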

Key Results

Method	SST-2	CR	MR	SST-5	AG’s News	TREC	Subj	Avg.
MI (Zhang et al., 2023b)	93.68	91.40	88.75	42.90	70.63	50.60	49.75	71.07
NI (Mishra et al., 2022c)	92.86	90.90	89.60	48.64	48.89	55.00	52.55	68.21
PromptSource (Bach et al., 2022)	93.03	-	-	-	45.43	36.20	-	-
APE (Zhou et al., 2022)	94.01	90.50	90.90	46.97	71.18	59.60	63.25	73.77
EvoPrompt (GA)	94.84	91.20	90.40	49.37	73.42	63.80	67.90	75.85
EvoPrompt (DE)	94.84	91.35	90.15	48.19	73.33	64.40	77.60	77.12
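
As a quick arithmetic check on the table, the snippet below recomputes the Avg. column for the APE and EvoPrompt rows and the gap between them. It assumes that Avg. is the unweighted mean of the seven task scores, which the review does not state explicitly; the per-task numbers are copied from the rows above.

```python
from statistics import mean

# Per-task scores copied from the table (SST-2, CR, MR, SST-5, AG's News, TREC, Subj).
rows = {
    "APE": [94.01, 90.50, 90.90, 46.97, 71.18, 59.60, 63.25],
    "EvoPrompt (GA)": [94.84, 91.20, 90.40, 49.37, 73.42, 63.80, 67.90],
    "EvoPrompt (DE)": [94.84, 91.35, 90.15, 48.19, 73.33, 64.40, 77.60],
}

# Assumption: Avg. is the unweighted mean over the seven tasks.
averages = {name: round(mean(scores), 2) for name, scores in rows.items()}

# Gap between each EvoPrompt variant and APE, in points of average score.
gain_over_ape = {
    name: round(avg - averages["APE"], 2)
    for name, avg in averages.items()
    if name != "APE"
}

if __name__ == "__main__":
    print(averages)       # {'APE': 73.77, 'EvoPrompt (GA)': 75.85, 'EvoPrompt (DE)': 77.12}
    print(gain_over_ape)  # {'EvoPrompt (GA)': 2.08, 'EvoPrompt (DE)': 3.35}
```

Under that assumption the recomputed means match the reported Avg. values for these three rows.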

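EvoPrompt with GA and DE outperforms manual prompts and APE on average on language understanding tasks with Alpaca-7b and GPT-3.5 (Avg. 75.85 and 77.12 versus 73.77 for APE in the table above), although not on every dataset: MI leads on CR and APE leads on MR.
On sentiment classification, EvoPrompt (GA) slightly outperforms EvoPrompt (DE); on subjectivity classification (Subj), EvoPrompt (DE) significantly outperforms EvoPrompt (GA), at 77.60 versus 67.90.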
In language generation, EvoPrompt (GA) and EvoPrompt (DE) improve ROUGE scores on summarization and SARI on simplification, with DE often leading to higher gains.
DE generally helps escape local optima compared to GA, especially when initial prompts are weak or diverse.
The approach does not require access to LLM parameters or gradients and yields human-readable prompts.
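
To make the GA versus DE contrast in the findings above more concrete, here is a rough sketch of a DE-style update on prompts. It is a simplified reading of differential evolution, not the paper's exact procedure: `toy_llm_de_step` is a hypothetical stand-in for the LLM call, `toy_dev_score` is a hypothetical stand-in for development-set accuracy, and the word-level "difference" between two donor prompts is an assumption made for illustration.

```python
import random
from typing import Callable, List, Tuple


def toy_dev_score(prompt: str) -> float:
    """Hypothetical dev-set score: fraction of task keywords the prompt mentions."""
    keywords = {"classify", "sentiment", "positive", "negative"}
    return len(set(prompt.lower().split()) & keywords) / len(keywords)


def toy_llm_de_step(
    base: str, donor_b: str, donor_c: str, best: str, rng: random.Random
) -> str:
    """Hypothetical stand-in for an LLM performing one DE-style step on prompts."""
    # 1. "Difference": words that appear in donor_b but not in donor_c.
    words_c = set(donor_c.split())
    diff = [w for w in donor_b.split() if w not in words_c]
    # 2. "Mutation": perturb the differing part (here, just shuffle it).
    rng.shuffle(diff)
    # 3. Combine the mutated difference with the current best prompt.
    mutant = best.split() + diff
    # 4. Crossover: each position comes from the base prompt or from the mutant.
    base_words = base.split()
    trial = [
        base_words[i] if i < len(base_words) and rng.random() < 0.5 else word
        for i, word in enumerate(mutant)
    ]
    return " ".join(trial)


def evoprompt_de(
    initial_prompts: List[str],
    score_fn: Callable[[str], float],
    step_fn: Callable[[str, str, str, str, random.Random], str],
    iterations: int = 5,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    if len(initial_prompts) < 3:
        raise ValueError("this sketch needs at least three prompts")
    rng = random.Random(seed)
    population = list(initial_prompts)
    scores = [score_fn(p) for p in population]
    for _ in range(iterations):
        for i in range(len(population)):
            best = population[max(range(len(population)), key=scores.__getitem__)]
            # Two distinct donors, both different from the prompt being updated.
            b, c = rng.sample([j for j in range(len(population)) if j != i], 2)
            trial = step_fn(population[i], population[b], population[c], best, rng)
            trial_score = score_fn(trial)
            # Greedy replacement: a slot changes only if the trial scores higher.
            if trial_score > scores[i]:
                population[i], scores[i] = trial, trial_score
    return list(zip(population, scores))


if __name__ == "__main__":
    seeds = [
        "Classify the review",
        "Tell me the sentiment of this text",
        "Is it positive or negative",
        "Read carefully and answer",
    ]
    for prompt, score in evoprompt_de(seeds, toy_dev_score, toy_llm_de_step):
        print(f"{score:.2f}  {prompt}")
```

Unlike the GA-style merge-and-truncate step, this update works slot by slot: every prompt is only ever replaced by a better trial derived from itself, so weak starting prompts keep their place in the population until they improve.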

This review was generated by AI and reviewed by a human editor.