QUICK REVIEW

[Paper Summary] The Prompt Report: A Systematic Survey of Prompt Engineering Techniques

Sander Schulhoff, Michael Ilie|arXiv (Cornell University)|Jun 6, 2024
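Artificial Intelligence in Healthcare and Education | Cited by 72
One-Sentence Summary

This paper presents a systematic survey of prompting techniques. Using a PRISMA-based literature review, it builds a taxonomy of 58 text-based prompting techniques and 40 multimodal prompting techniques, along with a vocabulary of 33 terms.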

ABSTRACT

Generative Artificial Intelligence (GenAI) systems are increasingly being deployed across diverse industries and research domains. Developers and end-users interact with these systems through the use of prompting and prompt engineering. Although prompt engineering is a widely adopted and extensively researched area, it suffers from conflicting terminology and a fragmented ontological understanding of what constitutes an effective prompt due to its relatively recent emergence. We establish a structured understanding of prompt engineering by assembling a taxonomy of prompting techniques and analyzing their applications. We present a detailed vocabulary of 33 vocabulary terms, a taxonomy of 58 LLM prompting techniques, and 40 techniques for other modalities. Additionally, we provide best practices and guidelines for prompt engineering, including advice for prompting state-of-the-art (SOTA) LLMs such as ChatGPT. We further present a meta-analysis of the entire literature on natural language prefix-prompting. As a culmination of these efforts, this paper presents the most comprehensive survey on prompt engineering to date.

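Research Motivation and Goals

  • Establish a structured, widely usable vocabulary for prompting research and practice.
  • Create a comprehensive taxonomy of prompting techniques covering text and other modalities.
  • Summarize how prompting methods have evolved and how they are used in the literature.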

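Proposed Method

  • Conduct a PRISMA-based systematic review, collecting prompting literature from arXiv, Semantic Scholar, and ACL.
  • Compile a taxonomy of prompting techniques (58 text-based, 40 multimodal) and a vocabulary of 33 terms.
  • Synthesize findings across benchmarking, multilingual, and multimodal settings, as well as extensions of prompting such as agents and evaluation frameworks.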
Figure 1.1: Categories within the field of prompting are interconnected. We discuss 7 core categories that are well described by the papers within our scope.

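Experimental Results

Research Questions

  • RQ1: What are the core components and terms that define a prompt in GenAI research?
  • RQ2: Which prompting techniques exist, and how can they be organized into a robust taxonomy?
  • RQ3: How are prompting techniques applied across languages and modalities, and which extensions (agents, evaluation) exist?
  • RQ4: What are the common evaluation practices and the safety and risk considerations in prompting?
  • RQ5: How do prompting techniques perform on benchmarks and in real-world case studies?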

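Key Findings

  • The paper presents a taxonomy of 58 text-based prompting techniques, organized into six major categories.
  • A further set of 40 prompting techniques covers non-text modalities (multimodal prompting).
  • It provides a vocabulary of 33 prompting terms to standardize terminology.
  • The study follows a PRISMA-based process and includes a meta-analysis of natural language prefix-prompting.
  • The study discusses agents, evaluation, security, alignment, and benchmarking in prompting research.
  • Two case studies demonstrate prompting techniques on benchmarks such as MMLU and on a task related to a real-world crisis.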
Figure 1.4: The Prompt Engineering Process consists of three repeated steps 1) performing inference on a dataset 2) evaluating performance and 3) modifying the prompt template. Note that the extractor is used to extract a final response from the LLM output (e.g. "This phrase is positive" $\rightarro
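The three-step loop in Figure 1.4 can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: `toy_model` is a hypothetical keyword-based stand-in for an LLM call, and the templates, dataset, and function names are all invented for the example. "Modifying the prompt template" is reduced here to trying each candidate template in turn.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, gold label)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: looks for sentiment keywords.

    It reads only what follows the last "Text:" marker, or the whole
    prompt when the marker is absent, so template wording affects it.
    """
    text = prompt.rsplit("Text:", 1)[-1].lower()
    label = "positive" if ("great" in text or "love" in text) else "negative"
    return f"This phrase is {label}"


def extract_label(output: str) -> str:
    """Extractor: pull the final answer out of free-form model output."""
    lowered = output.lower()
    for label in ("positive", "negative"):
        if label in lowered:
            return label
    return "unknown"


def evaluate(template: str, dataset: List[Example],
             model: Callable[[str], str]) -> float:
    """Steps 1 and 2: run inference on the dataset, then score accuracy."""
    correct = 0
    for text, gold in dataset:
        output = model(template.format(text=text))
        if extract_label(output) == gold:
            correct += 1
    return correct / len(dataset)


def select_best_template(templates: List[str], dataset: List[Example],
                         model: Callable[[str], str]) -> Tuple[float, str]:
    """Step 3, simplified: try each candidate template and keep the best."""
    scored = [(evaluate(t, dataset, model), t) for t in templates]
    return max(scored, key=lambda pair: pair[0])


# Invented toy data and templates, for illustration only.
DATASET: List[Example] = [
    ("I love this movie", "positive"),
    ("This was a terrible day", "negative"),
    ("What a great meal", "positive"),
    ("I hate waiting", "negative"),
]
TEMPLATES = [
    "Is the following great or bad? {text}",
    "Classify the sentiment.\nText: {text}",
]

if __name__ == "__main__":
    score, template = select_best_template(TEMPLATES, DATASET, toy_model)
    print(f"best accuracy: {score:.2f}")
    print(f"best template: {template!r}")
```

The first template leaks the word "great" into what the toy model reads, so it labels every input positive. The second isolates the input behind a `Text:` marker. The evaluation step is what makes that difference visible, which is the reason the loop measures performance before each template change.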

This summary was generated by AI and reviewed by human editors.