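[Paper Digest] Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango
The paper uses counterfactual prompting to dissect chain-of-thought prompting, shows that text and patterns work symbiotically, and introduces CCOT, which trims prompts without hurting performance.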
The past decade has witnessed dramatic gains in natural language processing and an unprecedented scaling of large language models. These developments have been accelerated by the advent of few-shot techniques such as chain of thought (CoT) prompting. Specifically, CoT pushes the performance of large language models in a few-shot setup by augmenting the prompts with intermediate steps. Despite impressive results across various tasks, the reasons behind their success have not been explored. This work uses counterfactual prompting to develop a deeper understanding of CoT-based few-shot prompting mechanisms in large language models. We first systematically identify and define the key components of a prompt: symbols, patterns, and text. Then, we devise and conduct an exhaustive set of experiments across four different tasks, by querying the model with counterfactual prompts where only one of these components is altered. Our experiments across three models (PaLM, GPT-3, and CODEX) reveal several surprising findings and bring into question the conventional wisdom around few-shot prompting. First, the presence of factual patterns in a prompt is practically immaterial to the success of CoT. Second, our results conclude that the primary role of intermediate steps may not be to facilitate learning how to solve a task. The intermediate steps are rather a beacon for the model to realize what symbols to replicate in the output to form a factual answer. Further, text imbues patterns with commonsense knowledge and meaning. Our empirical and qualitative analysis reveals that a symbiotic relationship between text and patterns explains the success of few-shot prompting: text helps extract commonsense from the question to help patterns, and patterns enforce task understanding and direct text generation.
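Motivation and Goals
- Identify the key semantic components of a few-shot prompt: symbols, patterns, and text.
- Empirically separate the role of each component using counterfactual prompts across multiple tasks and multiple LLMs.
- Show that text and patterns depend on each other in making chain of thought succeed.
- Propose a concise prompting scheme, CCOT, that cuts token count by about 20% while maintaining or improving performance.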
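Proposed Method
- Define three semantic prompt components: symbols, patterns, and text.
- Design counterfactual prompts that alter only one component at a time, so the effect of that component can be measured.
- Evaluate the prompts on four reasoning tasks with PaLM-62B, GPT-3, and CODEX.
- Analyze attention patterns to understand how the model reasons under different prompts.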
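The counterfactual idea in the list above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the exemplar text, the function names `abstract_symbols` and `drop_patterns`, and the regular expressions are hypothetical and are not the prompts or tooling used in the paper. The sketch only shows what "alter one component, keep the rest" might look like for a math word problem, where numbers stand in for symbols and equations stand in for patterns.

```python
import re

# Hypothetical chain-of-thought exemplar (not taken from the paper).
EXEMPLAR = (
    "Q: Tom has 5 apples and buys 3 more. How many apples does he have?\n"
    "A: Tom starts with 5 apples. 5 + 3 = 8. The answer is 8."
)


def abstract_symbols(prompt: str) -> str:
    """Counterfactual on symbols: swap each distinct number for a placeholder.

    The same number always maps to the same placeholder, so the pattern
    (the shape of the equation) and the surrounding text are left untouched.
    """
    placeholders = iter(["α", "β", "γ", "δ", "ε"])
    mapping: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        number = match.group(0)
        if number not in mapping:
            mapping[number] = next(placeholders)
        return mapping[number]

    return re.sub(r"\d+", substitute, prompt)


def drop_patterns(prompt: str) -> str:
    """Counterfactual on patterns: delete simple 'a op b = c' equations.

    Symbols that appear in the text and the final answer are kept.
    """
    return re.sub(r"\s*\d+\s*[+\-*/]\s*\d+\s*=\s*\d+\.?", "", prompt)


if __name__ == "__main__":
    print(abstract_symbols(EXEMPLAR))
    print(drop_patterns(EXEMPLAR))
```

Each function changes exactly one component, which is the property that lets a performance difference be attributed to that component. A real study would also need counterfactuals for text and a model call to score each variant; both are left out of this sketch.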
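Experimental Results
Research Questions
- RQ1: What roles do symbols, patterns, and text play in the success of chain-of-thought prompting?
- RQ2: How does a counterfactual change to each component affect performance across tasks and models?
- RQ3: Can a concise prompting method (CCOT) shorten prompts while maintaining performance?
- RQ4: How does the interaction between text and patterns explain the effectiveness of few-shot reasoning in large language models?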
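Main Findings
- The exact type of symbol has almost no effect on performance; abstract placeholders can work as well.
- Patterns are necessary but not sufficient: without patterns, CoT performs poorly and can even degrade to the level of direct prompting.
- Text is essential for supplying the commonsense knowledge and semantics that let patterns guide generation.
- Text and patterns form a symbiotic relationship that largely explains the success of CoT, with each contributing complementary information.
- CCOT reduces token usage by about 20% without hurting task solve rates, and sometimes improves them.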
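To make the "about 20% fewer tokens" figure concrete, the following is a minimal sketch of how a relative prompt-length reduction could be computed. It rests on stated assumptions: whitespace splitting is used as a rough stand-in for a real model tokenizer, and the two example prompts and the helper names `token_count` and `relative_reduction` are hypothetical, not the CCOT prompts or measurement code from the paper.

```python
def token_count(text: str) -> int:
    """Approximate token count by whitespace splitting (an assumption;
    a real measurement would use the target model's tokenizer)."""
    return len(text.split())


def relative_reduction(original: str, concise: str) -> float:
    """Fraction of tokens saved by the concise prompt relative to the original."""
    original_tokens = token_count(original)
    if original_tokens == 0:
        raise ValueError("original prompt must contain at least one token")
    return (original_tokens - token_count(concise)) / original_tokens


if __name__ == "__main__":
    # Hypothetical verbose and concise intermediate steps for the same question.
    verbose = "Tom starts with 5 apples. He buys 3 more apples. 5 + 3 = 8. The answer is 8."
    concise = "Tom has 5 apples, buys 3. 5 + 3 = 8. The answer is 8."
    print(f"reduction: {relative_reduction(verbose, concise):.1%}")
```

A length saving only matters if accuracy holds, so a measurement like this would be reported alongside the task solve rate for both prompt variants.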
This digest was generated by AI and reviewed by human editors.