QUICK REVIEW

[Paper Review] Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Minki Kang, Seanie Lee | arXiv (Cornell University) | May 28, 2023

Topic Modeling | Cited by 21

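One-Sentence Summary

KARD distills reasoning from LLMs into small LMs augmented with external knowledge, uses a neural reranker to retrieve passages relevant to rationale generation, and achieves strong performance on knowledge-intensive question-answering benchmarks.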

ABSTRACT

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to their high computational requirements and concerns on data privacy. Previous studies have focused on building task-specific small Language Models (LMs) by fine-tuning them with labeled data or distilling LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks due to the limited capacity of small LMs in memorizing the knowledge required. Motivated by our theoretical analysis on memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales obtained from LLMs with augmented knowledge retrieved from an external knowledge base. Moreover, we further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets, namely MedQA-USMLE, StrategyQA, and OpenbookQA. Notably, our method makes the 250M T5 models achieve superior performance against the fine-tuned 3B models, having 12 times larger parameters, on both MedQA-USMLE and StrategyQA benchmarks.

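Motivation and Objectives

Make the case for using small LMs in knowledge-intensive tasks, given privacy and compute constraints.
Propose a framework that distills LLM reasoning into small LMs while augmenting them with passages from an external knowledge base (KB).
Introduce a neural reranker at inference time to retrieve passages relevant to rationale generation.
Show that KARD improves performance over baselines on MedQA-USMLE, StrategyQA, and OpenBookQA.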

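Proposed Method

Use an LLM with chain-of-thought (step-by-step) prompting to generate rationales for the training data.
Fine-tune a small LM to generate the rationale and the answer, conditioned on the question.
Augment small-LM training with passages retrieved from the KB (LKB), using the rationale as the retrieval query.
Introduce a neural reranker that reorders retrieved passages so that they are more relevant to rationale generation, since no rationale is available to serve as a query at inference time.
Train the reranker to mimic the ranking the retriever produces when queried with the rationale, using a KL-divergence objective.
At inference time, retrieve passages, rerank them, generate a rationale, and produce the final answer.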
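
The reranker objective and the inference steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the softmax normalization with a temperature, the direction of the KL term, and the `retrieve`, `rerank_score`, and `generate` callables are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Turn raw relevance scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target: Sequence[float], predicted: Sequence[float]) -> float:
    """KL(target || predicted) for two distributions over the same passages."""
    return sum(t * math.log(t / p) for t, p in zip(target, predicted) if t > 0.0)


def reranker_loss(
    retriever_scores_for_rationale: Sequence[float],
    reranker_scores_for_question: Sequence[float],
) -> float:
    """Assumed training signal: the reranker, which only sees the question,
    is pushed toward the passage distribution that the retriever produces
    when it is queried with the LLM rationale."""
    target = softmax(retriever_scores_for_rationale)
    predicted = softmax(reranker_scores_for_question)
    return kl_divergence(target, predicted)


def answer_question(
    question: str,
    retrieve: Callable[[str], List[str]],
    rerank_score: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    top_k: int = 2,
) -> str:
    """Inference flow: retrieve, rerank, then generate rationale and answer.
    All three callables are hypothetical stand-ins for real components."""
    candidates = retrieve(question)
    ranked = sorted(candidates, key=lambda p: rerank_score(question, p), reverse=True)
    return generate(question, ranked[:top_k])
```

In this sketch the loss is zero when the reranker's question-only scores induce the same distribution as the retriever's rationale-based scores, and it grows as the two rankings diverge.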

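Experimental Results

Research Questions

RQ1: Can knowledge-augmented distillation effectively transfer LLM reasoning to small LMs for knowledge-intensive tasks?
RQ2: Does adding external knowledge and a reranker on top of standard reasoning distillation improve small-LM performance?
RQ3: How does KARD compare with the baselines (few-shot prompting, fine-tuning, standard reasoning distillation) on medical and general knowledge-intensive reasoning benchmarks?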

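Key Findings

KARD consistently outperforms the baselines across model sizes on MedQA-USMLE, StrategyQA, and OpenBookQA.
Knowledge augmentation reduces how much the small LM must memorize, so models with fewer parameters can still perform better.
The neural reranker makes the retrieved passages more relevant to rationale generation, yielding better downstream answers than BM25 retrieval alone.
KARD delivers significant gains even for small models: the 250M-parameter T5 surpasses fine-tuned 3B models on MedQA-USMLE and StrategyQA.
Domain-adaptive pretraining (DAPT) yields only limited gains compared with KARD, underscoring the distinct value of knowledge augmentation in reasoning distillation.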

This review was generated by AI and reviewed by human editors.