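[Paper Summary] A Survey of Large Language Models
This survey reviews recent advances in large language models (LLMs), covering their background, scaling laws, emergent abilities, and the techniques for pre-training, adaptation, utilization, alignment, and evaluation. It also summarizes available resources and future directions.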
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.
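Motivation and Goals
- Summarize the evolution of language models from statistical approaches to Transformer-based models and why it matters, and scope the term LLM to models with hundreds of billions of parameters trained on massive text data.
- Synthesize four key aspects of LLMs: pre-training, adaptation tuning, utilization, and capacity evaluation.
- Highlight scaling laws, emergent abilities, and the practical techniques that strengthen LLM capabilities.
- Provide an up-to-date resource guide (such as the GitHub project) and discuss open challenges and future directions.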
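Approach
- Discuss the background and evolution of LLMs and the GPT-series models.
- Present and explain the scaling laws (the KM scaling law and the Chinchilla scaling law) and what they imply for model size, data size, and compute.
- Describe the emergent abilities of LLMs (in-context learning, instruction following, step-by-step reasoning) and how they relate to scaling.
- Outline key techniques: scaling, distributed training, ability eliciting, alignment tuning, and tool/plugin integration.
- Summarize practical resources and open challenges to guide future research and development.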
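To make the Chinchilla scaling law above more concrete, the following is a minimal sketch, not the survey's own code. It assumes the parametric loss form L(N, D) = E + A/N^α + B/D^β with the fitted constants reported in the Chinchilla paper (Hoffmann et al., 2022), plus the common approximation C ≈ 6·N·D for training FLOPs. The function names `chinchilla_loss` and `compute_optimal` are hypothetical, chosen only for illustration.

```python
from typing import Tuple

# Fitted constants as reported by Hoffmann et al. (2022). They are used here
# purely for illustration and should not be assumed to transfer to other
# data, tokenizers, or architectures.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Parametric loss L(N, D) = E + A / N^alpha + B / D^beta."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def compute_optimal(flops: float) -> Tuple[float, float]:
    """Split a FLOP budget into (parameters, tokens) that minimise the loss.

    Assumes C ~= 6 * N * D. Setting the derivative of the loss to zero under
    that constraint gives the closed form below.
    """
    a = BETA / (ALPHA + BETA)   # exponent for the optimal model size
    b = ALPHA / (ALPHA + BETA)  # exponent for the optimal data size
    g = (ALPHA * A / (BETA * B)) ** (1.0 / (ALPHA + BETA))
    n_opt = g * (flops / 6.0) ** a
    d_opt = (flops / 6.0) ** b / g
    return n_opt, d_opt


if __name__ == "__main__":
    budget = 5.76e23  # an example FLOP budget, chosen arbitrarily
    n, d = compute_optimal(budget)
    print(f"params ~ {n:.3e}, tokens ~ {d:.3e}, loss ~ {chinchilla_loss(n, d):.3f}")
```

Under these assumptions, moving parameters or tokens away from the returned split while keeping the same budget can only raise the predicted loss, which is the sense in which the allocation is "compute-optimal".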
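Results
Research Questions
- RQ1: Compared with earlier pre-trained language models, what are the defining characteristics and capabilities of large language models?
- RQ2: How do scaling laws relate model size, data, and compute to performance, and what are their practical implications for training LLMs?
- RQ3: Which techniques elicit emergent abilities, and how do alignment and tools improve the usefulness and safety of LLMs?
- RQ4: Which resources (data, tools, platforms) support the development, evaluation, and deployment of LLMs, and which future directions does the survey identify?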
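Key Findings
- LLMs exhibit emergent abilities (in-context learning, instruction following, and step-by-step reasoning) that are not present in smaller models.
- Two representative scaling laws (the KM law and the Chinchilla law) describe how model size, data, and compute affect performance and how a compute budget is best allocated.
- Training and deploying LLMs relies on distributed training frameworks, optimization tricks, and alignment techniques, including reinforcement learning from human feedback (RLHF) and instruction tuning.
- External tools and plugins extend LLM capabilities beyond text generation, addressing limitations such as out-of-date information and numerical inaccuracy.
- The survey provides a curated GitHub resource collection with supporting materials, and discusses ongoing challenges such as data quality, alignment with human values, and interpretability.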
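In-context learning, the first emergent ability listed above, means the model is given a task description and a few demonstrations in the prompt and completes a new query without any parameter update. The sketch below only shows how such a few-shot prompt might be assembled. The function name `build_few_shot_prompt` and the `Input:` / `Output:` template are assumptions made for illustration, not a format prescribed by the survey or by any particular model.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demos: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a few-shot prompt: instruction, demonstrations, then the query.

    The "Input:" / "Output:" labels are an illustrative convention only.
    """
    lines = [instruction, ""]
    for source, target in demos:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")  # blank line between demonstrations
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Classify the sentiment of each review as positive or negative.",
        [("Great battery life.", "positive"), ("Stopped working in a week.", "negative")],
        "The screen is sharp and bright.",
    )
    print(prompt)
```

The resulting string would be sent to a language model as-is; the demonstrations act as the only task supervision, which is what distinguishes this usage from fine-tuning.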
This summary was generated by AI and reviewed by a human editor.