QUICK REVIEW

[Paper Digest] Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey

Zeyu Han, Chao Gao | arXiv (Cornell University) | Mar 21, 2024

Parallel Computing and Optimization Techniques | Cited by 90

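One-Sentence Summary

This paper surveys parameter-efficient fine-tuning (PEFT) methods for large models, categorizes the algorithms, analyzes their performance and system costs, and outlines practical applications across modalities.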

ABSTRACT

Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often consisting of billions of parameters, require vast amounts of computational resources for execution. Especially, the expansive scale and computational demands pose considerable challenges when customizing them for particular downstream tasks, particularly over the hardware platforms constrained by computational capabilities. Parameter Efficient Fine-Tuning (PEFT) provides a practical solution by efficiently adjusting the large models over the various downstream tasks. In particular, PEFT refers to the process of adjusting the parameters of a pre-trained large model to adapt it to a specific task or domain while minimizing the number of additional parameters introduced or computational resources required. This approach is particularly important when dealing with large-scale language models with high parameter counts, as fine-tuning these models from scratch can be computationally expensive and resource-intensive, posing considerable challenges in the supporting system platform design. In this survey, we present comprehensive studies of various PEFT algorithms, examining their performance and computational overhead. Moreover, we provide an overview of applications developed using different PEFT algorithms and discuss common techniques employed to mitigate computation costs for PEFT. In addition to providing an extensive survey from an algorithmic standpoint, we also examine various real-world system designs to investigate the implementation costs associated with different PEFT approaches. This survey serves as a valuable resource for researchers aiming to understand both the PEFT algorithm and its system implementation, offering detailed ......

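Research Motivation and Objectives

Motivate the need to adapt large pre-trained models to downstream tasks efficiently.
Systematically categorize PEFT algorithms and their core mechanisms.
Assess the computational overhead of PEFT methods and their practical system-level impact.
Highlight applications of PEFT in natural language processing, vision, and multimodal models, and discuss deployment considerations.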

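Proposed Methods

Divide PEFT methods into additive, selective, reparameterized, and hybrid categories.
Detail representative algorithms within each category (e.g., adapters, soft prompts, pruning, LoRA and its variants).
Describe how these methods modify or reuse model parameters to achieve efficiency.
Analyze techniques for reducing computational cost (KV-cache management, pruning, quantization, memory optimization).
Discuss architectural and system-level considerations for deploying PEFT (distributed fine-tuning, query serving, concurrent fine-tuning).
Review applications in large language models, Vision Transformers, vision-language models, and diffusion models.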
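
To make the reparameterized category concrete, here is a minimal NumPy sketch of a LoRA-style low-rank update. It is an illustration under stated assumptions, not the survey's reference implementation: the layer sizes, the rank `r`, the scaling `alpha`, and the helper names `lora_forward` and `merge` are all hypothetical, and the "trained" value of `B` is random stand-in data rather than the result of real training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 64 -> 32 linear layer adapted with rank 4.
d_in, d_out, r = 64, 32, 4
alpha = 8.0  # hypothetical scaling hyperparameter

W0 = rng.standard_normal((d_out, d_in))    # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-initialised


def lora_forward(x, W0, A, B, alpha, r):
    """Frozen path plus scaled low-rank path: W0 x + (alpha / r) * B (A x)."""
    return W0 @ x + (alpha / r) * (B @ (A @ x))


def merge(W0, A, B, alpha, r):
    """Fold the low-rank update into the base weight for inference."""
    return W0 + (alpha / r) * (B @ A)


x = rng.standard_normal(d_in)

# With B at zero, the adapted layer reproduces the frozen layer exactly.
y_init = lora_forward(x, W0, A, B, alpha, r)

# Stand-in for a trained B (random values, for illustration only).
B_trained = rng.standard_normal((d_out, r)) * 0.1
y_adapted = lora_forward(x, W0, A, B_trained, alpha, r)

# After merging, a single matrix multiply gives the same output.
W_merged = merge(W0, A, B_trained, alpha, r)

trainable = A.size + B.size  # parameters updated during fine-tuning
full = W0.size               # parameters a full fine-tune would update
print(f"trainable: {trainable} of {full} ({trainable / full:.1%})")
```

Only `A` and `B` would receive gradients in this setup, and because the update can be merged into `W0`, the adapted layer costs no more at inference time than the original one.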

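Experimental Results

Research Questions

RQ1: What are the main families of PEFT algorithms, and what mechanisms characterize each?
RQ2: How do additive, selective, reparameterized, and hybrid PEFT methods compare in parameter efficiency and performance?
RQ3: What are the practical system costs and deployment considerations of PEFT across model families and tasks?
RQ4: In which key application domains and model architectures is PEFT most impactful?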

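Key Findings

PEFT methods fall into four broad categories (additive, selective, reparameterized, and hybrid), each with its own design trade-offs.
Adapters, soft prompts, and other additive techniques offer parameter-efficient alternatives to full fine-tuning, with differing efficiency and accuracy profiles.
Selective PEFT fine-tunes a subset of the existing parameters, chosen by masks or structured pruning, which improves hardware efficiency and scalability.
Reparameterized PEFT, especially LoRA and its variants, achieves strong efficiency by learning low-rank updates while preserving fast inference, since the learned update can be merged into the base weights.
Hybrid PEFT combines ideas from multiple families to balance performance and efficiency.
The survey also covers system-level considerations, including distributed tuning, PEFT query serving, and concurrent tuning, with emphasis on practical deployment costs and constraints.
Applications span large language models, Vision Transformers, vision-language models, and diffusion models, demonstrating the versatility of PEFT across modalities.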
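
The selective family can be sketched in a few lines: a binary mask decides which existing weights receive an update, and everything else stays frozen. The snippet below is a minimal illustration, assuming a hypothetical selection rule (keep the 10% of entries with the largest gradient magnitude) and random stand-in data for the weights and gradient. The methods covered by the survey use a variety of selection criteria, so this rule should not be read as any particular paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.standard_normal((8, 8))      # stand-in for a pre-trained weight matrix
grad = rng.standard_normal(W.shape)  # stand-in for a downstream-task gradient

# Hypothetical selection rule: keep the top 10% of entries by |gradient|.
k = max(1, int(0.10 * W.size))
threshold = np.sort(np.abs(grad), axis=None)[-k]
mask = np.abs(grad) >= threshold     # True where a weight is trainable

# Masked gradient step: frozen entries receive a zero update.
lr = 0.1
W_new = W - lr * (mask * grad)

print(f"updated {int(mask.sum())} of {W.size} weights")
```

An unstructured mask like this one saves optimizer state but leaves the weight matrix dense; structured variants select whole rows, columns, or layers instead, which is what makes them friendlier to hardware.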

This digest was generated by AI and reviewed by human editors.