QUICK REVIEW

[Paper Digest] Kronfluence: Influence Functions with Eigenvalue-corrected Kronecker-Factored Approximate Curvature

Grosse, Roger, Bae, Juhan|arXiv (Cornell University)|Aug 7, 2023

Topic Modeling | Cited by 24

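One-Sentence Summary

This paper scales influence functions to large language models (up to 52B parameters), using EK-FAC to compute the inverse-Hessian-vector product (IHVP). It validates accuracy against traditional estimators and analyzes LLM generalization patterns and the distribution of influence.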

ABSTRACT

When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the model's parameters (and hence its outputs) change if a given sequence were added to the training set? While influence functions have produced insights for small models, they are difficult to scale to large language models (LLMs) due to the difficulty of computing an inverse-Hessian-vector product (IHVP). We use the Eigenvalue-corrected Kronecker-Factored Approximate Curvature (EK-FAC) approximation to scale influence functions up to LLMs with up to 52 billion parameters. In our experiments, EK-FAC achieves similar accuracy to traditional influence function estimators despite the IHVP computation being orders of magnitude faster. We investigate two algorithmic techniques to reduce the cost of computing gradients of candidate training sequences: TF-IDF filtering and query batching. We use influence functions to investigate the generalization patterns of LLMs, including the sparsity of the influence patterns, increasing abstraction with scale, math and programming abilities, cross-lingual generalization, and role-playing behavior. Despite many apparently sophisticated forms of generalization, we identify a surprising limitation: influences decay to near-zero when the order of key phrases is flipped. Overall, influence functions give us a powerful new tool for studying the generalization properties of LLMs.
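For readers new to the topic, the counterfactual described in the abstract can be written in the classical influence-function form. The block below is a sketch in standard notation, not copied from the paper: the symbols $z_i$ (training examples), $z_m$ (the candidate sequence), $\epsilon$ (its added weight), $f$ (a measurement on the query, such as its log-likelihood), and $H$ (the Hessian of the average training loss) are assumptions introduced here for illustration. In practice $H$ is replaced by a damped curvature approximation, which is where the IHVP cost arises.

```latex
\begin{align}
\theta^\star(\epsilon) &= \arg\min_{\theta}\; \frac{1}{N}\sum_{i=1}^{N} \mathcal{L}(z_i, \theta) + \epsilon\,\mathcal{L}(z_m, \theta) \\
\mathcal{I}_{\theta^\star}(z_m) &= \left.\frac{d\theta^\star}{d\epsilon}\right|_{\epsilon=0} = -H^{-1}\,\nabla_\theta \mathcal{L}(z_m, \theta^\star) \\
\mathcal{I}_f(z_m) &= \nabla_\theta f(\theta^\star)^\top \mathcal{I}_{\theta^\star}(z_m) = -\nabla_\theta f(\theta^\star)^\top H^{-1}\,\nabla_\theta \mathcal{L}(z_m, \theta^\star)
\end{align}
```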

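Motivation and Objectives

Study how individual training sequences affect large language models, using influence functions.
Scale influence function computation to Transformer-based LLMs with up to 52B parameters, using EK-FAC.
Validate EK-FAC as a fast, accurate alternative to traditional IHVP methods.
Analyze the generalization patterns of LLMs, including sparsity, abstraction, memorization, and cross-lingual behavior.
Examine the role of word ordering and the mechanisms behind role-playing behavior in large models.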

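Proposed Method

Use Eigenvalue-corrected Kronecker-Factored Approximate Curvature (EK-FAC) to approximate the Hessian used in the IHVP computation.
Adopt the Proximal Bregman Response Function (PBRF) formulation to handle models that are not trained to convergence or are overparameterized.
Introduce query batching to share training-gradient computation across multiple influence queries.
Use TF-IDF filtering to reduce the cost of computing gradients for candidate training sequences.
Adapt K-FAC to the Transformer MLP layers to enable efficient G^{-1}-vector products.
Provide layer-wise and token-wise attribution analyses to localize influence within the network.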
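The EK-FAC step in the list above can be illustrated for a single linear layer. This is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the function names (`fit_ekfac`, `ekfac_ihvp`, `influence_score`), the damping value, and the array layout are hypothetical, and the inputs stand in for per-example layer inputs and pseudo-gradients with respect to the layer's pre-activations.

```python
import numpy as np


def fit_ekfac(acts, grads):
    """Fit EK-FAC factors for one linear layer (illustrative sketch).

    acts:  (N, d_in)  layer inputs, one row per example
    grads: (N, d_out) gradients w.r.t. the layer's pre-activations
    The per-example weight gradient is the outer product grads[n] acts[n]^T.
    """
    n = acts.shape[0]
    A = acts.T @ acts / n      # input covariance factor, (d_in, d_in)
    S = grads.T @ grads / n    # output-gradient covariance factor, (d_out, d_out)
    _, Q_A = np.linalg.eigh(A)
    _, Q_S = np.linalg.eigh(S)
    # Eigenvalue correction: second moments of the per-example gradients
    # expressed in the Kronecker eigenbasis, instead of products of the
    # factor eigenvalues as in plain K-FAC.
    proj_a = acts @ Q_A        # row n holds Q_A^T a_n
    proj_g = grads @ Q_S       # row n holds Q_S^T g_n
    lam = (proj_g ** 2).T @ (proj_a ** 2) / n   # (d_out, d_in)
    return Q_A, Q_S, lam


def ekfac_ihvp(V, Q_A, Q_S, lam, damping=1e-3):
    """Apply the damped inverse (G + damping * I)^{-1} to a (d_out, d_in) matrix V."""
    V_proj = Q_S.T @ V @ Q_A                  # rotate into the eigenbasis
    return Q_S @ (V_proj / (lam + damping)) @ Q_A.T


def influence_score(query_grad, train_grad, Q_A, Q_S, lam, damping=1e-3):
    """Layer-wise influence under the convention -grad_f^T (G + damping*I)^{-1} grad_L.

    The sign convention is an assumption; flip it if your loss and
    measurement are defined with the opposite sign.
    """
    return -float(np.sum(ekfac_ihvp(query_grad, Q_A, Q_S, lam, damping) * train_grad))
```

Because the preconditioned query gradient does not depend on the training sequence, it can be computed once per query and reused against every candidate training gradient, which is the same sharing that query batching relies on.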

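Experimental Results

Research Questions

RQ1: How well does EK-FAC approximate the inverse-Hessian-vector product compared with traditional methods?
RQ2: What are the distributional properties of influence scores in large language models (e.g., sparsity, tail behavior)?
RQ3: How do generalization patterns evolve as model scale increases (e.g., abstraction, memorization, cross-lingual generalization, role-playing)?
RQ4: Which layers and positions in the network account for a sequence's large influence, and how does this relate to token-level attribution?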
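RQ5: How much do word ordering, and the distinction between imitation and planning, explain which sequences are influential for complex behaviors?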

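Key Findings

EK-FAC is competitive in influence estimation accuracy, while its IHVP computation is orders of magnitude faster than traditional methods.
The influence distribution is heavy-tailed and spread across a large number of sequences, rather than concentrated in a few memorized examples.
Larger models generalize at a more abstract level, exhibiting advanced capabilities such as programming, mathematical reasoning, and cross-lingual generalization.
Influence is spread roughly evenly across layers; middle layers capture more abstract patterns, while upper and lower layers are more closely tied to tokens.
Word order has a decisive effect on influence: a training sequence is influential only when the relevant phrases appear in a specific order (prompt before completion).
Role-playing behavior appears to be largely imitative, driven by examples of similar behavior in the training data.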

This digest was generated by AI and reviewed by a human editor.