QUICK REVIEW

[Paper Review] A Survey of Lottery Ticket Hypothesis

Bohan Liu, Zijie Zhang|arXiv (Cornell University)|Mar 7, 2024

Gambling Behavior and Treatments | Cited by 5

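One-Sentence Summary

This paper surveys the Lottery Ticket Hypothesis (LTH), covering theory, model-specific variants, algorithms, experiments, efficiency, connections to other topics, and applications, and it discusses open problems and future directions.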

ABSTRACT

The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a highly sparse subnetwork (i.e., winning tickets) that can achieve even better performance than the original model when trained in isolation. While LTH has been proved both empirically and theoretically in many works, there still are some open issues, such as efficiency and scalability, to be addressed. Also, the lack of open-source frameworks and consensual experimental setting poses a challenge to future research on LTH. We, for the first time, examine previous research and studies on LTH from different perspectives. We also discuss issues in existing works and list potential directions for further exploration. This survey aims to provide an in-depth look at the state of LTH and develop a duly maintained platform to conduct experiments and compare with the most updated baselines.
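
The hypothesis stated in the abstract can be written compactly. The block below is a sketch that follows the commonly cited original formulation of LTH; the symbols ($f$, $\theta_0$, $m$, $j$, $a$) are notational assumptions made here and are not necessarily the notation used in the survey.

```latex
% Dense network: training f(x; \theta_0) for j iterations reaches test accuracy a.
% LTH: there is a binary mask m such that the masked network, trained from the
% same initialization \theta_0, needs j' iterations to reach accuracy a', with
\[
\exists\, m \in \{0,1\}^{|\theta|} :\quad
j' \le j, \qquad a' \ge a, \qquad \lVert m \rVert_0 \ll |\theta| ,
\]
% where the sparse subnetwork is f(x;\, m \odot \theta_0) and \odot is the
% element-wise product. The pair (m, \theta_0) is called a winning ticket.
```

The three conditions say that the subnetwork trains at least as fast, performs at least as well, and uses far fewer parameters than the dense model.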

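Motivation and Objectives

Explain the Lottery Ticket Hypothesis and its theoretical foundations.
Categorize and synthesize LTH research across model types and tasks.
Identify the practical challenges, efficiency issues, and scalability issues of LTH.
Review the algorithms used to find winning tickets, along with initialization and pruning strategies.
Propose directions and benchmarks that advance the reproducibility and comparability of LTH research.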

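Proposed Method

Provides a formal notation framework covering pretraining, pruning, and reinitialization.
Summarizes and compares iterative magnitude-based pruning and one-shot pruning methods.
Surveys theoretical results that prove the existence of winning tickets and strong lottery tickets.
Organizes LTH variants for special models such as GNNs, Transformers, and generative models.
Reviews experimental insights on key factors such as pruning ratio, layer-wise versus global pruning, and the roles of zeros, signs, and supermasks.
Presents a taxonomy of the LTH literature and summarizes public code repositories and benchmark datasets.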
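
The difference between iterative magnitude-based pruning and one-shot pruning mentioned in the list above can be illustrated with a small NumPy sketch. It is not code from the survey: the toy least-squares `train` routine, the function names, and all parameters are assumptions chosen for illustration, and a real experiment would train a neural network instead.

```python
import numpy as np

def train(theta0, mask, X, y, steps=200, lr=0.1):
    """Toy stand-in for network training: masked gradient descent on least squares."""
    theta = theta0 * mask
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / len(y)
        theta = (theta - lr * grad) * mask  # pruned weights stay at zero
    return theta

def magnitude_mask(weights, mask, prune_fraction):
    """Remove `prune_fraction` of the surviving weights with the smallest |w|."""
    alive = np.flatnonzero(mask)
    n_prune = int(round(prune_fraction * alive.size))
    order = alive[np.argsort(np.abs(weights.ravel()[alive]), kind="stable")]
    new_mask = mask.copy().ravel()
    new_mask[order[:n_prune]] = 0
    return new_mask.reshape(mask.shape)

def iterative_magnitude_pruning(theta0, X, y, rounds, prune_fraction):
    """Train, prune a little, reset survivors to theta0, and repeat."""
    mask = np.ones_like(theta0)
    for _ in range(rounds):
        trained = train(theta0, mask, X, y)  # every round restarts from theta0
        mask = magnitude_mask(trained, mask, prune_fraction)
    return mask

def one_shot_pruning(theta0, X, y, target_sparsity):
    """Train once, then prune to the target sparsity in a single step."""
    dense = np.ones_like(theta0)
    trained = train(theta0, dense, X, y)
    return magnitude_mask(trained, dense, target_sparsity)
```

Both routines return a binary mask; the candidate ticket is then `theta0 * mask`, retrained in isolation. The iterative variant costs one full training run per round, which is the efficiency concern the survey raises.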

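Experimental Results

Research Questions

RQ1: What is the current evidence for the existence of winning tickets across architectures and tasks?
RQ2: How do initialization, pruning strategy, and retraining affect the success of lottery tickets in practice?
RQ3: How can LTH be extended to special models (GNNs, Transformers, GANs/VAEs) and to new domains?
RQ4: What are the main efficiency and scalability bottlenecks in finding and using lottery tickets, and how can they be mitigated?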

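Key Findings

Winning tickets can exist within dense networks and, when retrained from their initialization, can match or even outperform the original model.
Under certain formulations, strong lottery tickets exist: subnetworks that perform well without any training.
Special models (GNNs, Transformers, generative models) require tailored LTH formulations and show differing transferability and efficiency gains.
Initialization and pruning strategies, including weight rewinding and learning rate rewinding, significantly affect ticket quality.
There is a recognized need for standardized experimental settings, reproducibility benchmarks, and open-source platforms for fair comparison.
Pruning efficiency and hardware-aware structured pruning remain active areas for bridging theory and practical deployment.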
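
The two rewinding strategies named in the findings differ only in which state is restored before retraining. The sketch below shows that difference under stated assumptions: the function names, the toy least-squares training loop, and the per-step `lr_schedule` list are all hypothetical and are not taken from the survey.

```python
import numpy as np

def train_with_checkpoints(theta0, X, y, lr_schedule):
    """Toy gradient descent on least squares that keeps the weights after every step."""
    history = [theta0.copy()]
    theta = theta0.copy()
    for lr in lr_schedule:
        grad = X.T @ (X @ theta - y) / len(y)
        theta = theta - lr * grad
        history.append(theta.copy())
    return history

def retraining_plan(history, lr_schedule, mask, rewind_step, mode):
    """Return (starting weights, remaining learning-rate schedule) for retraining.

    mode="weight": restore the weights saved at `rewind_step`
                   (rewind_step=0 is the classic reset to initialization).
    mode="lr":     keep the final trained weights and only replay the
                   learning-rate schedule from `rewind_step` onward.
    """
    if mode == "weight":
        start = history[rewind_step] * mask
    elif mode == "lr":
        start = history[-1] * mask
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return start, lr_schedule[rewind_step:]
```

In both modes the pruned subnetwork is retrained with the same tail of the schedule; only the starting weights differ, which is why the choice can change ticket quality.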

This review was generated by AI and reviewed by human editors.