QUICK REVIEW

[Paper Review] Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks

Haoran You, Chaojian Li|arXiv (Cornell University)|Apr 30, 2020
Advanced Neural Network Applications | References: 32 | Cited by: 78
One-Sentence Summary

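This paper proposes early-bird (EB) tickets: critical subnetworks that can be identified at the very early stage of training using low-cost schemes such as early stopping and low-precision training at large learning rates. With a new mask distance metric, EB tickets can be detected efficiently without completing full training, yielding up to 4.7x energy savings while maintaining or even improving model accuracy.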

ABSTRACT

(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical subnetworks) for dense, randomly initialized networks, that can be trained alone to achieve comparable accuracies to the latter in a similar number of iterations. However, the identification of these winning tickets still requires the costly train-prune-retrain process, limiting their practical benefits. In this paper, we discover for the first time that the winning tickets can be identified at the very early training stage, which we term as early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates. Our finding of EB tickets is consistent with recently reported observations that the key connectivity patterns of neural networks emerge early. Furthermore, we propose a mask distance metric that can be used to identify EB tickets with low computational overhead, without needing to know the true winning tickets that emerge after the full training. Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods, which are achieved by first identifying EB tickets via low-cost schemes, and then continuing to train merely the EB tickets towards the target accuracy. Experiments based on various deep networks and datasets validate: 1) the existence of EB tickets, and the effectiveness of mask distance in efficiently identifying them; and 2) that the proposed efficient training via EB tickets can achieve up to 4.7x energy savings while maintaining comparable or even better accuracy, demonstrating a promising and easily adopted method for tackling cost-prohibitive deep network training.

Motivation and Objectives

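  • Address the high computational cost of identifying winning tickets through the standard train-prune-retrain process.
  • Investigate whether critical subnetworks (winning tickets) can be found earlier, before the model has fully converged.
  • Develop a low-overhead method for identifying these early subnetworks without knowing the final winning tickets in advance.
  • Enable efficient training by focusing only on the identified early-bird tickets, substantially reducing energy and computational cost.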

Proposed Method

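  • Introduces early-bird (EB) tickets: subnetworks that reach high accuracy when trained in isolation and that can be identified at a very early stage of training.
  • Uses low-cost training schemes (such as early stopping and low-precision training at large learning rates) to identify EB tickets efficiently.
  • Introduces a mask distance metric that compares the subnetwork structures drawn at different points in training, so that EB tickets can be detected without access to the final winning tickets.
  • Applies the mask distance metric to pick out the most promising subnetwork early, then continues training only that subnetwork toward the target accuracy.
  • Uses the identified EB tickets for efficient training, without retraining the full network.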
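
The detection step described above can be sketched in a few lines of Python. This summary does not say how the masks are drawn or how the distance is computed, so the sketch makes several assumptions: masks come from magnitude-based pruning of per-unit importance scores, the mask distance is a normalized Hamming distance, and a ticket is declared once the newest mask stays within a threshold of every mask in a short FIFO window. The names `prune_mask`, `mask_distance`, `EarlyBirdDetector`, and the parameters `prune_ratio`, `epsilon`, and `window` are hypothetical and chosen for illustration only.

```python
from collections import deque


def prune_mask(scores, prune_ratio):
    """Return a binary mask (1 = keep, 0 = prune) that drops the
    `prune_ratio` fraction of entries with the smallest magnitude.
    Assumption: importance is measured by absolute value."""
    n_prune = int(len(scores) * prune_ratio)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(scores))]


def mask_distance(mask_a, mask_b):
    """Normalized Hamming distance between two equal-length binary masks.
    Assumption: this stands in for the paper's mask distance metric."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    differing = sum(a != b for a, b in zip(mask_a, mask_b))
    return differing / len(mask_a)


class EarlyBirdDetector:
    """Flags a candidate EB ticket once the newest mask is within `epsilon`
    of each of the previous `window` masks (hypothetical stopping rule)."""

    def __init__(self, prune_ratio=0.5, epsilon=0.1, window=5):
        self.prune_ratio = prune_ratio
        self.epsilon = epsilon
        self.window = window
        self.history = deque(maxlen=window)  # FIFO queue of recent masks

    def update(self, scores):
        """Call once per epoch with the current importance scores.
        Returns the mask if it has stabilized, otherwise None."""
        mask = prune_mask(scores, self.prune_ratio)
        stable = len(self.history) == self.window and all(
            mask_distance(mask, old) <= self.epsilon for old in self.history
        )
        self.history.append(mask)
        return mask if stable else None


if __name__ == "__main__":
    # Toy scores for four "epochs"; the pruning pattern settles after the first.
    detector = EarlyBirdDetector(prune_ratio=0.5, epsilon=0.1, window=2)
    toy_epochs = [
        [0.9, 0.1, 0.5, 0.2],
        [0.2, 0.8, 0.1, 0.7],
        [0.3, 0.9, 0.2, 0.8],
        [0.35, 0.95, 0.25, 0.85],
    ]
    for epoch, scores in enumerate(toy_epochs):
        print(epoch, detector.update(scores))
```

In the toy run, `update` returns `None` for the first three epochs and the stabilized mask on the fourth. Once a mask is returned, the dense network would be pruned with it and only the resulting subnetwork trained further. The key property is that the check compares masks only against earlier masks from the same run, so it never needs the final winning ticket.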

Experimental Results

Research Questions

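  • RQ1: Can winning tickets be identified at a very early stage of training, before full convergence?
  • RQ2: Can low-cost training schemes, such as early stopping and low-precision training at large learning rates, enable early detection of critical subnetworks?
  • RQ3: Can the mask distance metric reliably identify EB tickets without relying on information about the final winning tickets?
  • RQ4: Can training only the identified EB tickets achieve comparable or better accuracy with substantially lower energy and computational cost?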

Key Findings

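  • Early-bird tickets do exist and can be identified through low-cost training schemes such as early stopping and low-precision training at large learning rates.
  • The proposed mask distance metric identifies EB tickets accurately at very low computational overhead, without relying on information about the final winning tickets.
  • Compared with standard training, the method achieves up to 4.7x energy savings while maintaining or improving model accuracy.
  • Experiments across a variety of deep networks and datasets validate the effectiveness of EB tickets and the mask distance metric on different architectures and tasks.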
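
To make the headline figure concrete, the following is a rough reading of what a 4.7x saving implies. It assumes the factor is the ratio of baseline training energy to EB-training energy, which this summary does not define. The symbols $E_{\text{base}}$ and $E_{\text{EB}}$ are introduced here for illustration only.

```latex
\[
\frac{E_{\text{base}}}{E_{\text{EB}}} = 4.7
\quad\Longrightarrow\quad
E_{\text{EB}} = \frac{E_{\text{base}}}{4.7} \approx 0.213\,E_{\text{base}},
\qquad
1 - \frac{1}{4.7} \approx 78.7\%\ \text{less energy.}
\]
```

Because the figure is reported as an upper bound ("up to"), savings in other settings would be smaller under this reading.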


This review was generated by AI and reviewed by human editors.