QUICK REVIEW

[Paper Review] Anytime Neural Network: a Versatile Trade-off Between Computation and Accuracy

Hanzhang Hu, Debadeepta Dey|arXiv (Cornell University)|Feb 12, 2018

Advanced Neural Network Applications | Cited by 6

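One-Sentence Summary

This paper proposes Anytime Neural Networks (ANNs), which add auxiliary predictions to deep neural networks so that the output is refined continuously under varying computational budgets. By jointly optimizing these auxiliary heads during training with oscillating loss weights, ANNs achieve state-of-the-art anytime performance with almost no additional computation, preserving final accuracy while allowing early exits at any budget level.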

ABSTRACT

Anytime predictors first produce crude results quickly, and then continuously refine them until the test-time computational budget is depleted. Such predictors are used in real-time vision systems and streaming-data processing to efficiently utilize varying test-time budgets, and to reduce average prediction cost via early-exits. However, anytime prediction algorithms have difficulties utilizing the accurate predictions of deep neural networks (DNNs), because DNNs are often computationally expensive without competitive intermediate results. In this work, we propose to add auxiliary predictions in DNNs to generate anytime predictions, and optimize these predictions simultaneously by minimizing a carefully constructed weighted sum of losses, where the weights also oscillate during training. The proposed anytime neural networks (ANNs) produce reasonable anytime predictions without sacrificing the final performance or incurring noticeable extra computation. This enables us to assemble a sequence of exponentially deepening ANNs, and it achieves, both theoretically and practically, near-optimal anytime predictions at every budget after spending a constant fraction of extra cost. The proposed methods are shown to produce anytime predictions at the state-of-the-art level on visual recognition data-sets, including ILSVRC2012.

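Motivation and Objectives

To address the challenge of integrating anytime prediction into deep neural networks, which typically lack accurate intermediate outputs.
To let real-time systems produce progressively more accurate predictions as the available computation changes dynamically.
To preserve the final model's accuracy while adding auxiliary heads for early exits, without significant computational overhead.
To achieve near-optimal anytime performance at all budget levels through a novel training strategy that uses oscillating loss weights.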

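Proposed Method

Integrate multiple auxiliary classification heads at different depths of a deep neural network to produce intermediate predictions.
Optimize all heads simultaneously by minimizing a weighted sum of losses, where the weights oscillate during training to balance early and final predictions.
Use a dynamic loss-weighting scheme that alternates emphasis between the early heads and the final head to improve both intermediate and final accuracy.
Assemble a sequence of exponentially deepening ANNs, where each subsequent network builds on the previous one, to extend anytime prediction capability.
Train the network end to end with a composite loss that balances accuracy across all stages and ensures the final performance does not degrade.
Exploit the structure of deep networks to allow early exit at any stage with minimal extra computation.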
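
The weighted sum of losses with oscillating weights can be illustrated with a minimal sketch. The schedule below is an assumption made for illustration, not the schedule from the paper: every head gets a uniform base weight, and one head per training step, chosen round-robin, receives a temporary boost. The names `oscillating_weights`, `anytime_loss`, and `boost` are hypothetical.

```python
def oscillating_weights(num_heads, step, boost=0.5):
    """Return one loss weight per auxiliary head for the given training step.

    Assumed schedule (illustrative only): uniform base weights, plus a
    temporary boost on a single head that cycles round-robin with the step.
    """
    base = 1.0 / num_heads
    focus = step % num_heads  # head emphasized at this step
    return [base + (boost if i == focus else 0.0) for i in range(num_heads)]


def anytime_loss(head_losses, step, boost=0.5):
    """Combine per-head losses into a single training objective.

    head_losses: list of scalar losses, ordered from the shallowest head
    to the final head.
    """
    weights = oscillating_weights(len(head_losses), step, boost)
    return sum(w * loss for w, loss in zip(weights, head_losses))


# Example: with four heads, the emphasized head moves as training proceeds.
for step in range(4):
    print(step, oscillating_weights(4, step))
```

In a real training loop the scalar losses would come from each head's classification loss, and the combined value would be back-propagated through the shared backbone. The sketch only shows how the emphasis moves between heads over time.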

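Experimental Results

Research Questions

RQ1: Can auxiliary heads be integrated into deep neural networks to produce anytime predictions without reducing final accuracy?
RQ2: How should the training strategy be designed to optimize intermediate and final predictions simultaneously and thereby support early exits?
RQ3: What is the computational cost of anytime prediction in a DNN, and can it be kept very low?
RQ4: Does the proposed method achieve near-optimal anytime performance at all budget levels?
RQ5: How does the proposed method compare with existing anytime prediction methods on standard vision benchmarks?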

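Key Findings

The proposed ANNs achieve state-of-the-art anytime prediction performance on ILSVRC2012 and other visual recognition datasets.
Compared with a standard DNN, the method adds only a constant fraction of extra computation, supporting efficient anytime inference.
Oscillating loss weights during training significantly improve the quality of intermediate predictions without harming final accuracy.
The framework supports a sequence of exponentially deepening networks, enabling near-optimal anytime performance at every budget level.
The method allows high-confidence early exits at any stage, significantly reducing average prediction cost in streaming and real-time systems.
Empirical results confirm that the auxiliary heads give reasonable predictions at all stages, even under limited computational resources.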
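
Two of the findings above lend themselves to a small sketch: early exit under a test-time budget, and the bounded cost of a sequence of networks whose costs grow exponentially. The code assumes each stage has a known cost and a callable that returns that stage's prediction, and that each network in the sequence costs twice as much as the previous one. The names `anytime_predict`, `doubling_sequence_cost`, and the stage labels are hypothetical, and the paper's actual procedure may differ.

```python
def anytime_predict(stages, budget):
    """Run stages in order until the budget is spent.

    stages: list of (cost, predict) pairs, shallowest first, where
    predict() returns the prediction of the auxiliary head at that depth.
    Returns the latest available prediction and the cost actually spent.
    """
    spent = 0.0
    prediction = None
    for cost, predict in stages:
        if spent + cost > budget:
            break  # not enough budget left for the next stage
        spent += cost
        prediction = predict()
    return prediction, spent


def doubling_sequence_cost(num_networks, base_cost=1.0):
    """Total cost of running networks of cost base, 2*base, 4*base, ... in turn."""
    return sum(base_cost * 2 ** i for i in range(num_networks))


stages = [
    (1.0, lambda: "coarse"),
    (2.0, lambda: "medium"),
    (4.0, lambda: "fine"),
]
print(anytime_predict(stages, budget=3.5))  # ('medium', 3.0)
print(doubling_sequence_cost(4))            # 15.0
```

Under the doubling assumption, running all networks up to the largest costs less than twice the cost of the largest one alone (here 15 versus 8), because the costs form a geometric series. This is one way to read the claim that near-optimal predictions come at a constant fraction of extra cost.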

This review was generated by AI and reviewed by human editors.