QUICK REVIEW

[Paper Review] Training Compact Neural Networks via Auxiliary Overparameterization.

Yifan Liu, Bohan Zhuang|arXiv (Cornell University)|Sep 5, 2019

Machine Learning and Data Classification | Cited by 7

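One-Sentence Summary

The paper proposes an auxiliary overparameterization module that widens a compact neural network during training to improve optimization and generalization, while only the original compact network is kept at inference. By automatically searching for the hierarchical auxiliary structure, the method achieves gains comparable to those of a fully overparameterized model without adding any inference-time cost.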

ABSTRACT

It is observed that overparameterization (i.e., designing neural networks whose number of parameters is larger than statistically needed to fit the training data) can improve both optimization and generalization while compact networks are more difficult to be optimized. However, overparameterization leads to slower test-time inference speed and more power consumption. To tackle this problem, we propose a novel auxiliary module to simulate the effect of overparameterization. During training, we expand the compact network with the auxiliary module to formulate a wider network to assist optimization while during inference only the original compact network is kept. Moreover, we propose to automatically search the hierarchical auxiliary structure to avoid adding supervisions heuristically. In experiments, we explore several challenging resource constraint tasks including light-weight classification, semantic segmentation and multi-task learning with hard parameter sharing. We empirically find that the proposed auxiliary module can maintain the complexity of the compact network while significantly improving the performance.

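Research Motivation and Goals

Address the optimization difficulty of compact neural networks, which are efficient but hard to train.
Reproduce the benefits of overparameterization (better optimization and generalization) without increasing model size at inference.
Automate the design of the hierarchical auxiliary structure so that building the auxiliary module does not rely on heuristics.
Enable effective training of compact models in resource-constrained settings such as lightweight classification, semantic segmentation, and multi-task learning.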

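Proposed Method

Introduce an auxiliary module that expands the compact network during training into a wider architecture that is easier to optimize.
Train the expanded network with standard backpropagation, using the extra capacity of the auxiliary module to ease optimization.
Discard the auxiliary module at inference and run only the original compact network, preserving its efficiency.
Propose a differentiable search mechanism that automatically learns the hierarchical structure of the auxiliary module.
Design a search space that allows flexible, structured overparameterization while remaining computationally tractable.
Apply the method to several tasks with strict resource constraints, including multi-task learning with hard parameter sharing.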
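The train-wide, infer-compact idea in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `CompactNet`, `AuxiliaryBranch`, `training_loss`, and `aux_weight`, the layer sizes, and the squared-error loss are all assumptions made for illustration, and the optimizer step is omitted. The point is only that the training loss depends on the compact network's shared features through both heads, while prediction never touches the auxiliary branch.

```python
import random


def linear(weights, inputs):
    """Apply a bias-free dense layer given as a list of weight rows."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(values):
    return [max(0.0, v) for v in values]


def init(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class CompactNet:
    """Tiny two-layer network: the only part kept at inference time."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = init(n_hidden, n_in, rng)
        self.w2 = init(n_out, n_hidden, rng)

    def features(self, x):
        return relu(linear(self.w1, x))

    def predict(self, x):
        return linear(self.w2, self.features(x))

    def num_params(self):
        return sum(len(row) for row in self.w1 + self.w2)


class AuxiliaryBranch:
    """Hypothetical training-only branch that reads the compact features."""

    def __init__(self, n_hidden, n_aux, n_out, rng):
        self.a1 = init(n_aux, n_hidden, rng)
        self.a2 = init(n_out, n_aux, rng)

    def predict(self, feats):
        return linear(self.a2, relu(linear(self.a1, feats)))


def squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def training_loss(net, aux, x, target, aux_weight=1.0):
    """Main loss plus auxiliary loss; both depend on the shared features."""
    feats = net.features(x)
    main = squared_error(linear(net.w2, feats), target)
    extra = squared_error(aux.predict(feats), target)
    return main + aux_weight * extra


rng = random.Random(0)
net = CompactNet(n_in=3, n_hidden=4, n_out=2, rng=rng)
aux = AuxiliaryBranch(n_hidden=4, n_aux=8, n_out=2, rng=rng)
x, target = [1.0, -0.5, 0.25], [0.5, -0.5]

loss = training_loss(net, aux, x, target)  # what a training step would minimize
before = net.predict(x)
aux = AuxiliaryBranch(n_hidden=4, n_aux=16, n_out=2, rng=rng)  # swap the branch
after = net.predict(x)  # inference ignores the auxiliary branch entirely
```

Because `predict` uses only `w1` and `w2`, swapping or deleting the auxiliary branch leaves the deployed model's output and parameter count unchanged, which is the property the method relies on.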

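Experimental Results

Research Questions

RQ1: Can auxiliary overparameterization improve the training dynamics of compact neural networks without increasing inference cost?
RQ2: How effective is automatically searching the hierarchical auxiliary structure compared with hand-designed structures?
RQ3: To what extent can the method narrow the performance gap between compact and overparameterized models on resource-constrained tasks?
RQ4: Does the method generalize across diverse tasks such as image classification, semantic segmentation, and multi-task learning?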

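Key Findings

The proposed method significantly improves compact networks across multiple tasks, matching or exceeding the level of fully overparameterized models.
Auxiliary overparameterization yields better optimization and generalization even when the base network is small and hard to train.
Automatically searched auxiliary structures outperform heuristically designed ones, supporting the effectiveness of the search mechanism.
The method preserves the inference speed and memory footprint of the original compact model, making it suitable for deployment on edge devices.
Empirical results show consistent accuracy gains on lightweight classification, semantic segmentation, and multi-task learning with hard parameter sharing.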
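The searched auxiliary structures mentioned in the findings come from a differentiable search. This review does not give the exact formulation, so the sketch below assumes a common softmax relaxation: each candidate connection receives a learnable score, the candidate outputs are blended with softmax weights during the search, and the highest-scoring candidate is kept afterwards. The candidate set (`none`, `identity`, `scale`) and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(candidate_outputs, arch_logits):
    """Blend candidate outputs with softmax weights (continuous relaxation)."""
    weights = softmax(arch_logits)
    size = len(candidate_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, candidate_outputs))
        for i in range(size)
    ]


def select_candidate(arch_logits):
    """After the search, keep only the highest-scoring candidate."""
    return max(range(len(arch_logits)), key=lambda i: arch_logits[i])


# Hypothetical candidates for one auxiliary connection.
feature = [1.0, -2.0, 3.0]
candidates = [
    [0.0 for _ in feature],      # "none": no connection
    list(feature),               # "identity": reuse the feature as-is
    [2.0 * v for v in feature],  # "scale": stand-in for a learned transform
]

arch_logits = [0.0, 0.0, 0.0]  # equal scores at the start of the search
blended = mixed_output(candidates, arch_logits)
chosen = select_candidate([0.1, 2.0, -1.0])  # scores after a pretend search
```

Since the blend is a smooth function of the scores, the scores can be trained by gradient descent alongside the network weights, which is what makes a search of this kind differentiable.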

This review was generated by AI and reviewed by human editors.