QUICK REVIEW

[Paper Review] Automated Sizing and Training of Efficient Deep Autoencoders using Second Order Algorithms

Kanishka Tyagi, Chinmay Rane|arXiv (Cornell University)|Aug 11, 2023
Neural Networks and Applications | Cited by 8
One-sentence summary

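The paper presents a multi-step training framework that uses second-order methods to train generalized linear classifiers and deep autoencoders. The framework includes input pruning, improvement of the desired outputs, and scaling, with the goal of building fast, efficient models on desktop-level resources.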

ABSTRACT

We propose a multi-step training method for designing generalized linear classifiers. First, an initial multi-class linear classifier is found through regression. Then validation error is minimized by pruning of unnecessary inputs. Simultaneously, desired outputs are improved via a method similar to the Ho-Kashyap rule. Next, the output discriminants are scaled to be net functions of sigmoidal output units in a generalized linear classifier. We then develop a family of batch training algorithm for the multi layer perceptron that optimizes its hidden layer size and number of training epochs. Next, we combine pruning with a growing approach. Later, the input units are scaled to be the net function of the sigmoidal output units that are then feed into as input to the MLP. We then propose resulting improvements in each of the deep learning blocks thereby improving the overall performance of the deep architecture. We discuss the principles and formulation regarding learning algorithms for deep autoencoders. We investigate several problems in deep autoencoders networks including training issues, the theoretical, mathematical and experimental justification that the networks are linear, optimizing the number of hidden units in each layer and determining the depth of the deep learning model. A direct implication of the current work is the ability to construct fast deep learning models using desktop level computational resources. This, in our opinion, promotes our design philosophy of building small but powerful algorithms. Performance gains are demonstrated at each step. Using widely available datasets, the final network's ten fold testing error is shown to be less than that of several other linear, generalized linear classifiers, multi layer perceptron and deep learners reported in the literature.

Research motivation and objectives

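  • Design generalized linear classifiers and deep autoencoders that train efficiently on desktop-level resources.
  • Develop a multi-step training pipeline that combines regression-based initialization, pruning, output shaping, and scaling to improve discriminative ability.
  • Investigate training issues in deep autoencoders and give theoretical and experimental justification concerning linearity, hidden-unit optimization, and depth.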

Proposed method

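  • Find an initial multi-class linear classifier through regression.
  • Minimize validation error by pruning unnecessary inputs.
  • Improve the desired outputs with a method similar to the Ho-Kashyap rule.
  • Scale the output discriminants to be net functions of sigmoidal output units.
  • Develop batch training algorithms that optimize the hidden layer size and the number of training epochs.
  • Combine pruning with a growing approach, and scale the input units before feeding them to the MLP.
  • Discuss the principles of learning algorithms for deep autoencoders, addressing training issues and the theoretical and experimental justification regarding linearity, hidden-unit optimization, and depth.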
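
The first three steps in the list above can be illustrated with a minimal sketch. This is not the authors' implementation: the ridge-regularized least-squares solve, the greedy backward elimination, the target-adjustment rule, and all function names (`fit_linear`, `prune_inputs`, `adjust_targets`) are assumptions chosen for illustration, and the data is synthetic.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Encode integer class labels as 0/1 target rows."""
    targets = np.zeros((labels.size, num_classes))
    targets[np.arange(labels.size), labels] = 1.0
    return targets

def add_bias(X):
    """Append a constant column so the classifier has a threshold term."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_linear(X, T, ridge=1e-3):
    """Regression step: ridge-regularized least squares (assumed form)."""
    Xa = add_bias(X)
    A = Xa.T @ Xa + ridge * np.eye(Xa.shape[1])
    return np.linalg.solve(A, Xa.T @ T)

def error_rate(W, X, labels):
    """Fraction of samples whose largest discriminant is the wrong class."""
    return float(np.mean(np.argmax(add_bias(X) @ W, axis=1) != labels))

def prune_inputs(X_tr, y_tr, X_va, y_va, num_classes):
    """Greedy backward elimination (assumed strategy): drop one input at a
    time as long as the validation error does not increase."""
    keep = list(range(X_tr.shape[1]))
    T = one_hot(y_tr, num_classes)
    best = error_rate(fit_linear(X_tr[:, keep], T), X_va[:, keep], y_va)
    improved = True
    while improved and len(keep) > 1:
        improved = False
        for j in list(keep):
            trial = [k for k in keep if k != j]
            err = error_rate(fit_linear(X_tr[:, trial], T), X_va[:, trial], y_va)
            if err <= best:
                keep, best, improved = trial, err, True
                break
    return keep, best

def adjust_targets(Y, T, labels):
    """Ho-Kashyap-like idea (assumed rule): stop penalizing outputs that
    overshoot the target on the correct side of it."""
    is_correct = one_hot(labels, T.shape[1]).astype(bool)
    T_new = T.copy()
    over = is_correct & (Y > T)     # correct class already above its target
    under = ~is_correct & (Y < T)   # wrong class already below its target
    T_new[over] = Y[over]
    T_new[under] = Y[under]
    return T_new

# Synthetic demo: two informative inputs followed by three pure-noise inputs.
rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 2, size=n)
X = np.hstack([rng.normal(size=(n, 2)) + 2.0 * y[:, None],
               rng.normal(size=(n, 3))])
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]

T = one_hot(y_tr, 2)
W = fit_linear(X_tr, T)
baseline = error_rate(W, X_va, y_va)
keep, best = prune_inputs(X_tr, y_tr, X_va, y_va, 2)
T_adj = adjust_targets(add_bias(X_tr) @ W, T, y_tr)
```

Refitting with `fit_linear(X_tr, T_adj)` would complete one round of the output-improvement idea. Whether the paper alternates these steps in this order, or uses a different pruning criterion, is not stated in this summary.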

Experimental results

Research questions

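  • RQ1: How do pruning and input/output scaling improve the performance of generalized linear classifiers and deep autoencoders?
  • RQ2: Within a deep architecture, which second-order, batch-based training regimes are most effective for choosing the MLP's hidden units and number of training epochs?
  • RQ3: How do pruning, growing, and scaling affect the depth and the linearity assumption in deep autoencoder training?
  • RQ4: Can fast, desktop-friendly training produce performance that is competitive with, or better than, existing linear, generalized linear, MLP, and deep learning methods?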

Key findings

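  • The multi-step training method improves the performance of generalized linear classifiers and deep autoencoders.
  • Performance gains are observed at each step of the proposed training pipeline.
  • The final network's ten-fold testing error is lower than that of several other linear classifiers, generalized linear classifiers, MLPs, and deep learners reported in the literature.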
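
For readers unfamiliar with the metric, a ten-fold testing error can be computed roughly as in the sketch below. The helper `k_fold_error` and the nearest-centroid stand-in classifier are hypothetical. The paper's exact protocol, for example whether a validation split is carved out of each training fold, is not described in this summary.

```python
import numpy as np

def k_fold_error(X, y, fit, predict, k=10, seed=0):
    """Average misclassification rate over k disjoint held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        errors.append(np.mean(predict(model, X[test_idx]) != y[test_idx]))
    return float(np.mean(errors))

# Stand-in classifier (assumption): assign each sample to the nearest class mean.
def fit_centroids(X, y):
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict_centroids(model, X):
    classes, centroids = model
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]

# Synthetic two-class data with well-separated means.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 2)) + np.where(y[:, None] == 1, 3.0, -3.0)

ten_fold_err = k_fold_error(X, y, fit_centroids, predict_centroids, k=10)
```

Each sample is used for testing exactly once, so the reported number averages over ten models trained on different 90% subsets of the data.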


This review was generated by AI and reviewed by a human editor.