QUICK REVIEW

[Paper Review] Fast, simple and accurate handwritten digit classification using extreme learning machines with shaped input-weights.

Mark D. McDonnell, Migel D. Tissera|arXiv (Cornell University)|Dec 29, 2014
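Machine Learning and ELM | Cited by 9
Summary in Brief

This paper proposes a fast and accurate method for handwritten digit classification based on extreme learning machines (ELMs) with shaped input weights: each hidden unit sees only a randomly sized and positioned image patch, which makes the input weight matrix sparse. The method reaches error rates below 1% on MNIST and below 5.5% on NORB with training times of roughly 10 minutes, outperforms the standard ELM, and is competitive with deep networks on these simple benchmarks.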

ABSTRACT

Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (~10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random `receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

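Motivation and Goals

  • Reach close to state-of-the-art performance on the MNIST and NORB benchmarks with shallow, non-convolutional neural networks.
  • Reduce training time and improve accuracy relative to the standard ELM and conventional deep learning methods.
  • Examine how a structured, sparse input weight matrix affects generalization and efficiency in a single-hidden-layer network.
  • Evaluate whether combining ELM with a small amount of backpropagation reduces the number of hidden units required.
  • Show that shallow networks with well-designed input weights can be competitive with deep architectures on standard vision tasks.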

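Proposed Method

  • Introduces shaped input weights: each hidden unit operates only on a randomly sized and positioned patch of the input image, mimicking a receptive field.
  • This local sampling makes the input weight matrix sparse, with about 90% of the weights equal to zero.
  • Augments the training data with random distortions, which lowers error rates further.
  • Applies a small number of iterations of single-batch backpropagation on top of ELM training, which reduces the number of hidden units needed for a given level of performance.
  • Uses a single-hidden-layer feedforward network whose input weights are randomly initialized and then fixed; only the output weights are trained, via a least-squares solution.
  • Constrains the pattern of the input weights within the ELM framework to improve feature localization and generalization.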
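
The core of the method above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`receptive_field_masks`, `fit_elm`, `predict_elm`), the uniform patch-size distribution, the `tanh` activation, the Gaussian weight values, and the ridge constant are all choices made here for illustration. Each hidden unit receives a random rectangular mask, the masked random input weights stay fixed, and only the output weights are obtained by regularized least squares.

```python
import numpy as np

def receptive_field_masks(n_hidden, height, width, min_size=3, max_size=None, rng=None):
    """Return an (n_hidden, height*width) 0/1 mask: one random rectangle per hidden unit."""
    rng = np.random.default_rng() if rng is None else rng
    max_h = height if max_size is None else min(max_size, height)
    max_w = width if max_size is None else min(max_size, width)
    masks = np.zeros((n_hidden, height, width))
    for m in masks:  # each m is a view into masks, so writes land in place
        h = rng.integers(min_size, max_h + 1)    # patch height (assumed uniform)
        w = rng.integers(min_size, max_w + 1)    # patch width (assumed uniform)
        top = rng.integers(0, height - h + 1)    # patch position
        left = rng.integers(0, width - w + 1)
        m[top:top + h, left:left + w] = 1.0
    return masks.reshape(n_hidden, -1)

def fit_elm(X, Y, n_hidden=200, image_shape=(28, 28), ridge=1e-2, rng=None):
    """Fit a single-hidden-layer ELM with masked (sparse) random input weights."""
    rng = np.random.default_rng() if rng is None else rng
    mask = receptive_field_masks(n_hidden, *image_shape, rng=rng)
    W_in = rng.standard_normal(mask.shape) * mask    # random, fixed, sparse
    H = np.tanh(X @ W_in.T)                          # hidden activations (assumed tanh)
    # Ridge-regularized least squares: (H^T H + c I) W_out = H^T Y
    W_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ Y)
    return W_in, W_out

def predict_elm(X, W_in, W_out):
    """Return the index of the largest output for each row of X."""
    return np.argmax(np.tanh(X @ W_in.T) @ W_out, axis=1)

# Usage on synthetic data (random pixels and labels, only to show the shapes).
rng = np.random.default_rng(0)
X = rng.random((500, 28 * 28))
labels = rng.integers(0, 10, size=500)
Y = np.eye(10)[labels]                               # one-hot targets
W_in, W_out = fit_elm(X, Y, rng=rng)
pred = predict_elm(X, W_in, W_out)
```

The fraction of zero weights depends entirely on the patch-size distribution; the hypothetical `max_size` parameter is one way to push the sparsity toward the roughly 90% reported in the abstract.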

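Experimental Results

Research Questions

  • RQ1: Can shallow, non-convolutional neural networks trained with ELM reach close to state-of-the-art performance on MNIST without a deep architecture?
  • RQ2: How does localized, sparse initialization of the input weights affect classification accuracy and training speed?
  • RQ3: To what extent can a few backpropagation iterations improve ELM performance while reducing the number of hidden units?
  • RQ4: Does the proposed method generalize to other datasets such as NORB, and how does it compare with the standard ELM and deep networks?
  • RQ5: Does data augmentation with random distortions further improve generalization within the ELM framework?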

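Key Findings

  • The proposed ELM method achieves a test error rate below 1% on the MNIST benchmark, comparable to deep convolutional networks.
  • Training takes roughly 10 minutes, considerably faster than most deep learning methods.
  • On the NORB dataset the error rate falls below 5.5%, showing that the approach carries over beyond MNIST.
  • Shaped, sparse input weights improve performance and provide local feature extraction without backpropagation.
  • Combining ELM with single-batch backpropagation reduces the number of hidden units required while maintaining high accuracy.
  • Adding random distortions during training lowers the error rate further.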
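
The distortion step mentioned above can be illustrated with a small augmentation helper. The summary does not say which distortions the paper applies, so this sketch substitutes small random integer translations with zero padding as a stand-in; the function name `random_shift` and its `max_shift` parameter are hypothetical.

```python
import numpy as np

def random_shift(images, max_shift=2, rng=None):
    """Translate each (H, W) image by a random integer offset, padding with zeros."""
    rng = np.random.default_rng() if rng is None else rng
    n, h, w = images.shape
    out = np.zeros_like(images)
    for i in range(n):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        # Region of the source image that stays inside the frame after shifting.
        src = images[i, max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
        # Paste it at the shifted position; everything else remains zero.
        out[i, max(0, dy):max(0, dy) + src.shape[0],
               max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

# Usage: build one distorted copy of a small batch of 28x28 images.
batch = np.ones((4, 28, 28))
augmented = random_shift(batch, max_shift=2, rng=np.random.default_rng(0))
```

In an ELM setting, distorted copies like these would simply be appended to the training set before the hidden activations and least-squares solution are computed.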

This review was generated by AI and checked by a human editor.