QUICK REVIEW

[Paper Review] Optimization of Convolutional Neural Network Using the Linearly Decreasing Weight Particle Swarm Optimization

Tatsuki Serizawa, Hamido Fujita|arXiv (Cornell University)|Jan 16, 2020
Metaheuristic Optimization Algorithms Research | References: 15 | Cited by: 27
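One-Sentence Summary

This paper proposes linearly decreasing weight particle swarm optimization (LDWPSO) to automate hyperparameter tuning for convolutional neural networks (CNNs), with the aim of improving training efficiency and accuracy. On the MNIST and CIFAR-10 datasets, LDWPSO-CNN reaches top-1 accuracies of 98.95% and 69.37%, respectively, compared with 94.02% and 28.07% for a standard CNN based on LeNet-5.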

ABSTRACT

Convolutional neural network (CNN) is one of the most frequently used deep learning techniques. Various forms of models have been proposed and improved for learning at CNN. When learning with CNN, it is necessary to determine the optimal hyperparameters. However, the number of hyperparameters is so large that it is difficult to do it manually, so much research has been done on automation. A method that uses metaheuristic algorithms is attracting attention in research on hyperparameter optimization. Metaheuristic algorithms are naturally inspired and include evolution strategies, genetic algorithms, ant colony optimization and particle swarm optimization. In particular, particle swarm optimization converges faster than genetic algorithms, and various models have been proposed. In this paper, we propose CNN hyperparameter optimization with linearly decreasing weight particle swarm optimization (LDWPSO). In the experiment, the MNIST data set and CIFAR-10 data set, which are often used as benchmark data sets, are used. By optimizing CNN hyperparameters with LDWPSO, learning the MNIST and CIFAR-10 datasets, we compare the accuracy with a standard CNN based on LeNet-5. As a result, when using the MNIST dataset, the baseline CNN is 94.02% at the 5th epoch, compared to 98.95% for LDWPSO CNN, which improves accuracy. When using the CIFAR-10 dataset, the baseline CNN is 28.07% at the 10th epoch, compared to 69.37% for the LDWPSO CNN, which greatly improves accuracy. This paper is presented at the 36th Annual Conference of the Japanese Society for Artificial Intelligence. The final version is available at the following URL: https://doi.org/10.11517/pjsai.JSAI2022.0_2S4IS2b03

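Motivation and Objectives

  • Address the difficulty of tuning CNN hyperparameters by hand, which becomes impractical because the search space is high-dimensional.
  • Search for optimal hyperparameters automatically with a metaheuristic optimizer to improve CNN training performance.
  • Evaluate LDWPSO, a particle swarm optimization variant with a linearly decreasing inertia weight, on standard benchmark datasets.
  • Compare LDWPSO-CNN with a standard CNN based on LeNet-5 in terms of accuracy and convergence speed.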

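Proposed Method

  • Use particle swarm optimization (PSO) with a linearly decreasing inertia weight (LDWPSO) to balance global and local search during hyperparameter optimization.
  • Represent each particle in the swarm as one candidate set of CNN hyperparameters (for example, kernel size, initial learning rate, number of filters).
  • Define the fitness function as the validation accuracy after a fixed number of training epochs, so that each particle's hyperparameter configuration can be scored.
  • Update particle positions and velocities with the standard PSO equations, with the inertia weight decreasing linearly over the iterations to improve convergence.
  • Integrate LDWPSO with a CNN architecture based on LeNet-5, so that the hyperparameter search and network training run together.
  • Run the optimization for a fixed number of iterations and select the best-performing hyperparameter combination for the final evaluation.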
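The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the inertia-weight bounds (0.9 to 0.4), the acceleration coefficients `c1 = c2 = 2.0`, the swarm size, and the iteration count are commonly used PSO defaults rather than values taken from the paper, and `toy_fitness` is a hypothetical stand-in for "train the CNN for a fixed number of epochs and return validation accuracy".

```python
import random


def inertia_weight(t, t_max, w_max=0.9, w_min=0.4):
    """Inertia weight that falls linearly from w_max (t=0) to w_min (t=t_max).

    w_max and w_min are assumed defaults, not values from the paper.
    """
    return w_max - (w_max - w_min) * t / t_max


def ldwpso(fitness, bounds, n_particles=10, n_iters=30, c1=2.0, c2=2.0, seed=0):
    """Maximize `fitness` over the box `bounds` with linearly decreasing weight PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Each particle is one candidate hyperparameter vector inside the bounds.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(n_iters):
        # Large weight early (global search), small weight late (local search).
        w = inertia_weight(t, max(n_iters - 1, 1))
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Standard PSO velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_fitness(x):
    """Hypothetical stand-in for validation accuracy; peaks at lr=0.01, kernel=5."""
    learning_rate, kernel_size = x
    return -((learning_rate - 0.01) ** 2 + (kernel_size - 5.0) ** 2)


if __name__ == "__main__":
    # Assumed search ranges: learning rate in [1e-4, 0.1], kernel size in [1, 9].
    best, score = ldwpso(toy_fitness, [(0.0001, 0.1), (1.0, 9.0)])
    print("best hyperparameters:", best, "fitness:", score)
```

In a real run, `toy_fitness` would be replaced by a function that builds the LeNet-5-style network from the particle's values, trains it for a fixed number of epochs, and returns the validation accuracy. Integer-valued hyperparameters such as kernel size would also need rounding before use, which this sketch leaves out.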

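Experimental Results

Research Questions

  • RQ1: Can LDWPSO effectively optimize CNN hyperparameters and thereby improve generalization and convergence?
  • RQ2: How does LDWPSO-CNN perform on the MNIST and CIFAR-10 datasets compared with a standard LeNet-5 baseline CNN?
  • RQ3: Does a linearly decreasing inertia weight in PSO improve search efficiency and accuracy in CNN hyperparameter optimization?
  • RQ4: How does LDWPSO affect training speed and final model accuracy relative to the baseline CNN?
  • RQ5: Can LDWPSO deliver consistent improvements across benchmark datasets of differing complexity?

Key Findings

  • On MNIST, LDWPSO-CNN reaches 98.95% top-1 accuracy at the 5th epoch, compared with 94.02% for the baseline CNN.
  • On CIFAR-10, LDWPSO-CNN reaches 69.37% top-1 accuracy at the 10th epoch, compared with 28.07% for the baseline CNN.
  • LDWPSO outperforms standard CNN training with fixed hyperparameters in both convergence speed and accuracy.
  • The linearly decreasing inertia weight improves the balance between exploration and exploitation, which improves the hyperparameter search.
  • The results indicate that LDWPSO is effective for automated CNN hyperparameter optimization, with the largest gain on the more complex CIFAR-10 dataset.
  • The proposed method achieves these accuracy gains without changing the underlying CNN model architecture.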


This review was generated by AI and checked by a human editor.