QUICK REVIEW

[Paper Review] An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification

Abien Fred Agarap|arXiv (Cornell University)|Dec 10, 2017

Neural Networks and Applications | References: 11 | Cited by: 143

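One-Sentence Summary

The paper implements a CNN-SVM architecture for image classification and compares it with CNN-Softmax on MNIST and Fashion-MNIST, reporting test accuracy that is close to, but slightly below, that of CNN-Softmax on both datasets.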

ABSTRACT

Convolutional neural networks (CNNs) are similar to "ordinary" neural networks in the sense that they are made up of hidden layers consisting of neurons with "learnable" parameters. These neurons receive inputs, perform a dot product, and then follow it with a non-linearity. The whole network expresses the mapping between raw image pixels and their class scores. Conventionally, the Softmax function is the classifier used at the last layer of this network. However, there have been studies (Alalshekmubarak and Smith, 2013; Agarap, 2017; Tang, 2013) conducted to challenge this norm. The cited studies introduce the usage of a linear support vector machine (SVM) in an artificial neural network architecture. This project is yet another take on the subject, and is inspired by (Tang, 2013). Empirical data has shown that the CNN-SVM model was able to achieve a test accuracy of ~99.04% using the MNIST dataset (LeCun, Cortes, and Burges, 2010). On the other hand, the CNN-Softmax was able to achieve a test accuracy of ~99.23% using the same dataset. Both models were also tested on the recently-published Fashion-MNIST dataset (Xiao, Rasul, and Vollgraf, 2017), which is supposed to be a more difficult image classification dataset than MNIST (Zalandoresearch, 2017). This proved to be the case, as CNN-SVM reached a test accuracy of ~90.72%, while the CNN-Softmax reached a test accuracy of ~91.86%. The said results may be improved if data preprocessing techniques were employed on the datasets, and if the base CNN model was relatively more sophisticated than the one used in this study.

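Research Motivation and Objectives

Motivate the exploration of SVM as an alternative classifier in CNN architectures.
Evaluate the performance of CNN-SVM relative to CNN-Softmax on the standard benchmarks MNIST and Fashion-MNIST.
Analyze training dynamics and final test accuracy without any data preprocessing.
Discuss implications and limitations relative to prior work (Tang, 2013).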

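Proposed Method

Use a baseline CNN with two convolutional layers, followed by a fully connected layer and dropout.
Replace the final softmax classifier with an L2-SVM loss at the last layer, and train with the Adam optimizer.
Compare two setups, CNN-Softmax and CNN-SVM, on MNIST and Fashion-MNIST without explicit preprocessing.
Report training accuracy, training loss, and test accuracy after 10,000 steps.
Hyperparameters: batch size 128, dropout 0.5, learning rate 1e-3, and SVM C = 1 for CNN-SVM.
Code is available at: https://github.com/AFAgarap/cnn-svm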
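
To make the L2-SVM step more concrete, the following is a minimal NumPy sketch of a squared-hinge (L2-SVM) objective in a one-vs-rest setting, in the general form described by Tang (2013). It is an illustration under stated assumptions, not the code from the linked repository: the function name `l2_svm_loss`, the argument names, and the exact scaling of the regularization term are hypothetical choices made for this example.

```python
import numpy as np

def l2_svm_loss(scores, labels, weights, c=1.0):
    """Squared-hinge (L2-SVM) loss, one-vs-rest (illustrative sketch).

    scores:  (batch, classes) raw outputs of the last layer
    labels:  (batch,) integer class ids
    weights: last-layer weight matrix, used for the L2 penalty
    c:       penalty parameter (the summary reports C = 1)
    """
    # Encode targets as +1 for the true class and -1 for every other class.
    targets = -np.ones_like(scores, dtype=float)
    targets[np.arange(scores.shape[0]), labels] = 1.0

    # Hinge margins, squared (this squaring is what makes it the "L2" SVM).
    margins = np.maximum(0.0, 1.0 - targets * scores)
    hinge = np.mean(np.sum(margins ** 2, axis=1))

    # Assumed scaling: 0.5 * ||w||^2 for the regularizer.
    reg = 0.5 * np.sum(weights ** 2)
    return reg + c * hinge

def predict(scores):
    # The predicted class is the one with the largest score.
    return np.argmax(scores, axis=1)
```

Unlike softmax with cross-entropy, this loss is exactly zero for any example whose scores already satisfy the margin, so correctly classified examples far from the decision boundary contribute no gradient.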

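Experimental Results

Research Questions

RQ1: On MNIST and Fashion-MNIST, is the test accuracy of CNN-SVM comparable to or better than that of CNN-Softmax?
RQ2: How does omitting data preprocessing affect the performance of CNN-SVM on these datasets?
RQ3: Under the same architecture and training setup, what are the training dynamics (accuracy and loss) of CNN-SVM and CNN-Softmax?
RQ4: Are the results consistent with the suggestion in prior work that CNN-SVM can be competitive with softmax-based classifiers?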

Key Findings

Dataset	CNN-Softmax (test accuracy)	CNN-SVM (test accuracy)
MNIST	99.23%	99.04%
Fashion-MNIST	91.86%	90.72%

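On MNIST, CNN-Softmax achieves slightly higher test accuracy than CNN-SVM (99.23% vs. 99.04%).
On Fashion-MNIST, CNN-Softmax also outperforms CNN-SVM (91.86% vs. 90.72%).
Training time for both models on MNIST and Fashion-MNIST was about 4 minutes per run (an approximation based on the number of training steps).
On the test sets, with no data preprocessing applied, CNN-SVM achieves competitive results but does not surpass CNN-Softmax.
The study notes that a more sophisticated baseline CNN model and data preprocessing might reproduce or improve on the results reported in related work (Tang, 2013).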
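
As a rough sanity check on how large these gaps are, the short sketch below converts the reported accuracies into percentage-point differences and an approximate number of test images. It assumes the standard 10,000-image test split of both MNIST and Fashion-MNIST; the names `reported`, `TEST_SET_SIZE`, and `accuracy_gap` are hypothetical, and the reported accuracies are themselves approximate.

```python
# Test accuracies (%) as reported in the table above.
reported = {
    "MNIST": {"CNN-Softmax": 99.23, "CNN-SVM": 99.04},
    "Fashion-MNIST": {"CNN-Softmax": 91.86, "CNN-SVM": 90.72},
}

# Assumption: both standard test splits contain 10,000 images.
TEST_SET_SIZE = 10_000

def accuracy_gap(dataset):
    """Return (gap in percentage points, approximate gap in test images)."""
    acc = reported[dataset]
    gap_pp = round(acc["CNN-Softmax"] - acc["CNN-SVM"], 2)
    gap_images = round(gap_pp / 100 * TEST_SET_SIZE)
    return gap_pp, gap_images

for name in reported:
    print(name, accuracy_gap(name))
```

Under that assumption, the MNIST gap of 0.19 percentage points corresponds to roughly 19 test images, and the Fashion-MNIST gap of 1.14 percentage points to roughly 114.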

This review was generated by AI and checked by a human editor.