QUICK REVIEW

[Paper Review] Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies

Ivan Drokin|arXiv (Cornell University)|Jul 1, 2024

Computability, Logic, AI Algorithms|Cited by 19

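One-Sentence Summary

The paper adapts Kolmogorov-Arnold Networks (KANs) to convolutional layers (KANConv), proposes a bottleneck design and parameter-efficient fine-tuning, and reports competitive image classification performance and state-of-the-art results on biomedical segmentation.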

ABSTRACT

The emergence of Kolmogorov-Arnold Networks (KANs) has sparked significant interest and debate within the scientific community. This paper explores the application of KANs in the domain of computer vision (CV). We examine the convolutional version of KANs, considering various nonlinearity options beyond splines, such as Wavelet transforms and a range of polynomials. We propose a parameter-efficient design for Kolmogorov-Arnold convolutional layers and a parameter-efficient finetuning algorithm for pre-trained KAN models, as well as KAN convolutional versions of self-attention and focal modulation layers. We provide empirical evaluations conducted on MNIST, CIFAR10, CIFAR100, Tiny ImageNet, ImageNet1k, and HAM10000 datasets for image classification tasks. Additionally, we explore segmentation tasks, proposing U-Net-like architectures with KAN convolutions, and achieving state-of-the-art results on BUSI, GlaS, and CVC datasets. We summarized all of our findings in a preliminary design guide of KAN convolutional models for computer vision tasks. Furthermore, we investigate regularization techniques for KANs. All experimental code and implementations of convolutional layers and models, together with weights pre-trained on ImageNet1k, are available on GitHub at https://github.com/IvanDrokin/torch-conv-kan

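Research Motivation and Objectives

Motivate and study convolutional Kolmogorov-Arnold Networks (KANs) as an efficient alternative to standard convolutional neural networks in computer vision.
Propose a bottleneck Kolmogorov-Arnold convolution (KAGN) design that reduces parameter count while preserving expressive power.
Develop parameter-efficient fine-tuning for the Gram polynomial variant of KANs.
Extend KANs to self-attention and focal modulation layers within convolutional networks.
Provide a design guide, grounded in empirical findings, for building KAN-based models in computer vision.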

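Proposed Method

Formalize Kolmogorov-Arnold convolutions (KANConv) as convolutions whose kernel elements are learnable univariate functions expanded in a nonlinear basis (splines, RBFs, wavelets, polynomials).
Introduce a Gram polynomial basis and a parameter-efficient fine-tuning (PEFT) scheme for Gram KANs.
Propose Bottleneck Kolmogorov-Arnold Convolutions with squeeze/expand 1x1 convolutions and mixture-of-experts routing.
Build KAN convolutional versions of the self-attention and focal modulation layers from Bottleneck KANConvs, yielding Self-KAGNtention and Focal KAGN Modulation.
Examine regularization strategies, including weight/activation penalties, dropout placement, and additive noise for KANs.
Provide a PEFT algorithm for Gram KANs that tunes the Gram coefficients progressively by polynomial degree.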
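
The KANConv formulation listed above can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the paper's implementation: each kernel position holds a learnable univariate function expanded in a polynomial basis. Chebyshev polynomials stand in for the basis here (the paper's Gram variant uses its own recurrence, which is not reproduced), inputs are squashed with tanh so they fall in [-1, 1], and the names `chebyshev_basis` and `kan_conv1d` are hypothetical.

```python
import math


def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] for a scalar x in [-1, 1]."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        # Standard recurrence: T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]


def kan_conv1d(signal, coeffs):
    """Valid 1D KAN convolution with one input and one output channel.

    coeffs[k][d] is the weight of T_d in the univariate function that
    sits at kernel offset k (an illustrative layout, assumed here).
    """
    kernel_size = len(coeffs)
    degree = len(coeffs[0]) - 1
    out = []
    for t in range(len(signal) - kernel_size + 1):
        total = 0.0
        for k in range(kernel_size):
            # Each kernel element applies its own function to its input,
            # instead of multiplying the input by a single scalar weight.
            basis = chebyshev_basis(math.tanh(signal[t + k]), degree)
            total += sum(w * b for w, b in zip(coeffs[k], basis))
        out.append(total)
    return out


signal = [0.0, 0.5, -0.5, 1.0]
coeffs = [[0.0, 1.0, 0.0], [0.1, 0.0, 0.5]]  # kernel size 2, degree 2
print(kan_conv1d(signal, coeffs))
```

Higher degrees give each kernel element a more flexible shape, at the cost of (degree + 1) coefficients per element instead of one.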

Figure 1: KAN Convolution (left) and Bottleneck KAN Convolution (right). The main difference between these two types of layers is the addition of encoder-decoder convolutional layers on the right data stream.
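
To see why the encoder-decoder pair saves parameters, a rough coefficient count helps. The sketch below is a back-of-the-envelope estimate under assumptions that are not taken from the paper: one coefficient per (input channel, output channel, kernel position, basis function), a polynomial basis of a given degree, and no biases, normalization layers, or residual branch. The function names and the `reduction` factor are hypothetical.

```python
def kan_conv_params(c_in, c_out, kernel_size, degree):
    """Coefficient count for a 2D KAN convolution with a polynomial basis."""
    return c_in * c_out * kernel_size ** 2 * (degree + 1)


def bottleneck_kan_conv_params(c_in, c_out, kernel_size, degree, reduction):
    """Same count with 1x1 squeeze/expand convolutions around the KAN part."""
    c_mid = max(1, c_in // reduction)
    squeeze = c_in * c_mid  # 1x1 convolution: c_in -> c_mid
    inner = kan_conv_params(c_mid, c_mid, kernel_size, degree)
    expand = c_mid * c_out  # 1x1 convolution: c_mid -> c_out
    return squeeze + inner + expand


plain = kan_conv_params(64, 64, 3, 3)
slim = bottleneck_kan_conv_params(64, 64, 3, 3, reduction=4)
print(plain, slim)  # 147456 11264
```

Under these assumptions the basis expansion is paid only on the narrow inner channels, so the count shrinks roughly with the square of the reduction factor.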

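Experimental Results

Research Questions

RQ1: How do convolutional Kolmogorov-Arnold Networks perform on standard CV benchmarks compared with conventional CNNs and other KAN variants?
RQ2: Do bottleneck KAN convolutions preserve accuracy while substantially reducing parameter count?
RQ3: Can Gram polynomial KANs be fine-tuned effectively with very few trainable parameters?
RQ4: Does using bottleneck KAN layers inside self-attention and focal modulation structures improve performance?
RQ5: Which regularization and hyperparameter strategies best stabilize KAN-based models and improve their generalization?
RQ6: What design principles emerge when building effective CV models from Bottleneck KANConvs?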

Key Findings

Model	MNIST val. accuracy (%)	MNIST params (M)	MNIST time (s)	CIFAR10 val. accuracy (%)	CIFAR10 params (M)	CIFAR10 time (s)	CIFAR100 val. accuracy (%)	CIFAR100 params (M)	CIFAR100 time (s)
Conv, 4 layers, baseline	99.42	0.1	0.7008	73.18	0.1	1.8321	42.29	0.12	1.5994
KANConv, 4 layers	99.00	3.49	2.6401	52.08	3.49	3.7972	21.78	3.52	4.0262
FastKANConv, 4 layers	97.65	3.49	1.5999	64.95	3.49	2.3716	34.32	3.52	2.7457
KALNConv, 4 layers	84.85	1.94	1.7205	10.28	1.94	3.0527	5.97	1.97	3.0919
KACNConv, 4 layers	97.62	3.92	1.6710	52.01	3.92	2.3972	23.17	0.42	2.6522
KAGNConv, 4 layers	99.49	0.49	1.7253	65.84	0.49	2.2570	47.36	1.97	2.3399
WavKANConv, 4 layers	99.23	0.95	7.4622	73.63	0.95	11.2276	41.50	0.98	11.4744
Conv, 8 layers, baseline	99.63	1.14	1.2061	83.05	1.14	1.8258	57.52	1.19	1.8265
KANConv, 8 layers	99.37	40.7	4.2011	74.66	40.7	5.4858	36.18	40.74	5.7067
FastKANConv, 8 layers	99.49	40.7	2.1653	74.66	40.7	5.4858	43.32	40.74	2.7771
KALNConv, 8 layers	49.97	22.61	1.7815	15.97	22.61	2.7348	1.74	22.65	2.6863
KACNConv, 8 layers	99.32	18.09	1.6973	62.14	18.09	2.3459	25.01	18.14	2.3826
KAGNConv, 8 layers	99.68	22.61	2.2402	84.14	22.61	2.5849	59.27	22.66	2.6460
WavKANConv, 8 layers	99.57	10.73	59.1734	85.37	10.73	28.0385	55.43	10.78	30.5438

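KANConvs with Gram polynomial and wavelet bases outperform the vanilla CNN in several configurations on MNIST, CIFAR10, and CIFAR100; Gram KANConv generally offers the more favorable accuracy/parameter trade-off.
Bottleneck Kolmogorov-Arnold convolutions substantially reduce trainable parameters while maintaining performance; in many settings, scaling depth yields smaller gains than scaling width.
Parameter-efficient fine-tuning of Gram KANs reduces the need to retrain most of the network when adapting to a new task.
Replacing standard convolutions with Bottleneck KAGN layers in U-Net-like architectures achieves state-of-the-art results on biomedical segmentation datasets (BUSI, GlaS, CVC).
Self-KAGNtention and Focal KAGN Modulation variants built on Bottleneck KAN layers can improve classification performance.
A design guide for building CV models with Bottleneck KANConvs is proposed on the basis of the empirical findings.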
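
The parameter-efficient fine-tuning finding can be illustrated with a toy update rule. The sketch below shows one possible reading of tuning coefficients by polynomial degree: coefficients are grouped by basis degree, and a gradient step touches only the selected degrees while the rest stay frozen. This is an assumption made for illustration and may differ from the paper's actual algorithm; `peft_step`, `trainable_fraction`, and the data layout are hypothetical.

```python
def peft_step(coeffs, grads, trainable_degrees, lr=0.1):
    """One gradient step that updates only the selected polynomial degrees.

    coeffs[d] and grads[d] hold the coefficients (and their gradients)
    attached to basis degree d; all other degrees are left frozen.
    """
    updated = []
    for d, (weights, grad) in enumerate(zip(coeffs, grads)):
        if d in trainable_degrees:
            updated.append([w - lr * g for w, g in zip(weights, grad)])
        else:
            updated.append(list(weights))  # frozen: copied unchanged
    return updated


def trainable_fraction(coeffs, trainable_degrees):
    """Share of coefficients that the fine-tuning step is allowed to change."""
    total = sum(len(w) for w in coeffs)
    active = sum(len(w) for d, w in enumerate(coeffs) if d in trainable_degrees)
    return active / total


coeffs = [[1.0, 1.0], [0.5, 0.5], [0.2, 0.2], [0.1, 0.1]]  # degrees 0..3
grads = [[1.0, 1.0] for _ in coeffs]
new_coeffs = peft_step(coeffs, grads, trainable_degrees={3})
print(new_coeffs[3], trainable_fraction(coeffs, {3}))
```

In this toy setup only a quarter of the coefficients are trainable; widening `trainable_degrees` over successive stages would give a progressive schedule.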

Figure 2: Bottleneck Kolmogorov-Arnold Convolutional Mixture of Experts. The router and experts are placed between bottleneck convolutions, and each expert is a $\tilde{\varphi}$ set of univariate functions. We use sparsely-gated mixture-of-experts [15].
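
The routing idea in Figure 2 can be sketched with top-k gating, the core of sparsely-gated mixture-of-experts. The example below is a simplified illustration, not the paper's layer: router logits are passed in directly rather than computed from the input, the noise term used in noisy top-k gating is omitted, and each expert is a plain scalar function standing in for a set of univariate functions. The names `top_k_gates` and `moe_forward` are hypothetical.

```python
import math


def top_k_gates(logits, k):
    """Softmax over the k largest router logits; all other gates are zero."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    norm = sum(exps.values())
    return [exps[i] / norm if i in exps else 0.0 for i in range(len(logits))]


def moe_forward(x, router_logits, experts, k=2):
    """Gate-weighted sum of the selected experts' outputs for a scalar x."""
    gates = top_k_gates(router_logits, k)
    # Experts with a zero gate are skipped, which is where the savings come from.
    return sum(g * expert(x) for g, expert in zip(gates, experts) if g > 0.0)


experts = [lambda x: x, lambda x: x ** 2, lambda x: math.tanh(x)]
print(top_k_gates([1.0, 3.0, 2.0], k=2))
print(moe_forward(2.0, [1.0, 3.0, 2.0], experts))
```

Only the k selected experts are evaluated per input, so adding experts grows capacity without a proportional increase in compute.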

This review was generated by AI and reviewed by human editors.