QUICK REVIEW

[Paper Review] Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies

Ivan Drokin|arXiv (Cornell University)|Jul 1, 2024

Computability, Logic, AI Algorithms|Cited by 19

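One-Line Summary

This paper proposes KANConv, which applies Kolmogorov-Arnold Networks to convolutional layers, presents a bottleneck design and a parameter-efficient fine-tuning scheme, and reports competitive performance on image classification and state-of-the-art results on biomedical segmentation.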

ABSTRACT

The emergence of Kolmogorov-Arnold Networks (KANs) has sparked significant interest and debate within the scientific community. This paper explores the application of KANs in the domain of computer vision (CV). We examine the convolutional version of KANs, considering various nonlinearity options beyond splines, such as Wavelet transforms and a range of polynomials. We propose a parameter-efficient design for Kolmogorov-Arnold convolutional layers and a parameter-efficient finetuning algorithm for pre-trained KAN models, as well as KAN convolutional versions of self-attention and focal modulation layers. We provide empirical evaluations conducted on MNIST, CIFAR10, CIFAR100, Tiny ImageNet, ImageNet1k, and HAM10000 datasets for image classification tasks. Additionally, we explore segmentation tasks, proposing U-Net-like architectures with KAN convolutions, and achieving state-of-the-art results on BUSI, GlaS, and CVC datasets. We summarized all of our findings in a preliminary design guide of KAN convolutional models for computer vision tasks. Furthermore, we investigate regularization techniques for KANs. All experimental code and implementations of convolutional layers and models, pre-trained on ImageNet1k weights are available on GitHub at https://github.com/IvanDrokin/torch-conv-kan

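Motivation and Objectives

Motivate and study convolutional Kolmogorov-Arnold Networks (KANs) as an efficient alternative to standard CNNs in computer vision.
Propose a Bottleneck Kolmogorov-Arnold Convolution design (Bottleneck KAGN when built on Gram polynomials) that reduces parameters while preserving expressive power.
Develop parameter-efficient fine-tuning for the Gram-polynomial variant of KANs.
Extend KANs to the self-attention and focal modulation frameworks within CNNs.
Provide empirical guidelines and a design guide for building KAN-based models for CV.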

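Proposed Method

Formalize Kolmogorov-Arnold Convolution (KANConv) with univariate nonlinear bases (splines, RBFs, wavelets, polynomials).
Introduce a Gram-polynomial basis and a parameter-efficient fine-tuning scheme for Gram KANs.
Propose Bottleneck Kolmogorov-Arnold Convolutions with squeezing/expanding 1x1 convolutions and mixture-of-experts routing.
Build Self-KAGNtention and Focal KAGN Modulation by replacing the convolutions in Self-Attention and Focal Modulation layers with Bottleneck KANConvs.
Examine regularization strategies for KANs, including weight and activation penalties, dropout placement, and additive noise.
Provide a PEFT algorithm for Gram KANs that tunes the Gram coefficients progressively, degree by degree.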
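
As a rough illustration of the KANConv idea listed above, the following pure-Python sketch applies a learnable univariate function to each input inside a 1D sliding window and sums the results. It is a toy under stated assumptions, not the authors' implementation: it swaps in Legendre polynomials (whose recurrence is standard) instead of Gram polynomials, squashes inputs with tanh to keep them in [-1, 1], handles a single channel only, and the names `legendre_basis`, `kan_conv1d`, and the `coeffs` layout are hypothetical.

```python
import math


def legendre_basis(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    values = [1.0]
    if degree >= 1:
        values.append(x)
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values


def kan_conv1d(signal, coeffs):
    """Toy single-channel KAN convolution (valid padding, stride 1).

    coeffs[j][k] is the weight of P_k at kernel position j, so every
    kernel position owns its own univariate function
    phi_j(x) = sum_k coeffs[j][k] * P_k(tanh(x)).
    (Assumed layout, for illustration only.)
    """
    kernel_size = len(coeffs)
    degree = len(coeffs[0]) - 1
    out = []
    for start in range(len(signal) - kernel_size + 1):
        total = 0.0
        for j in range(kernel_size):
            basis = legendre_basis(math.tanh(signal[start + j]), degree)
            total += sum(c * b for c, b in zip(coeffs[j], basis))
        out.append(total)
    return out
```

With `coeffs = [[0.0, 1.0], [0.0, 1.0]]` each kernel position reduces to `tanh(x)`, so the output is the sum of `tanh` over each window of two samples; higher-degree coefficients let each position learn a different nonlinearity, which is the part a standard convolution replaces with a single scalar weight.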

Figure 1: KAN Convolution (left) and Bottleneck KAN Convolution (right). The main difference between these two types of layers is the pair of encoder-decoder convolutional layers in the data stream on the right.
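
To make the parameter saving of the bottleneck layout concrete, here is a back-of-the-envelope count. The formulas are assumptions for illustration rather than the paper's exact accounting: the KAN branch is given one weight per output channel, input channel, kernel position, and basis function; biases, any residual or base branch, and normalization layers are ignored; and `reduction` (the squeeze factor) is a hypothetical parameter name.

```python
def kan_conv_params(c_in, c_out, kernel_size, degree):
    """Weights of a polynomial KAN conv: one coefficient per
    (output channel, input channel, kernel position, basis function)."""
    return c_out * c_in * kernel_size ** 2 * (degree + 1)


def bottleneck_kan_conv_params(c_in, c_out, kernel_size, degree, reduction):
    """Squeeze with a 1x1 conv, apply the KAN conv in the reduced space,
    then expand back with another 1x1 conv (assumed layout)."""
    inner_in = max(1, c_in // reduction)
    inner_out = max(1, c_out // reduction)
    squeeze = c_in * inner_in      # 1x1 convolution, no bias
    expand = inner_out * c_out     # 1x1 convolution, no bias
    inner = kan_conv_params(inner_in, inner_out, kernel_size, degree)
    return squeeze + inner + expand


plain = kan_conv_params(64, 64, 3, 3)
bottleneck = bottleneck_kan_conv_params(64, 64, 3, 3, reduction=4)
print(plain, bottleneck)  # prints 147456 11264
```

Under these assumptions the costly factor `(degree + 1) * kernel_size ** 2` is paid only on the reduced channel counts, which is why the two cheap 1x1 convolutions more than pay for themselves.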

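Experimental Results

Research Questions

RQ1: How well do convolutional Kolmogorov-Arnold networks perform on standard CV benchmarks compared with conventional CNNs and other KAN variants?
RQ2: Can Bottleneck KAN convolutions substantially reduce the parameter count while maintaining accuracy?
RQ3: Can Gram-polynomial-based KANs support effective fine-tuning with a small number of trainable parameters?
RQ4: Do Bottleneck KAN layers improve performance when used in self-attention and focal modulation structures?
RQ5: Which regularization and hyperparameter strategies most effectively stabilize KAN-based models and improve their generalization?
RQ6: What design principles yield effective CV models built from Bottleneck KANConvs?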

Key Findings

Model	MNIST Val. Acc. (%)	MNIST Params (M)	MNIST Time (s)	CIFAR10 Val. Acc. (%)	CIFAR10 Params (M)	CIFAR10 Time (s)	CIFAR100 Val. Acc. (%)	CIFAR100 Params (M)	CIFAR100 Time (s)
Conv, 4 layers, baseline	99.42	0.1	0.7008	73.18	0.1	1.8321	42.29	0.12	1.5994
KANConv, 4 layers	99.00	3.49	2.6401	52.08	3.49	3.7972	21.78	3.52	4.0262
FastKANConv, 4 layers	97.65	3.49	1.5999	64.95	3.49	2.3716	34.32	3.52	2.7457
KALNConv, 4 layers	84.85	1.94	1.7205	10.28	1.94	3.0527	5.97	1.97	3.0919
KACNConv, 4 layers	97.62	3.92	1.6710	52.01	3.92	2.3972	23.17	0.42	2.6522
KAGNConv, 4 layers	99.49	0.49	1.7253	65.84	0.49	2.2570	47.36	1.97	2.3399
WavKANConv, 4 layers	99.23	0.95	7.4622	73.63	0.95	11.2276	41.50	0.98	11.4744
Conv, 8 layers, baseline	99.63	1.14	1.2061	83.05	1.14	1.8258	57.52	1.19	1.8265
KANConv, 8 layers	99.37	40.7	4.2011	74.66	40.7	5.4858	36.18	40.74	5.7067
FastKANConv, 8 layers	99.49	40.7	2.1653	74.66	40.7	5.4858	43.32	40.74	2.7771
KALNConv, 8 layers	49.97	22.61	1.7815	15.97	22.61	2.7348	1.74	22.65	2.6863
KACNConv, 8 layers	99.32	18.09	1.6973	62.14	18.09	2.3459	25.01	18.14	2.3826
KAGNConv, 8 layers	99.68	22.61	2.2402	84.14	22.61	2.5849	59.27	22.66	2.6460
WavKANConv, 8 layers	99.57	10.73	59.1734	85.37	10.73	28.0385	55.43	10.78	30.5438
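
One way to read the table is to express each 8-layer model relative to the convolutional baseline. The snippet below is only a reading aid: it uses the CIFAR100 accuracy and parameter values copied from the 8-layer rows, and the helper name `relative_to_baseline` is hypothetical.

```python
# 8-layer rows: (CIFAR100 validation accuracy in %, parameters in millions)
rows = {
    "Conv baseline": (57.52, 1.19),
    "KANConv": (36.18, 40.74),
    "FastKANConv": (43.32, 40.74),
    "KALNConv": (1.74, 22.65),
    "KACNConv": (25.01, 18.14),
    "KAGNConv": (59.27, 22.66),
    "WavKANConv": (55.43, 10.78),
}


def relative_to_baseline(rows, baseline="Conv baseline"):
    """Map each model to (accuracy gain in points, parameter ratio)."""
    base_acc, base_params = rows[baseline]
    return {
        name: (round(acc - base_acc, 2), round(params / base_params, 1))
        for name, (acc, params) in rows.items()
        if name != baseline
    }


for name, (gain, ratio) in relative_to_baseline(rows).items():
    print(f"{name}: {gain:+.2f} points at {ratio}x the parameters")
```

Among these rows, only KAGNConv has a positive accuracy gain over the baseline, and it comes with roughly 19 times the parameters of the plain convolutional model.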

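Gram-polynomial and wavelet-based KANConvs can outperform plain CNNs in some configurations on MNIST, CIFAR10, and CIFAR100, and Gram KANConv often offers a favorable accuracy/parameter trade-off.
Bottleneck Kolmogorov-Arnold Convolutions substantially reduce the number of trainable parameters while maintaining performance; in many settings, scaling width is more beneficial than scaling depth.
Parameter-efficient fine-tuning of Gram KANs reduces the need to retrain most of the network when adapting to a new task.
U-Net-like architectures in which standard convolutions are replaced with Bottleneck KAGN layers achieve state-of-the-art results on biomedical segmentation datasets (BUSI, GlaS, CVC).
The Self-KAGNtention and Focal KAGN Modulation variants, built from Bottleneck KAN layers, can improve classification performance.
A design guide based on the empirical findings is proposed to steer the construction of CV models with Bottleneck KANConvs.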

Figure 2: Bottleneck Kolmogorov-Arnold Convolutional Mixture of Experts. The router and experts are placed between bottleneck convolutions, and each expert is a set $\tilde{\varphi}$ of univariate functions. We use sparsely-gated mixture-of-experts [15].
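
The routing step in Figure 2 can be sketched as top-k gating in the spirit of sparsely-gated mixture-of-experts: keep the k largest router logits, apply a softmax over them, and give every other expert a weight of zero. This is a simplified sketch under assumptions: the noise term and load-balancing loss of the original formulation are omitted, `top_k_gates` and `mix_experts` are hypothetical names, and each expert is stood in for by a plain Python callable rather than a set of learned univariate functions.

```python
import math


def top_k_gates(logits, k):
    """Softmax over the k largest logits; all other gates are exactly 0."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    norm = sum(exps.values())
    return [exps[i] / norm if i in exps else 0.0 for i in range(len(logits))]


def mix_experts(x, experts, logits, k):
    """Evaluate only the selected experts and blend their outputs."""
    gates = top_k_gates(logits, k)
    return sum(g * expert(x) for g, expert in zip(gates, experts) if g > 0.0)


experts = [lambda x: x, lambda x: x * x, lambda x: -x, lambda x: 0.0]
print(top_k_gates([1.0, 3.0, 3.0, 0.0], k=2))  # prints [0.0, 0.5, 0.5, 0.0]
```

Because unselected experts receive a gate of exactly zero, they never need to be evaluated, which is what keeps the compute per input roughly constant as the number of experts grows.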


This review was generated by AI and checked by a human editor.