QUICK REVIEW

[Paper Review] Wav-KAN: Wavelet Kolmogorov-Arnold Networks

Zavareh Bozorgasl, Hao Chen | arXiv (Cornell University) | May 21, 2024
Neural Networks and Applications | 7 citations
TL;DR

Wav-KAN integrates wavelet transforms into Kolmogorov-Arnold Networks to improve interpretability, efficiency, and robustness, outperforming Spl-KAN and MLPs in some tasks.

ABSTRACT

In this paper, we introduce Wav-KAN, an innovative neural network architecture that leverages the Wavelet Kolmogorov-Arnold Networks (Wav-KAN) framework to enhance interpretability and performance. Traditional multilayer perceptrons (MLPs) and even recent advancements like Spl-KAN face challenges related to interpretability, training speed, robustness, computational efficiency, and performance. Wav-KAN addresses these limitations by incorporating wavelet functions into the Kolmogorov-Arnold network structure, enabling the network to capture both high-frequency and low-frequency components of the input data efficiently. Wavelet-based approximations employ orthogonal or semi-orthogonal bases and maintain a balance between accurately representing the underlying data structure and avoiding overfitting to the noise. While the continuous wavelet transform (CWT) has considerable potential, we also employed the discrete wavelet transform (DWT) for multiresolution analysis, which obviated the need for recalculation of the previous steps in finding the details. Analogous to how water conforms to the shape of its container, Wav-KAN adapts to the data structure, resulting in enhanced accuracy, faster training speeds, and increased robustness compared to Spl-KAN and MLPs. Our results highlight the potential of Wav-KAN as a powerful tool for developing interpretable and high-performance neural networks, with applications spanning various fields. This work sets the stage for further exploration and implementation of Wav-KAN in frameworks such as PyTorch and TensorFlow, aiming to make wavelets in KAN as widespread as activation functions like ReLU and sigmoid in universal approximation theory (UAT). The code to replicate the simulations is available at https://github.com/zavareh1/Wav-KAN.

Motivation & Objective

  • Motivate interpretable neural networks and address limitations of MLPs and Spl-KAN in interpretability, training speed, and robustness.
  • Introduce a wavelet-based extension of KAN (Wav-KAN) to capture both high- and low-frequency data components.
  • Propose a multi-layer KAN architecture with wavelet activation to improve efficiency and accuracy.
  • Demonstrate the approach on MNIST and discuss advantages over traditional activations and spline-based KAN.
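
The high- and low-frequency point in the list above follows from how a wavelet family is built from a single mother wavelet. As a brief reminder (standard wavelet notation, not taken from the review itself), every member of the family is a scaled and shifted copy of the mother wavelet, and the Mexican hat is one common choice of mother wavelet:

```latex
\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right),
\qquad a > 0,\; b \in \mathbb{R}

\psi_{\text{Mexican hat}}(t) = \frac{2}{\sqrt{3}\,\pi^{1/4}}\,\bigl(1 - t^{2}\bigr)\,e^{-t^{2}/2}
```

A small scale a compresses the wavelet so that it responds to fine, high-frequency detail, while a large a stretches it so that it responds to slow, low-frequency structure. The translation b selects where along the input that response is centered.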

Proposed method

  • Replace traditional weights with learnable univariate wavelet functions on edges between layers.
  • Use continuous and discrete wavelet transforms as activations within the KAN framework.
  • Employ a multi-layer KAN structure with matrix-like aggregation via the operator T_o to sum activated outputs.
  • Incorporate batch normalization to enhance training speed and accuracy.
  • Compare Wav-KAN to Spl-KAN and MLPs in terms of parameters, speed, and robustness.
  • Demonstrate that wavelet choice (Mexican hat, Morlet, DOG, Shannon) affects performance on MNIST.
Figure 1: Wav-KAN with arbitrary number of layers (here is Wav-KAN[2,3,2])
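
To make the structure in Figure 1 concrete, here is a minimal NumPy sketch of a forward pass through a Wav-KAN[2,3,2]. It is an illustration under stated assumptions and not the authors' implementation: the class name `WavKANLayer`, the helpers `wav_kan` and `forward`, the initialization, and the choice of the Mexican hat wavelet are all hypothetical, and batch normalization and training are left out.

```python
import numpy as np


def mexican_hat(u):
    """Mexican hat (Ricker) mother wavelet with the usual normalization."""
    c = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return c * (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)


class WavKANLayer:
    """One layer: every edge (input i -> output j) carries its own wavelet.

    Each edge has three learnable numbers: weight, translation, scale.
    (Hypothetical sketch; initial values are arbitrary assumptions.)
    """

    def __init__(self, n_in, n_out, wavelet=mexican_hat, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.wavelet = wavelet
        self.weight = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
        self.translation = np.zeros((n_out, n_in))
        self.scale = np.ones((n_out, n_in))

    def __call__(self, x):
        # x has shape (batch, n_in); broadcast to (batch, n_out, n_in)
        u = (x[:, None, :] - self.translation) / self.scale
        # Sum the activated edge outputs arriving at each output node.
        return np.sum(self.weight * self.wavelet(u), axis=-1)


def wav_kan(widths, seed=0):
    """Build a list of layers from a width list such as [2, 3, 2]."""
    rng = np.random.default_rng(seed)
    return [WavKANLayer(a, b, rng=rng) for a, b in zip(widths[:-1], widths[1:])]


def forward(layers, x):
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


layers = wav_kan([2, 3, 2])
x = np.array([[0.5, -1.0], [0.0, 2.0]])  # two samples, two features
y = forward(layers, x)                   # shape (2, 2)
n_params = sum(l.weight.size + l.translation.size + l.scale.size for l in layers)
```

In this sketch the network has 2*3 + 3*2 = 12 edges and therefore 36 parameters. Swapping the mother wavelet only requires passing a different function as `wavelet`, which is how a comparison across wavelet types could be set up.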

Experimental results

Research questions

  • RQ1: Can Wav-KAN achieve higher accuracy than Spl-KAN and MLPs on image classification tasks?
  • RQ2: Does integrating wavelets into KAN improve training speed and robustness while maintaining interpretability?
  • RQ3: How do different mother wavelets impact Wav-KAN performance on MNIST?
  • RQ4: What is the parameter efficiency of Wav-KAN relative to Spl-KAN and MLPs across layers?
  • RQ5: Can Wav-KAN be integrated into mainstream frameworks like PyTorch or TensorFlow effectively?

Key findings

  • Wav-KAN yields faster training and improved accuracy compared to Spl-KAN on MNIST under tested configurations.
  • Wavelet-based activations enable efficient representation of both local details and global structure.
  • Batch normalization further improves accuracy and training speed for Wav-KAN and Spl-KAN.
  • Different mother wavelets significantly influence performance, with Shannon and some other wavelets underperforming in certain setups.
  • Wav-KAN uses fewer parameters than Spl-KAN for comparable tasks because each wavelet is specified by only a weight, a translation, and a scaling.
  • The approach is positioned as more interpretable and scalable within the KAN family.
Figure 2: Training accuracy of Wav-KAN [28*28,32,10] versus Spl-KAN [28*28,32,10]
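
The parameter-count finding can be checked with simple arithmetic for the [28*28,32,10] configuration named in Figure 2. The sketch below assumes three learnable numbers per Wav-KAN edge (weight, translation, scaling) and, for Spl-KAN, one coefficient per B-spline basis function, where a grid with G intervals and spline order k gives G + k basis functions per edge. The values `grid_size=5` and `spline_order=3` are illustrative assumptions rather than the paper's settings, and batch normalization parameters and any extra per-edge scaling terms are ignored.

```python
def count_edges(widths):
    """Number of edges in a fully connected KAN with the given layer widths."""
    return sum(a * b for a, b in zip(widths[:-1], widths[1:]))


def wav_kan_params(widths, per_edge=3):
    """Weight, translation, and scale for every edge."""
    return per_edge * count_edges(widths)


def spl_kan_params(widths, grid_size=5, spline_order=3):
    """One coefficient per B-spline basis function on every edge.

    grid_size and spline_order are assumed values for illustration only.
    """
    return (grid_size + spline_order) * count_edges(widths)


widths = [28 * 28, 32, 10]
print(count_edges(widths))     # 25408
print(wav_kan_params(widths))  # 76224
print(spl_kan_params(widths))  # 203264
```

Under these assumptions the spline variant needs (G + k) / 3 times as many edge parameters as the wavelet variant, so the gap grows with finer spline grids.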

This review was created by AI and reviewed by human editors.