
FC-KAN: Function Combinations in Kolmogorov-Arnold Networks

Published 3 Sep 2024 in cs.LG and cs.CL | arXiv:2409.01763v3

Abstract: In this paper, we introduce FC-KAN, a Kolmogorov-Arnold Network (KAN) that leverages combinations of popular mathematical functions such as B-splines, wavelets, and radial basis functions on low-dimensional data through element-wise operations. We explore several methods for combining the outputs of these functions, including sum, element-wise product, the addition of sum and element-wise product, representations of quadratic and cubic functions, concatenation, linear transformation of the concatenated output, and others. In our experiments, we compare FC-KAN with a multi-layer perceptron network (MLP) and other existing KANs, such as BSRBF-KAN, EfficientKAN, FastKAN, and FasterKAN, on the MNIST and Fashion-MNIST datasets. Two variants of FC-KAN, which use a combination of outputs from B-splines and Difference of Gaussians (DoG) and from B-splines and linear transformations in the form of a quadratic function, outperformed overall other models on the average of 5 independent training runs. We expect that FC-KAN can leverage function combinations to design future KANs. Our repository is publicly available at: https://github.com/hoangthangta/FC_KAN.


Summary

  • The paper demonstrates that combining mathematical functions in Kolmogorov-Arnold Networks significantly improves performance on low-dimensional data.
  • It details a novel architecture employing element-wise summation, concatenation, and quadratic operations to optimize data representation.
  • Benchmarking on MNIST and Fashion-MNIST shows that FC-KAN outperforms conventional KANs and MLPs through diverse function combinations.

Introduction

FC-KAN leverages combinations of functions within the Kolmogorov-Arnold Network (KAN) framework to improve performance on low-dimensional data. By combining the outputs of B-splines, wavelets, and radial basis functions through element-wise operations, FC-KAN departs from conventional KANs that rely on a single function type. Comparative evaluations against MLPs and several existing KAN variants on MNIST and Fashion-MNIST demonstrate the effectiveness of this approach.

Kolmogorov-Arnold Representation in Neural Networks

Kolmogorov's representation theorem underpins the KAN architecture: any multivariate continuous function can be expressed as a superposition of continuous univariate functions and addition. KANs build on this result by replacing the fixed activation functions of MLPs with learnable univariate functions on the network's edges, and the general framework supports stacking these layers into deeper and wider networks to match more complex problem spaces (Figure 1).

Figure 1: Left: the structure of KAN(2,3,1). Right: a simulation of how ϕ_{1,1,1} is calculated.
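The theorem behind this construction can be stated compactly: any continuous function f on [0,1]^n can be written as

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

where the outer functions Φ_q and inner functions ϕ_{q,p} are continuous and univariate. KANs make both sets of univariate functions learnable, which is what the edge functions in Figure 1 depict.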

Function Combinations in FC-KAN

FC-KAN's core contribution is combining the outputs of multiple mathematical functions, such as B-splines and wavelets in the form of a Difference of Gaussians (DoG), on low-dimensional data. The architecture supports element-wise operations, concatenation, and a linear transformation of the combined output, allowing the data representation to be tailored to task-specific requirements (Figure 2).

Figure 2: The structure of FC-KAN and the three types of combined outputs: element-wise, concatenation, and linearization.
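A minimal sketch of this combination scheme, assuming simplified stand-ins for the basis functions (the actual FC-KAN implementation uses learnable PyTorch layers; `bspline_like` and the exact polynomial forms here are illustrative only, not the authors' code):

```python
import numpy as np

def bspline_like(x):
    # stand-in for a B-spline basis output (a simple hat function)
    return np.clip(1.0 - np.abs(x), 0.0, None)

def dog(x, s1=1.0, s2=2.0):
    # Difference of Gaussians: a narrow Gaussian minus a wider one
    return np.exp(-x**2 / (2 * s1**2)) - np.exp(-x**2 / (2 * s2**2))

def combine(a, b, mode="sum"):
    # element-wise combinations of two function outputs
    if mode == "sum":
        return a + b
    if mode == "product":
        return a * b
    if mode == "sum_product":
        return a + b + a * b
    if mode == "quadratic":
        # one possible quadratic form, (a + b)^2 expanded element-wise;
        # the paper's exact polynomial may differ
        return a**2 + 2 * a * b + b**2
    raise ValueError(f"unknown mode: {mode}")

x = np.linspace(-2.0, 2.0, 5)
out = combine(bspline_like(x), dog(x), mode="sum_product")
```

The key design point is that every mode operates element-wise on outputs of the same shape, so combinations add negligible parameters compared with concatenation followed by a learned linear map.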

Experimental Evaluation

In comprehensive benchmarking, FC-KAN demonstrated superior capability by outperforming both traditional KANs and MLPs. Comparisons conducted on the MNIST and Fashion-MNIST datasets underline the model's efficacy, with FC-KAN leveraging combined function outputs to yield higher validation accuracies. For instance, when combining DoG and B-splines with a quadratic function at the output, FC-KAN achieved notable accuracy gains (Figure 3).

Figure 3: The logarithmic values of training losses for the models over 25 epochs on MNIST and 35 epochs on Fashion-MNIST.

Combination Methodologies

The study investigated several output-combination techniques, including element-wise summation, element-wise products, and quadratic-function representations. Element-wise combinations performed best, suggesting that they capture more data features than the alternatives. The quadratic-function representation consistently yielded the highest validation accuracies, albeit at increased computational cost; cubic functions behaved similarly, but their accuracy gains plateaued relative to the quadratic form (Figure 4).

Figure 4: Various data combinations performed using element-wise operations (addition + and multiplication ⊙) over two given outputs.
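For the concatenation and linearization variants mentioned above, a hedged sketch (the random matrix `W` stands in for a learnable linear layer; names and shapes are illustrative, not taken from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def combine_concat_linear(a, b, W=None):
    # concatenate two function outputs along the feature axis,
    # then project back to the original width with a linear map
    # (random here; learnable in a real network)
    z = np.concatenate([a, b], axis=-1)
    if W is None:
        W = rng.standard_normal((z.shape[-1], a.shape[-1]))
    return z @ W

# batch of 4 samples, 8 features per function output
a = rng.standard_normal((4, 8))
b = rng.standard_normal((4, 8))
out = combine_concat_linear(a, b)
```

Unlike the element-wise modes, this variant doubles the feature width before projecting back, so it trades extra parameters and compute for a learned mixing of the two function outputs.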

Empirical Insights and Performance Implications

Empirical analysis suggests that FC-KAN requires strategic selection of function combinations to maximize performance gains. Basic linear combinations serve as a baseline, whereas more complex operations like addition with products or higher-degree models provide nuanced improvements. Nevertheless, the scalability of these operations depends on data dimensionality, necessitating computational trade-offs (Figure 5).

Figure 5: The validation accuracy values of the models across various data subsets.

Conclusion

FC-KAN successfully advances the application of function combinations in KAN architectures, achieving significant performance improvements across standard image classification tasks. The research emphasizes the need for targeted combination strategies to harness KAN's theoretical properties effectively. Future research directions should consider dynamic adaptation techniques for function selection and combinations to further optimize neural architecture designs. By doing so, practical implementation of function combinations in neural network models can be further refined for broader applications.
