PAON: A New Neuron Model using Padé Approximants (2403.11791v1)

Published 18 Mar 2024 in eess.IV and cs.CV

Abstract: Convolutional neural networks (CNNs) are built upon the classical McCulloch-Pitts neuron model, which is essentially a linear model whose nonlinearity is provided by a separate pointwise activation function. Several researchers have proposed enhanced neuron models with stronger nonlinearity than a pointwise activation can provide, including quadratic neurons, generalized operational neurons, generative neurons, and super neurons; there has also been a proposal to use the Padé approximation as a generalized activation function. In this paper, we introduce a new neuron model called the Padé neuron (Paon), inspired by Padé approximants, which give the best approximation of a transcendental function as a ratio of two polynomials of given orders. We show that Paons are a superset of all the other proposed neuron models, so the basic neuron in any known CNN model can be replaced by a Paon. In this paper, we extend the well-known ResNet to PadeNet (built from Paons) to demonstrate the concept. Our experiments on the single-image super-resolution task show that PadeNets can obtain better results than competing architectures.
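For intuition, a Padé approximant has the form R(x) = (a_0 + a_1 x + ... + a_m x^m) / (1 + b_1 x + ... + b_n x^n), a ratio of two polynomials. Below is a minimal sketch of how such a rational response could be realized as a drop-in convolutional neuron: the numerator and denominator are each low-order polynomials in the input, implemented with one convolution per polynomial degree, and the denominator is kept away from zero via 1 + |q|, as in Padé activation units. The class name, polynomial orders, and the exact parameterization are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a Pade-style neuron layer (hypothetical; the paper's
# Paon parameterization may differ). Assumptions: numerator/denominator
# orders m = n = 2, one convolution per polynomial degree, and 1 + |q| to
# keep the denominator bounded away from zero.
import torch
import torch.nn as nn

class PadeConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, m=2, n=2):
        super().__init__()
        pad = kernel_size // 2
        # One convolution per polynomial degree; the polynomial coefficients
        # are folded into the convolution weights.
        self.num = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size, padding=pad) for _ in range(m)
        )
        self.den = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size, padding=pad) for _ in range(n)
        )

    def forward(self, x):
        # Numerator p and denominator q are polynomials in the input:
        # p = conv_1(x) + conv_2(x^2) + ...
        p = sum(conv(x ** (i + 1)) for i, conv in enumerate(self.num))
        q = sum(conv(x ** (j + 1)) for j, conv in enumerate(self.den))
        # 1 + |q| keeps the rational response well defined everywhere.
        return p / (1.0 + q.abs())

# Usage: a drop-in replacement for nn.Conv2d, e.g. inside a residual block.
layer = PadeConv2d(3, 16)
y = layer(torch.randn(1, 3, 32, 32))
```

With m = 1 and zero denominator weights this reduces to an ordinary linear convolution, which is consistent with the abstract's claim that Paons subsume the classical neuron as a special case.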

