Energy-based learning algorithms for analog computing: a comparative study (2312.15103v1)

Published 22 Dec 2023 in cs.LG and cs.CV

Abstract: Energy-based learning algorithms have recently gained a surge of interest due to their compatibility with analog (post-digital) hardware. Existing algorithms include contrastive learning (CL), equilibrium propagation (EP) and coupled learning (CpL), all of which contrast two states and differ in the type of perturbation used to obtain the second state from the first. However, these algorithms have never been explicitly compared on equal footing with the same models and datasets, making it difficult to assess their scalability and decide which one to select in practice. In this work, we carry out a comparison of seven learning algorithms, namely CL and different variants of EP and CpL depending on the signs of the perturbations. Specifically, using these learning algorithms, we train deep convolutional Hopfield networks (DCHNs) on five vision tasks (MNIST, F-MNIST, SVHN, CIFAR-10 and CIFAR-100). We find that, while all algorithms yield comparable performance on MNIST, important differences in performance arise as the difficulty of the task increases. Our key findings reveal that negative perturbations are better than positive ones, and highlight the centered variant of EP (which uses two perturbations of opposite sign) as the best-performing algorithm. We also support these findings with theoretical arguments. Additionally, we establish new SOTA results with DCHNs on all five datasets, both in performance and speed. In particular, our DCHN simulations are 13.5 times faster than those of Laborieux et al. (2021), thanks to a novel energy minimisation algorithm based on asynchronous updates, combined with reduced precision (16 bits).
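
The comparison hinges on how the loss gradient is estimated from two (or three) equilibrium states. Below is a minimal sketch of the positively-perturbed, negatively-perturbed and centered variants of equilibrium propagation on a toy scalar model; the quadratic energy, the functions settle, dE_dtheta and ep_gradient, and all hyperparameter values are illustrative assumptions rather than the paper's DCHN setup, and the contrastive-learning and coupled-learning variants (which act on the output units directly) are omitted.

def settle(theta, x, y, beta, steps=500, lr=0.05):
    """Relax the state s to a minimum of F(s) = E(theta, s, x) + beta * C(s, y)
    by gradient descent on s.
    Toy energy: E = 0.5*s**2 - theta*x*s   (a single Hopfield-like unit)
    Toy cost:   C = 0.5*(s - y)**2
    """
    s = 0.0
    for _ in range(steps):
        dF_ds = (s - theta * x) + beta * (s - y)
        s -= lr * dF_ds
    return s

def dE_dtheta(theta, x, s):
    # Partial derivative of the toy energy with respect to the weight theta.
    return -x * s

def ep_gradient(theta, x, y, beta, variant="centered"):
    s_free = settle(theta, x, y, beta=0.0)            # first (free) state
    if variant == "positive":                         # nudge the state towards the target
        s_pos = settle(theta, x, y, +beta)
        return (dE_dtheta(theta, x, s_pos) - dE_dtheta(theta, x, s_free)) / beta
    if variant == "negative":                         # nudge the state away from the target
        s_neg = settle(theta, x, y, -beta)
        return (dE_dtheta(theta, x, s_free) - dE_dtheta(theta, x, s_neg)) / beta
    # Centered variant: two perturbations of opposite sign.
    s_pos = settle(theta, x, y, +beta)
    s_neg = settle(theta, x, y, -beta)
    return (dE_dtheta(theta, x, s_pos) - dE_dtheta(theta, x, s_neg)) / (2 * beta)

theta, x, y, beta = 0.3, 1.0, 0.8, 0.1
print("exact:", (theta * x - y) * x)                  # true loss gradient for this toy model
for v in ("positive", "negative", "centered"):
    print(v, ep_gradient(theta, x, y, beta, v))

On this toy problem the centered estimate lands closest to the exact gradient (-0.5 here), while the one-sided estimates carry a bias that is first order in beta; this is the standard argument behind the centered variant of EP that the abstract highlights as the best performer. The paper's actual experiments use deep convolutional Hopfield networks, not this scalar illustration.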

References (50)
  1. L. B. Almeida. A learning rule for asynchronous perceptrons with feedback in a combinatorial environment. In Proceedings of the First International Conference on Neural Networks, volume 2, pages 609–618. IEEE, 1987.
  2. Frequency propagation: Multi-mechanism learning in nonlinear physical networks. arXiv preprint arXiv:2208.08862, 2022.
  3. Learning by non-interfering feedback chemical signaling in physical networks. Physical Review Research, 5(2):023024, 2023.
  4. P. Baldi and F. Pineda. Contrastive learning and neural oscillations. Neural Computation, 3(4):526–545, 1991.
  5. Demonstration of decentralized physics-driven learning. Physical Review Applied, 18(1):014040, 2022.
  6. Circuits that train themselves: decentralized, physics-driven learning. In AI and Optical Data Sciences IV, volume 12438, pages 115–117. SPIE, 2023.
  7. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
  8. Y. Du and I. Mordatch. Implicit generation and modeling with energy based models. Advances in Neural Information Processing Systems, 32, 2019.
  9. Updates of equilibrium prop match gradients of backprop through time in an RNN with static input. In Advances in Neural Information Processing Systems, pages 7079–7089, 2019.
  10. Bounds all around: training energy-based models with bidirectional bounds. Advances in Neural Information Processing Systems, 34:19808–19821, 2021.
  11. Your classifier is secretly an energy based model and you should treat it like one. arXiv preprint arXiv:1912.03263, 2019.
  12. D. Hexner. Training precise stress patterns. Soft Matter, 19(11):2120–2126, 2023.
  13. Boltzmann machines: Constraint satisfaction networks that learn. Carnegie-Mellon University, Department of Computer Science, Pittsburgh, PA, 1984.
  14. Dual propagation: Accelerating contrastive Hebbian learning with dyadic neurons. In Proceedings of the 40th International Conference on Machine Learning, pages 13141–13156, 2023.
  15. J. J. Hopfield. Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences, 81(10):3088–3092, 1984.
  16. J. Kendall. A gradient estimator for time-varying electrical networks with non-linear dissipation. arXiv preprint arXiv:2103.05636, 2021.
  17. Training end-to-end analog neural networks with equilibrium propagation. arXiv preprint arXiv:2006.01981, 2020.
  18. Impacts of Feedback Current Value and Learning Rate on Equilibrium Propagation Performance. In 2022 20th IEEE Interregional NEWCAS Conference (NEWCAS), pages 519–523, Quebec City, Canada, June 2022. IEEE. doi: 10.1109/NEWCAS52662.2022.9842178. URL https://hal.telecom-paris.fr/hal-03779416.
  19. Learning multiple layers of features from tiny images. 2009. URL https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
  20. A. Laborieux and F. Zenke. Holomorphic equilibrium propagation computes exact gradients through finite size oscillations. Advances in Neural Information Processing Systems, 35:12950–12963, 2022.
  21. A. Laborieux, M. Ernoult, B. Scellier, Y. Bengio, J. Grollier, and D. Querlioz. Scaling equilibrium propagation to deep convnets by drastically reducing its gradient estimator bias. Frontiers in Neuroscience, 15:129, 2021.
  22. Training dynamical binary neural networks with equilibrium propagation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4640–4649, 2021.
  23. Training an Ising machine with equilibrium propagation. arXiv preprint arXiv:2305.18321, 2023.
  24. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  25. Backpropagation and the brain. Nature Reviews Neuroscience, pages 1–12, 2020.
  26. Neurons learn by predicting future activity. Nature Machine Intelligence, 4(1):62–72, 2022.
  27. The least-control principle for local learning at equilibrium. Advances in Neural Information Processing Systems, 35:33603–33617, 2022.
  28. Backpropagation at the infinitesimal inference limit of energy-based models: Unifying predictive coding, equilibrium propagation, and contrastive Hebbian learning. arXiv preprint arXiv:2206.02629, 2022.
  29. J. R. Movellan. Contrastive Hebbian learning in the continuous Hopfield model. In Connectionist Models, pages 10–17. Elsevier, 1991.
  30. Reading digits in natural images with unsupervised feature learning. 2011.
  31. Learning non-convergent non-persistent short-run MCMC toward energy-based model. Advances in Neural Information Processing Systems, 32, 2019.
  32. S. Park and O. Simeone. Predicting flat-fading channels via meta-learned closed-form linear filters and equilibrium propagation. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8817–8821. IEEE, 2022.
  33. Automatic differentiation in PyTorch. 2017.
  34. E. Peterson and A. Lavin. Physical computing for materials acceleration platforms. Matter, 5(11):3586–3596, 2022.
  35. F. J. Pineda. Generalization of back-propagation to recurrent neural networks. Physical Review Letters, 59(19):2229, 1987.
  36. B. Scellier. A deep learning theory for neural networks grounded in physics. PhD thesis, Université de Montréal, 2021.
  37. B. Scellier and Y. Bengio. Equilibrium propagation: Bridging the gap between energy-based models and backpropagation. Frontiers in Computational Neuroscience, 11:24, 2017.
  38. Agnostic physics-driven deep learning. arXiv preprint arXiv:2205.15021, 2022.
  39. Training deep neural networks via direct loss minimization. In International Conference on Machine Learning, pages 2169–2177. PMLR, 2016.
  40. M. Stern and A. Murugan. Learning without neurons in physical systems. Annual Review of Condensed Matter Physics, 14:417–441, 2023.
  41. Supervised learning in physical networks: From machine learning to learning machines. Physical Review X, 11(2):021045, 2021.
  42. Physical learning beyond the quasistatic limit. Physical Review Research, 4(2):L022037, 2022.
  43. Physical learning of power-efficient solutions. arXiv preprint arXiv:2310.10437, 2023.
  44. Energy-based analog neural network framework. Frontiers in Computational Neuroscience, 17:1114651, 2023.
  45. J. C. Whittington and R. Bogacz. Theories of error back-propagation in the brain. Trends in cognitive sciences, 23(3):235–250, 2019.
  46. Desynchronous learning in a physics-driven learning network. The Journal of Chemical Physics, 156(14):144903, 2022.
  47. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747, 2017.
  48. Activity-difference training of deep neural networks using memristor crossbars. Nature Electronics, 6(1):45–51, 2023.
  49. N. Zucchet and J. Sacramento. Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation. Neural Computation, 34(12):2309–2346, 2022.
  50. A contrastive rule for meta-learning. Advances in Neural Information Processing Systems, 35:25921–25936, 2022.