
Correlative Information Maximization: A Biologically Plausible Approach to Supervised Deep Neural Networks without Weight Symmetry (2306.04810v3)

Published 7 Jun 2023 in cs.NE, cs.IT, cs.LG, math.IT, and q-bio.NC

Abstract: The backpropagation algorithm has experienced remarkable success in training large-scale artificial neural networks; however, its biological plausibility has been strongly criticized, and it remains an open question whether the brain employs supervised learning mechanisms akin to it. Here, we propose correlative information maximization between layer activations as an alternative normative approach to describe signal propagation in biological neural networks in both forward and backward directions. This new framework addresses many concerns about the biological plausibility of conventional artificial neural networks and the backpropagation algorithm. The coordinate-descent-based optimization of the corresponding objective, combined with a mean square error loss for fitting labeled supervision data, gives rise to a neural network structure that emulates a more biologically realistic network of multi-compartment pyramidal neurons with dendritic processing and lateral inhibitory neurons. Furthermore, our approach provides a natural resolution to the weight symmetry problem between forward and backward signal propagation paths, a significant critique of the plausibility of the conventional backpropagation algorithm. This is achieved by leveraging two alternative yet equivalent forms of the correlative mutual information objective, which intrinsically lead to forward and backward prediction networks without weight symmetry, providing a compelling solution to this long-standing challenge.
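To make the two-forms idea concrete, here is a minimal sketch, assuming the log-det ("correlative") mutual information formulation from the authors' earlier CorInfoMax work; the function names, the eps regularizer, and the batch correlation estimates are illustrative assumptions, not the paper's implementation.

```python
import torch

def logdet_entropy(R, eps=1e-3):
    # Log-det entropy proxy used in CorInfoMax-style objectives:
    # 0.5 * log det(R + eps * I), with eps ensuring positive definiteness.
    n = R.shape[0]
    return 0.5 * torch.logdet(R + eps * torch.eye(n))

def corinfo_forward(x, y, eps=1e-3):
    # Forward-prediction form of the correlative mutual information:
    #   I(x; y) ~ H_logdet(y) - H_logdet(y - W_ff @ x),
    # where W_ff is the least-squares linear predictor of y from x.
    # x: (n_x, N) and y: (n_y, N) hold N activation samples as columns.
    N = x.shape[1]
    Rx  = x @ x.T / N                 # lower-layer correlation
    Ry  = y @ y.T / N                 # upper-layer correlation
    Rxy = x @ y.T / N                 # cross-correlation
    # Optimal forward predictor: W_ff = Ryx @ Rx^{-1}
    W_ff = torch.linalg.solve(Rx + eps * torch.eye(x.shape[0]), Rxy).T
    e  = y - W_ff @ x                 # forward prediction error
    Re = e @ e.T / N                  # error correlation
    return logdet_entropy(Ry, eps) - logdet_entropy(Re, eps)

def corinfo_backward(x, y, eps=1e-3):
    # Backward-prediction form: predicts the lower layer from the upper
    # one with its own weight matrix W_fb. For the exact (unregularized)
    # correlative mutual information the two forms agree in value, yet
    # they are parameterized by independent forward and backward weights.
    return corinfo_forward(y, x, eps)

# Illustrative usage on random activations:
x = torch.randn(20, 500)   # layer-k activations: 20 units, 500 samples
y = torch.randn(10, 500)   # layer-(k+1) activations
print(corinfo_forward(x, y), corinfo_backward(x, y))
```

Because the forward form predicts the upper layer from the lower one while the backward form does the reverse, each direction trains its own weight matrix, and nothing forces the two matrices to be transposes of each other; this is, schematically, how the framework sidesteps the weight transport requirement of backpropagation.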
