- The paper demonstrates that predictive coding can emulate backpropagation through purely local updates, via variants such as Z-IL, offering a brain-inspired learning model.
- It provides empirical evidence on datasets such as MNIST, showing that predictive coding networks perform competitively while supporting multiple tasks, such as classification and denoising, within a single network.
- The study highlights that, by avoiding backpropagation's non-local updates, predictive coding opens the door to scalable, parallel architectures and neuromorphic computing.
Essay: Predictive Coding as an Alternative to Backpropagation in Deep Learning
The paper "Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation?" by Beren Millidge et al. explores the potential of predictive coding (PC) as a substitute for the traditional backpropagation (BP) algorithm in training deep neural networks. The authors focus on the limitations of BP and the advantages of PC, motivated by the way learning occurs in the human brain. While BP requires non-local computations and sequential updates, PC is characterized by local updates and, theoretically, more closely resembles brain functionality. This investigation outlines both the theoretical parallels and empirical performances of predictive coding, suggesting its potential as a viable alternative to current deep learning paradigms.
Theoretical Insights and Connections
Predictive coding originated in theoretical neuroscience as a model of cortical processing. The core idea is to treat the brain as performing simultaneous inference and learning on a hierarchical probabilistic generative model. The crucial advantage is that all computations are local: each synapse's update depends only on the activity of the two neurons it connects and a prediction error available at that layer, in contrast to the non-local error transport demanded by BP. The paper presents a thorough review of the literature connecting PC to normative theories such as the Bayesian brain hypothesis, strengthening its mathematical grounding and its standing as a learning framework.
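To make this locality concrete, here is a minimal NumPy sketch of one training step of a small discriminative predictive coding network. The layer sizes, tanh nonlinearity, number of inference iterations, and learning rates are illustrative assumptions rather than details taken from the paper; the point is that every quantity each update needs is available at the layer being updated.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 128, 10]                       # input, hidden, output (assumed sizes)
W = [rng.normal(0.0, 0.05, (sizes[l + 1], sizes[l])) for l in range(2)]
f = np.tanh
df = lambda a: 1.0 - np.tanh(a) ** 2

def pc_training_step(x_in, target, W, n_infer=20, lr_x=0.1, lr_w=0.005):
    """One PC step: minimize F = sum_l ||eps_l||^2 / 2 over the hidden
    value nodes, then apply purely local, Hebbian-like weight updates."""
    # Initialize value nodes with a feedforward sweep, then clamp the output.
    x = [x_in]
    for Wl in W:
        x.append(Wl @ f(x[-1]))
    x[-1] = target

    for _ in range(n_infer):
        # Prediction errors eps_l = x_l - W_{l-1} f(x_{l-1}), local to each layer.
        eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
        # Gradient descent on F for the (single) hidden layer's value nodes.
        x[1] += lr_x * (-eps[0] + df(x[1]) * (W[1].T @ eps[1]))

    # Each weight matrix is updated from its own error and pre-synaptic activity.
    eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(2)]
    for l in range(2):
        W[l] += lr_w * np.outer(eps[l], f(x[l]))
    return W, eps
```

In a training loop one would call pc_training_step once per example, for instance a flattened MNIST image paired with a one-hot label, iterating over the dataset just as with stochastic gradient descent.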
One of the paper's substantial theoretical contributions is its detailed treatment of the convergence between PC and BP. Previous research has established an approximate equivalence between PC and BP under certain conditions, first on multi-layer perceptrons (MLPs) and later on arbitrary computation graphs. Intriguingly, variants such as Z-IL (zero-divergence inference learning) can emulate BP exactly within the predictive coding framework. The implications are significant: networks could match the performance of BP while relying only on local updates, potentially reshaping parallel computation on neuromorphic hardware.
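The fragment below, reusing the toy network above, sketches what that correspondence looks like numerically under the same illustrative assumptions; it demonstrates the approximate equivalence, not the paper's derivation or Z-IL's exact update schedule.

```python
def bp_gradients(x_in, target, W):
    """Backpropagation through the equivalent 2-weight MLP under an MSE loss,
    L = ||a2 - target||^2 / 2, matching the PC network's feedforward sweep."""
    a1 = W[0] @ f(x_in)                  # same forward pass as the PC initialization
    a2 = W[1] @ f(a1)
    d2 = a2 - target                     # dL/da2
    d1 = df(a1) * (W[1].T @ d2)          # dL/da1, the non-local step PC avoids
    return [np.outer(d1, f(x_in)), np.outer(d2, f(a1))]

# At convergence of inference (and exactly, under Z-IL's timing conditions),
# the local errors eps_l from pc_training_step approximate -d_l, so the update
# W[l] += lr_w * outer(eps[l], f(x[l])) moves the weights in (roughly) the same
# descent direction as W[l] -= lr_w * bp_gradients(x_in, target, W)[l].
```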
Empirical Performance and Flexibility
Empirically, predictive coding networks (PCNs) have demonstrated competitive performance on classical image recognition datasets, matching the capabilities of BP-trained networks. Notably, PCNs are versatile: a single trained network can act as a classifier, a generator, and an associative memory. This multi-modality is a genuine advantage over BP-trained networks, which typically require separate training for distinct tasks. PCNs have been evaluated on datasets such as MNIST and FashionMNIST, with results that support their suitability for varied machine learning problems.
Particularly worth noting is the flexibility PCNs derive from operating as probabilistic generative models: they can handle tasks they were never explicitly trained for, such as image reconstruction and denoising, simply by changing which variables are clamped during inference (a minimal sketch of this follows below). Conventional, discriminatively trained ANNs lack this capability without dedicated retraining.
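The sketch below continues the toy network from the earlier examples and is only meant to illustrate the mechanism: which task a trained PCN performs is determined by which value nodes are clamped during energy minimization, not by retraining. The relaxation schedule, the random stand-in data, and the reuse of the discriminative toy network for generation are all assumptions made for illustration; clamping only a subset of input pixels in the same fashion would give image completion or denoising.

```python
def pc_query(x, free, W, n_infer=200, lr_x=0.05):
    """Minimize the same energy F over the value nodes whose layer indices are
    listed in `free`; all other layers stay clamped to their given values."""
    x = [xi.copy() for xi in x]
    for _ in range(n_infer):
        eps = [x[l + 1] - W[l] @ f(x[l]) for l in range(len(W))]
        for l in free:
            g = np.zeros_like(x[l])
            if l > 0:                    # error between x_l and its own prediction
                g += eps[l - 1]
            if l < len(W):               # error x_l induces in the layer above
                g -= df(x[l]) * (W[l].T @ eps[l])
            x[l] -= lr_x * g
    return x

# Classification: clamp the image at layer 0, relax the hidden and output nodes.
image = rng.random(784)                  # stand-in for a flattened MNIST digit
x = [image, np.zeros(128), np.zeros(10)]
label_scores = pc_query(x, free=[1, 2], W=W)[-1]

# Generation / associative recall: clamp a one-hot label, relax the lower layers.
one_hot = np.eye(10)[3]
x = [np.zeros(784), np.zeros(128), one_hot]
reconstruction = pc_query(x, free=[0, 1], W=W)[0]
```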
Practical Implications and Future Directions
A key prospect raised by the paper is predictive coding's potential to bypass limitations traditionally associated with BP, particularly around the scalability of training. Because PCN updates are local, they lend themselves to parallelization in hardware and can reduce the memory-bandwidth bottlenecks of very large ANNs. The paper also notes that PC adapts naturally to control and robotics tasks, emphasizing its versatility beyond applications such as image classification.
The paper encourages future work on extending PC's applicability, especially to architectures or tasks for which BP is not well suited. Promising directions include a better understanding of how PCNs behave when the conditions required for equivalence with BP are relaxed, the development of PC-specific optimization methods, and the use of PC to design new neural architectures. The potential of PCNs in neuromorphic computing also points toward more efficient computation models, which will matter as the constraints of GPU-based training become more pronounced.
Conclusion
The paper provides a comprehensive survey of predictive coding and its potential to redefine learning in artificial neural networks. By emphasizing the parallels with biological learning and examining the scalability and flexibility of PCNs, it makes the case that predictive coding could overcome BP's critical limitations. As theoretical and empirical evidence accumulates, predictive coding stands as a promising direction for deep learning, warranting continued exploration and development.