EqProp: Local Learning for Energy Models
- Equilibrium Propagation is a learning framework that computes local gradients using a two-phase process (free and nudged phases) to train neural and physical systems.
- It bridges traditional backpropagation and biologically plausible rules while extending to recurrent, convolutional, spiking, quantum, and analog architectures.
- Its hardware-friendly design enables energy-efficient, in-memory computation with robust performance even under device nonidealities.
Equilibrium Propagation (EqProp) is a learning framework for energy-based models that enables local, hardware-friendly computation of gradients for training neural and physical systems. By formulating learning as the response of physical networks to a small “nudging” perturbation, EqProp bridges the gap between backpropagation and biologically or physically plausible learning rules. EqProp has been extended to recurrent, convolutional, spiking, oscillator, Lagrangian, quantum, and analog hardware systems, with demonstrated scalability to deep architectures and robustness to hardware nonidealities.
1. Core Principles of Equilibrium Propagation
A system described by state variables $s$ (e.g. neuron activations), fixed inputs $x$, and parameters $\theta$ is assigned an energy $E(\theta, x, s)$. EqProp proceeds in two main phases:
- Free phase: The input $x$ is clamped and the system relaxes to an energy minimum $s^0_\star$, where the output prediction is read.
- Nudged (weakly-clamped) phase: A small scalar $\beta > 0$ is introduced, coupling a supervised cost $C(s, y)$ to the energy, $F = E + \beta\, C$. The system is then nudged to a nearby equilibrium $s^\beta_\star$.
The key theoretical result is that the gradient of the loss (the cost at the free equilibrium, $\mathcal{L}(\theta) = C(s^0_\star, y)$) with respect to the parameters is given by

$$\frac{\partial \mathcal{L}}{\partial \theta} = \lim_{\beta \to 0} \frac{1}{\beta}\left(\frac{\partial E}{\partial \theta}\big(\theta, x, s^\beta_\star\big) - \frac{\partial E}{\partial \theta}\big(\theta, x, s^0_\star\big)\right).$$

This can be evaluated by measuring local observables at the equilibrium points in both phases, yielding a learning rule that does not require explicit backpropagation through the network (Scellier et al., 2016, Peters et al., 28 Mar 2025, Scellier et al., 2017).
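To make the two-phase protocol concrete, the following is a minimal NumPy sketch of EqProp on a toy continuous Hopfield-style network. The specific energy function, layer sizes, relaxation schedule, and names (`relax`, `eqprop_update`) are illustrative assumptions for this article, not taken from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 2
Wx = rng.normal(scale=0.3, size=(n_hid, n_in))   # input-to-hidden couplings
W  = rng.normal(scale=0.3, size=(n_out, n_hid))  # hidden-to-output couplings
rho = np.tanh                                    # pointwise nonlinearity rho

def energy_grad_state(s_h, s_o, x):
    """Gradient of E w.r.t. the state s = (s_h, s_o) for the toy energy
    E = 0.5*(|s_h|^2 + |s_o|^2) - rho(s_o)^T W rho(s_h) - rho(s_h)^T Wx x."""
    dh = s_h - (1.0 - rho(s_h) ** 2) * (W.T @ rho(s_o) + Wx @ x)
    do = s_o - (1.0 - rho(s_o) ** 2) * (W @ rho(s_h))
    return dh, do

def relax(x, y, beta, s_init=None, steps=500, lr=0.05):
    """Relax the total energy F = E + beta*C by gradient descent on the state,
    with cost C = 0.5*|s_o - y|^2 coupled through the nudging factor beta."""
    if s_init is None:
        s_h, s_o = np.zeros(n_hid), np.zeros(n_out)
    else:
        s_h, s_o = s_init[0].copy(), s_init[1].copy()
    for _ in range(steps):
        dh, do = energy_grad_state(s_h, s_o, x)
        do = do + beta * (s_o - y)   # nudging force contributed by the cost C
        s_h -= lr * dh
        s_o -= lr * do
    return s_h, s_o

def eqprop_update(x, y, beta=0.05):
    """One-sided EqProp weight update: (1/beta) times the change in the local
    Hebbian observables between the nudged and free equilibria."""
    free   = relax(x, y, beta=0.0)                   # free phase
    nudged = relax(x, y, beta=beta, s_init=free)     # nudged phase, started at the free equilibrium
    h0, o0 = free
    hb, ob = nudged
    dW  = (np.outer(rho(ob), rho(hb)) - np.outer(rho(o0), rho(h0))) / beta
    dWx = (np.outer(rho(hb), x)       - np.outer(rho(h0), x))       / beta
    return dW, dWx

x, y = rng.normal(size=n_in), np.array([1.0, -1.0])
dW, dWx = eqprop_update(x, y)
W, Wx = W + 0.1 * dW, Wx + 0.1 * dWx   # local, Hebbian-like weight update
```

Because the couplings enter the energy bilinearly, the parameter derivative $\partial E / \partial W_{ij}$ reduces to a product of local activities, which is why the update above needs only quantities measurable at each synapse in the two phases.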
2. Mathematical Structure and Generalizations
The EqProp gradient derivation exploits the symmetry of mixed partial derivatives of the total energy $F = E + \beta\, C$ evaluated at equilibrium:

$$\frac{\partial^2 F}{\partial \theta\, \partial \beta} = \frac{\partial^2 F}{\partial \beta\, \partial \theta}.$$

This forms the basis for the Hebbian-like local learning rule

$$\Delta W_{ij} \propto \frac{1}{\beta}\Big(\rho(s_i^\beta)\, \rho(s_j^\beta) - \rho(s_i^0)\, \rho(s_j^0)\Big),$$

where $\rho$ is a pointwise nonlinearity and $s^0$, $s^\beta$ denote the free and nudged equilibria. This estimate is unbiased up to $O(\beta)$ corrections. Centered (symmetric) difference estimators with nudges at $+\beta$ and $-\beta$ further reduce the bias to $O(\beta^2)$ (Laborieux et al., 2021).
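The centered estimator can be sketched by reusing the `relax` routine from the toy example above (again an illustrative construction rather than the reference implementation of Laborieux et al., 2021):

```python
def eqprop_update_centered(x, y, beta=0.05):
    """Centered EqProp estimate: nudged phases at +beta and -beta
    cancel the O(beta) bias of the one-sided update."""
    free = relax(x, y, beta=0.0)
    h_p, o_p = relax(x, y, beta=+beta, s_init=free)
    h_m, o_m = relax(x, y, beta=-beta, s_init=free)
    dW  = (np.outer(rho(o_p), rho(h_p)) - np.outer(rho(o_m), rho(h_m))) / (2.0 * beta)
    dWx = (np.outer(rho(h_p), x)        - np.outer(rho(h_m), x))        / (2.0 * beta)
    return dW, dWx
```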
EqProp's gradient equivalence to backpropagation is exact in the limit of small nudging ($\beta \to 0$) and for smooth energy functions with unique equilibria. For recurrent and partially asymmetric systems, EqProp matches the gradient of Backpropagation Through Time (BPTT) under infinitesimal nudging and vanishingly slow learning rates (Ernoult et al., 2020).
Generalizations of EqProp address:
- Directed/Asymmetric architectures: Via modified neuron dynamics and Oja-like plasticity (Farinha et al., 2020, Ernoult et al., 2020).
- Spiking neural networks: By mapping time-averaged spike rates and local eligibility traces to the EqProp learning rule (Martin et al., 2020, Lin et al., 4 May 2024).
- Sequence models: Through convergent Hopfield or RNN layers with attention-like mechanisms (Bal et al., 2022).
- Thermal and Quantum systems: Extending the framework to finite-temperature Boltzmann equilibrium and quantum ground states (Massar et al., 14 May 2024, Scellier, 2 Jun 2024).
3. Extensions to Physical and Analog Neural Systems
EqProp is uniquely suited to in-situ learning in physical and analog substrates. Its two-phase protocol relies only on physical observables (local voltages, currents, or occupation numbers) during system relaxation.
- Analog resistive and memristor networks: EqProp computes the update of each conductance from the difference of the squared voltage drops across that resistor in the free and nudged phases, scaled by the nudging factor $1/\beta$ (see the sketch after this list) (Kendall et al., 2020, Döll et al., 13 Dec 2025). Convergence is robust to strong nonlinearity in the device physics, provided the programmable conductance spans at least an order of magnitude (Döll et al., 13 Dec 2025).
- Oscillator and Ising machines: EqProp adapts to phase or binary-spin variables. Weight gradients are obtained by measuring correlation differences of spin or oscillator observables before/after nudging. Experimental demonstrations on D-Wave quantum annealers and oscillator arrays achieve state-of-the-art accuracy on MNIST-scale tasks (Laydevant et al., 2023, Rageau et al., 16 Apr 2025).
- Lagrangian, dynamical, or periodic systems: The principle of EqProp translates to trajectories optimizing an action. The gradient with respect to parameters is given by the difference in the conjugate momenta (derivatives of the Lagrangian) integrated over the nudged and free trajectories (Massar, 12 May 2025, Berneman et al., 25 Jun 2025).
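For the resistive-network case in the first bullet above, the per-resistor update can be sketched as follows; the function name, learning rate, sign, and exact prefactor are assumptions here, since conventions differ across the cited hardware papers.

```python
def conductance_update(v_free, v_nudged, j, k, beta, lr=1e-3):
    """Update for the conductance of a resistor between nodes j and k:
    difference of squared voltage drops in the nudged and free phases,
    scaled by the nudging factor 1/beta. Only local measurements are used."""
    dv_free   = v_free[j]   - v_free[k]     # voltage drop, free phase
    dv_nudged = v_nudged[j] - v_nudged[k]   # voltage drop, nudged phase
    return -lr * (dv_nudged ** 2 - dv_free ** 2) / (2.0 * beta)
```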
4. Algorithmic Variants and Scalability
Several algorithmic advancements have addressed the computational challenges and scalability of EqProp:
- Discrete-time and continual updates: Discrete-time EqProp simplifies the state updates for implementation with standard digital hardware and architectures such as CNNs, and enables “continual EqProp,” in which synaptic changes occur in real time using only locally available instantaneous states (Ernoult et al., 2020).
- Bias correction: Centered difference estimators remove the leading-order bias in the EqProp gradient when using finite nudging $\beta$ (Laborieux et al., 2021).
- Deeper architectures: Introduction of residual (skip) connections and clipped ReLU activations enables convergence and performance parity with backpropagation in networks up to ResNet13 depth, significantly expanding the range of tasks EqProp can solve (P et al., 30 Sep 2025).
- Sequence and attention models: Integrating contractive Hopfield attention allows EqProp to handle complex NLP sequence tasks (Bal et al., 2022).
- Oscillatory and holomorphic frameworks: Holomorphic EqProp (hEP) replaces the two-phase protocol by encoding the loss gradient in the first Fourier mode of activity oscillations, supporting robust gradient estimation even for finite nudges and high noise (Laborieux et al., 2022).
5. Biological and Neuromorphic Significance
The EqProp learning rule is spatially and—through continual formulations—temporally local, requiring only pre- and post-synaptic signals. In contrast to standard backpropagation, EqProp does not require a dedicated backward pass or global memory of activations, aligning with the constraints of biologically plausible synaptic plasticity and neuromorphic implementation (Scellier et al., 2016, Martin et al., 2020). In spike-based networks, the EqProp update rule emerges as a form of Spike-Timing Dependent Plasticity (STDP).
These features enable highly energy-efficient in-memory or event-driven learning on custom hardware such as memristors, analog crossbars, or spintronic/oscillator arrays. Empirical studies report energy savings of several orders of magnitude for EqProp-based training and inference relative to GPU-based backpropagation, assuming sufficient device reliability (Martin et al., 2020, Döll et al., 13 Dec 2025, Kendall et al., 2020).
Furthermore, EqProp naturally synergizes learning and synchronization in coupled-oscillator systems, providing a route to scalable, robust analog AI hardware (Rageau et al., 16 Apr 2025).
6. Theoretical Connections and Limitations
EqProp unifies a spectrum of local learning principles, including:
- Contrastive Hebbian Learning (CHL): Recovers the CHL rule but with an infinitesimal, rather than fully clamped, output nudge for unbiased gradient estimation (Scellier et al., 2016).
- Recurrent Backpropagation: EqProp's transient neural activity in the nudged phase equals the error derivatives computed by recurrent backpropagation, obviating the need for a separate “error circuit” (Scellier et al., 2017, Ernoult et al., 2020).
- Thermodynamic learning: In stochastic or finite-temperature regimes, the EqProp update is a covariance between the observable and cost within the unclamped Boltzmann ensemble (Massar et al., 14 May 2024).
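As a short illustration of the covariance statement, assume a Boltzmann ensemble $p_\theta(s) \propto e^{-E(\theta, x, s)/T}$ and take the loss to be the expected cost $\mathcal{L} = \langle C \rangle$ (notation chosen here for illustration, not taken from the cited paper); differentiating the ensemble average then gives

$$\frac{\partial \mathcal{L}}{\partial \theta}
  = \frac{\partial}{\partial \theta}\,\frac{\sum_s C(s)\, e^{-E(\theta,x,s)/T}}{\sum_s e^{-E(\theta,x,s)/T}}
  = -\frac{1}{T}\Big( \langle C\, \partial_\theta E \rangle - \langle C \rangle\, \langle \partial_\theta E \rangle \Big)
  = -\frac{1}{T}\,\mathrm{Cov}\!\big( C,\ \partial_\theta E \big),$$

which is the covariance form of the update quoted above.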
Key limitations include:
- Necessity for symmetric weights (though partially relaxed in recent asymmetric generalizations (Ernoult et al., 2020, Farinha et al., 2020)).
- Requirement for convergent dynamics and unique fixed points.
- Small nudging needed for unbiased gradients; finite $\beta$ introduces bias, so robust estimators (e.g., symmetric nudging) are required.
- For spiking and non-differentiable architectures, additional care in pooling/unpooling, state encoding, and gradient approximation is required (Lin et al., 4 May 2024).
7. Empirical Performance, Robustness, and Outlook
EqProp-trained systems match or approach the performance of backpropagation across domains:
- Dense, convolutional, and residual networks on MNIST and CIFAR-10/100 (2–3% error gap for deep CNNs; parity in residual architectures) (P et al., 30 Sep 2025, Laborieux et al., 2021).
- SNNs trained with EqProp achieve accuracy comparable to BPTT training on MNIST/FashionMNIST with improved memory efficiency (Lin et al., 4 May 2024).
- Robustness up to a critical noise threshold, with optimal learning observed at finite noise—corresponding to built-in regularization and hardware uncertainty tolerance (Peters et al., 28 Mar 2025).
Current research extends EqProp to:
- Quantum/finite-temperature and Lagrangian systems (Scellier, 2 Jun 2024, Massar et al., 14 May 2024, Massar, 12 May 2025).
- Hardware demonstrations in analog, Ising, oscillator, and memristive devices (Kendall et al., 2020, Laydevant et al., 2023, Döll et al., 13 Dec 2025, Rageau et al., 16 Apr 2025).
- Deep recurrent and sequence-processing networks (Bal et al., 2022).
These advances position EqProp as a leading biologically and physically plausible learning principle at the intersection of neuromorphic engineering, analog AI devices, and theoretical neuroscience.