Dual Propagation: Neural & Optical Systems
- Dual Propagation (DP) is a method that uses dyadic neuron compartments to encode both error and activity for efficient, single-phase learning in energy-based neural networks.
- DP achieves BP-level performance by replacing dual-phase inference with simultaneous state encoding, enabling local Hebbian weight updates and reduced computational overhead.
- In fiber optics, DP models dual-polarization signals in a four-dimensional space to accurately predict nonlinear interference, enhancing spectral efficiency and network design.
Dual Propagation (DP) refers to two distinct, context-specific technical concepts: (1) a biologically plausible alternative to back-propagation for energy-based neural networks based on compartmental dyadic neurons, and (2) a foundational signal-encoding approach in dual-polarization optical fiber communication systems with a four-dimensional modulation space. Each instantiation of dual propagation is central to its respective field and underpinned by precise mathematical formulations and empirical validations.
1. Dual Propagation in Energy-Based Neural Networks
The dual propagation algorithm for layered energy-based networks is constructed as a single-phase, activity-difference-based learning alternative to both contrastive Hebbian learning (CHL) and equilibrium propagation (EP) (Høier et al., 2023).
Theoretical Foundation and Motivation
Back-propagation (BP), while computationally efficient, is biologically implausible due to non-local weight transport and tightly coupled forward-backward synchronization. CHL and EP propose biologically inspired learning through dual inference phases (free and clamped/nudged), updating weights by correlational differences between phases. However, on digital hardware, CHL and EP require computationally expensive fixed-point inferences per phase, resulting in runtimes over 100x slower than BP. DP aims to eliminate this inefficiency by:
- Encoding error-activity duality within dyadic neuron compartments, replacing temporal phase alternation with simultaneous state encoding.
- Selecting an energy functional permitting layer-wise closed-form inference solutions, reducing runtime to that of BP.
Formulation with Dyadic Neurons
Each artificial neuron is a "dyad," maintaining two internal sub-states $z_k^+$ and $z_k^-$. The mean and difference states within layer $k$ are defined as:
- $\bar z_k = \alpha z_k^+ + (1-\alpha)\, z_k^-$ (activity encoding)
- $\delta_k = z_k^+ - z_k^-$ (error encoding)
with $\alpha \in [0,1]$ (typically $\alpha = \tfrac{1}{2}$), and $\beta > 0$ (nudging scale parameter).
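A minimal numeric sketch of this dyadic encoding (the particular sub-state values and the $\alpha$, $\beta$ settings are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Hypothetical dyadic neuron states for one layer of three neurons.
# z_plus / z_minus are the two internal sub-states of each dyad.
alpha = 0.5          # mixing weight (alpha = 1/2 is the symmetric choice)
beta = 0.1           # nudging scale used to normalize the error signal

z_plus = np.array([0.8, 0.2, 0.5])
z_minus = np.array([0.6, 0.3, 0.4])

z_bar = alpha * z_plus + (1 - alpha) * z_minus   # activity encoding (mean)
error = (z_plus - z_minus) / beta                # error encoding (scaled difference)
```

Both signals live in the same neuron at the same time, which is what removes the need for separate free and nudged phases.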
The energy-based min–max objective for a sample $(x, y)$ is:
$$\min_{z^+} \max_{z^-} \; \beta\, \ell(\bar z_L, y) + \sum_{k=1}^{L} \Big[ G_k(z_k^+) - G_k(z_k^-) - (z_k^+ - z_k^-)^\top W_{k-1}\, \bar z_{k-1} \Big],$$
where $G_k$ is a strictly convex function inducing the inverse activation $f_k^{-1} = \nabla G_k$.
Layer-Wise Inference in Closed Form
For $\alpha = \tfrac{1}{2}$, the optimality conditions yield simultaneous, closed-form updates for all $k < L$:
$$z_k^{\pm} = f_k\!\Big(W_{k-1}\, \bar z_{k-1} \pm \tfrac{1}{2}\, W_k^\top \delta_{k+1}\Big).$$
For common loss functions (e.g., least-squares, linearized cross-entropy), the output layer updates have a closed form as well. Layer states are computed with one sequential forward sweep (means) and a backward sweep (differences); all converge exactly to the block-wise optimizer, eliminating the need for iterative or dual-phase inference.
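The two sweeps can be sketched for a small tanh MLP with least-squares output nudging (the layer sizes, random weights, and loss choice are illustrative assumptions; the point is that one forward sweep over means and one backward sweep over differences suffice):

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh  # activation; the convex potential G induces its inverse

# Hypothetical 3-layer MLP: sizes and weights are illustrative assumptions.
Ws = [rng.standard_normal((8, 4)) * 0.3,
      rng.standard_normal((6, 8)) * 0.3,
      rng.standard_normal((2, 6)) * 0.3]

def dp_inference(x, y, Ws, beta=0.1):
    """One forward sweep (mean states) and one backward sweep (differences),
    alpha = 1/2, least-squares output nudging."""
    z_bar = [x]                        # forward sweep: mean activities
    for W in Ws:
        z_bar.append(f(W @ z_bar[-1]))
    deltas = [None] * len(z_bar)       # backward sweep: sub-state differences
    deltas[-1] = -beta * (z_bar[-1] - y)
    for k in range(len(Ws) - 1, 0, -1):
        slope = 1.0 - z_bar[k] ** 2              # tanh' expressed via z_bar
        deltas[k] = slope * (Ws[k].T @ deltas[k + 1])
    return z_bar, deltas

x, y = np.ones(4), np.zeros(2)
z_bar, deltas = dp_inference(x, y, Ws)
```

Because each layer's difference state has a closed form given its neighbors, no fixed-point iteration is needed: the cost is two passes, exactly as in BP.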
Weight Update Rules
Gradients can be computed locally:
$$\Delta W_k \propto e_{k+1}\, \bar z_k^\top, \qquad e_{k+1} = \frac{z_{k+1}^+ - z_{k+1}^-}{\beta},$$
where the error signal $e_{k+1}$ is the normalized difference of the sub-states. This is a purely local Hebbian plasticity rule: synaptic change depends only on the presynaptic mean activity $\bar z_k$ and the postsynaptic local variables $z_{k+1}^{\pm}$.
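The locality of the rule is easy to see in code: the update for one weight matrix touches only the adjacent pre- and postsynaptic quantities (all numeric values and the learning rate below are illustrative assumptions):

```python
import numpy as np

# Purely local update: Delta W_k depends only on the presynaptic mean
# activity z_bar_k and the postsynaptic sub-state difference delta_{k+1}.
beta, lr = 0.1, 0.5

z_bar_pre = np.array([0.7, 0.25, 0.45])   # presynaptic mean activities
delta_post = np.array([0.02, -0.01])      # postsynaptic sub-state differences

error_post = delta_post / beta            # normalized error signal e_{k+1}
dW = lr * np.outer(error_post, z_bar_pre) # outer-product Hebbian update
```

No global error transport is required; every synapse sees only the two neurons it connects.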
2. Comparative Analysis with Back-Propagation, CHL, and EP
| Method | Phases Needed | Update Mechanism | Gradient Source | Empirical Efficiency |
|---|---|---|---|---|
| BP | Forward+Backward | Global, non-local transport | Exact | 2 passes, O(L) |
| CHL, EP | 2 inference phases | Phase difference | Fixed-point iteration | Iterative, slow |
| DP (Høier et al., 2023) | 1 phase, 2 sweeps | Dyadic compartments | Local activity differences | 2 passes, O(L) |
DP matches BP in both runtime and test accuracy across standard benchmarks. On MNIST, VGG-16/CIFAR-10, VGG-16/CIFAR-100, and ImageNet32x32, test accuracy and epoch times are statistically indistinguishable between BP and DP, while CHL/EP require orders of magnitude more inference iterations and demonstrate significantly reduced accuracy (Høier et al., 2023).
3. Strengths, Limitations, and Extensions
Strengths
- Single forward and backward sweeps, both in closed form, enable digital runtimes equivalent to BP.
- Update mechanism is purely local, supporting Hebb-like plasticity and improved biological plausibility.
- No requirement for explicit phase switching as in CHL/EP; both error and activity are encoded in neuron state.
- Generalizes to any activation function defined via Fenchel–Young conjugacy.
Limitations
- Preservation of biological plausibility for completely asymmetric weights requires either symmetric feedback or learned feedback propagation (e.g., Kolen–Pollack mechanism).
- Sensitivity to the nudging parameter $\beta$; mis-setting can destabilize updates.
- For non-smooth activations, the fixed-point inference is only approximate (subgradient-based).
Potential Extensions
- Hardware-efficient analog or neuromorphic implementations based on dyadic, compartmental neuron circuits.
- Quantized or spiking versions, with $\beta$ controlling error discretization.
- Hybrid DP variants employing learned feedback for full asymmetry.
- Application to recurrent or time-convolutional architectures via reinterpretation of time as stacked layers.
4. Dual Propagation in Dual-Polarization Fiber Optics
In long-haul coherent fiber-optic systems, dual-polarization (DP) refers to encoding information on both polarization states of the optical field, with each state comprising in-phase and quadrature components ($I_x, Q_x, I_y, Q_y$), yielding a four-dimensional (4D) modulation space. The term "dual propagation" is often encountered in this context, describing the concurrent transmission of independent signals along orthogonally polarized channels.
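The 4D representation can be sketched as follows; the choice of QPSK per polarization and the stream length are illustrative assumptions only:

```python
import numpy as np

# A dual-polarization symbol: one complex sample per polarization (x and y),
# i.e. four real dimensions (Ix, Qx, Iy, Qy). QPSK per polarization is an
# illustrative format choice, normalized to unit energy per polarization.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

rng = np.random.default_rng(42)
sx = rng.choice(qpsk, size=1000)   # x-polarization symbol stream
sy = rng.choice(qpsk, size=1000)   # y-polarization symbol stream

# Stack into the 4D real modulation space.
symbols_4d = np.column_stack([sx.real, sx.imag, sy.real, sy.imag])
```

Treating the two polarizations jointly as one 4D constellation is what enables format-dependent NLI modeling below the level of two independent 2D channels.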
For analyzing nonlinear interference (NLI), the formulation of a closed-form perturbative model for DP-4D formats is essential. Liang et al. derived such a model for any DP-4D format with independent symbols, yielding expressions for NLI power generated by self-channel interference (SCI), cross-channel interference (XCI), and multiple-channel interference (MCI) (Liang et al., 2023).
The net NLI power after $N_s$ spans can be expressed as the sum of the three interference contributions:
$$\sigma_{\mathrm{NLI}}^2 = \sigma_{\mathrm{SCI}}^2 + \sigma_{\mathrm{XCI}}^2 + \sigma_{\mathrm{MCI}}^2,$$
with the signal–noise interaction term $\sigma_{\mathrm{S\text{-}N}}^2$ included additively, where all required modulation-format moments and triple-integral terms are specified in a unified frequency-domain framework. This closed-form model achieves prediction accuracy within $0.15$ dB of split-step Fourier simulations across all tested DP-4D formats, fiber types, and channel counts.
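The actual closed-form moments and triple integrals are given in Liang et al. (2023); the sketch below only illustrates how the decomposed NLI terms combine with amplifier (ASE) noise into one effective SNR. All power values are made-up illustrative numbers, not results from the paper:

```python
import numpy as np

def effective_snr(p_tx_dbm, p_ase_dbm, sci_dbm, xci_dbm, mci_dbm, sn_dbm):
    """Combine ASE noise with the decomposed NLI terms (SCI + XCI + MCI)
    and the signal-noise interaction term into an effective SNR in dB.
    All inputs are per-channel powers in dBm."""
    to_mw = lambda dbm: 10 ** (dbm / 10)     # dBm -> linear (mW)
    noise_mw = (to_mw(p_ase_dbm) + to_mw(sci_dbm) + to_mw(xci_dbm)
                + to_mw(mci_dbm) + to_mw(sn_dbm))
    return 10 * np.log10(to_mw(p_tx_dbm) / noise_mw)

# Illustrative values: 0 dBm launch power, -20 dBm ASE, NLI terms below that.
snr = effective_snr(0.0, -20.0, -25.0, -23.0, -30.0, -35.0)
```

The additive combination in the linear (mW) domain mirrors the additive decomposition of the NLI variance in the closed-form model.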
5. Empirical Performance and Validation
For energy-based neural networks, empirical results demonstrate that dual propagation matches back-propagation in both accuracy and runtime across multiple computer vision datasets. For example:
- MNIST MLP: BP and DP reach matching test accuracy, and the angle between BP and DP gradients remains small throughout training.
- VGG-16/CIFAR-10: BP and DP test accuracies agree within run-to-run variance, with near-identical epoch times.
- VGG-16/ImageNet32x32: BP and DP likewise achieve statistically indistinguishable accuracy and epoch times.
In all cases, DP achieves parity with BP, whereas EP and CHL trail by significant margins in both accuracy and computational cost (Høier et al., 2023).
For fiber optics, model predictions for NLI (including all three nonlinear interaction types and signal–noise beating) agree with split-step simulations to within $0.15$ dB in SNR, with correspondingly small reach-prediction error (Liang et al., 2023).
6. Significance and Impact
Dual propagation, as formalized in energy-based networks, is a major step toward reconciling computational efficiency with biological plausibility. By achieving BP-level performance without non-local weight transport or temporal phase alternation, DP offers new avenues for hardware-efficient, robust training regimes and aligns more closely with compartmental neuronal architectures.
In coherent fiber communications, accurate closed-form modeling of NLI in the DP-4D regime underpins system design for maximizing spectral efficiency and reach. The generality and precision of the dual-polarization NLI model support advanced modulation formats and enable more reliable network planning in diverse fiber environments.
Both interpretations of dual propagation are defined by rigorous mathematical constructs, provable computational advantages, and thorough empirical validation on canonical benchmarks in their respective domains (Høier et al., 2023, Liang et al., 2023).