
Dual Propagation: Neural & Optical Systems

Updated 18 March 2026
  • Dual Propagation (DP) is a method that uses dyadic neuron compartments to encode both error and activity for efficient, single-phase learning in energy-based neural networks.
  • DP achieves BP-level performance by replacing dual-phase inference with simultaneous state encoding, enabling local Hebbian weight updates and reduced computational overhead.
  • In fiber optics, DP models dual-polarization signals in a four-dimensional space to accurately predict nonlinear interference, enhancing spectral efficiency and network design.

Dual Propagation (DP) refers to two distinct, context-specific technical concepts: (1) a biologically plausible alternative to back-propagation for energy-based neural networks based on compartmental dyadic neurons, and (2) a foundational signal-encoding approach in dual-polarization optical fiber communication systems with a four-dimensional modulation space. Each instantiation of dual propagation is central to its respective field and underpinned by precise mathematical formulations and empirical validations.

1. Dual Propagation in Energy-Based Neural Networks

The dual propagation algorithm for layered energy-based networks is constructed as a single-phase, activity-difference-based learning alternative to both contrastive Hebbian learning (CHL) and equilibrium propagation (EP) (Høier et al., 2023).

Theoretical Foundation and Motivation

Back-propagation (BP), while computationally efficient, is biologically implausible due to non-local weight transport and tightly coupled forward-backward synchronization. CHL and EP propose biologically inspired learning through dual inference phases (free and clamped/nudged), updating weights by correlational differences between phases. However, on digital hardware, CHL and EP require computationally expensive fixed-point inferences per phase, resulting in runtimes over 100x slower than BP. DP aims to eliminate this inefficiency by:

  • Encoding error-activity duality within dyadic neuron compartments, replacing temporal phase alternation with simultaneous state encoding.
  • Selecting an energy functional permitting layer-wise closed-form inference solutions, reducing runtime to that of BP.

Formulation with Dyadic Neurons

Each artificial neuron is a "dyad," maintaining two internal sub-states $z^{+}_{k,i}$ and $z^{-}_{k,i}$. The mean and difference states within layer $k$ are defined as:

  • $\bar{z}_k = \alpha z^{+}_k + (1-\alpha) z^{-}_k$ (activity encoding)
  • $\delta_k = (z^{-}_k - z^{+}_k)/\beta_k$ (error encoding)

with $\alpha \in [0,1]$ (typically $\alpha = 1/2$), and $\beta_k > 0$ (nudging scale parameter).
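As a concrete illustration, both encodings are simple elementwise combinations of the two compartment states. The sketch below computes them in NumPy; the state values are arbitrary and only illustrate the definitions above.

```python
import numpy as np

# Mean (activity) and difference (error) encodings of one dyadic layer,
# following the definitions in the text; the state values are illustrative.
def dyadic_encodings(z_plus, z_minus, alpha=0.5, beta_k=1.0):
    z_bar = alpha * z_plus + (1 - alpha) * z_minus  # activity encoding
    delta = (z_minus - z_plus) / beta_k             # error encoding
    return z_bar, delta

z_bar, delta = dyadic_encodings(np.array([0.6, -0.2]), np.array([0.4, 0.1]))
# With alpha = 1/2: z_bar = [0.5, -0.05], delta = [-0.2, 0.3]
```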

The energy-based min–max objective for a sample $(x, y)$ is:

$$L_\alpha(\theta) = \min_{z^+} \max_{z^-} \; \left\{ \alpha\, \ell(z^+_L) + (1-\alpha)\, \ell(z^-_L) + \sum_{k=1}^{L} \frac{1}{\beta_k}\Big[G_k(z^+_k) - G_k(z^-_k) + (z^-_k - z^+_k)^\top W_{k-1}\bar{z}_{k-1}\Big] \right\}$$

where $G_k$ is a strictly convex function whose gradient inverse defines the activation $f_k = (\nabla G_k)^{-1}$.
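A familiar instance of this conjugate pairing: the binary negative entropy $G(z) = z\log z + (1-z)\log(1-z)$ has the logit as its gradient, so the corresponding activation $f = (\nabla G)^{-1}$ is the logistic sigmoid. The sketch below checks this numerically; it is one illustrative pair, not the only admissible choice.

```python
import numpy as np

# Conjugate pair: grad G(z) = log(z) - log(1-z) is the logit,
# whose inverse is the logistic sigmoid, so f = (grad G)^{-1}.
def grad_G(z):
    return np.log(z) - np.log(1.0 - z)  # logit

def f(a):
    return 1.0 / (1.0 + np.exp(-a))     # sigmoid

a = np.linspace(-3.0, 3.0, 7)
assert np.allclose(grad_G(f(a)), a)     # f inverts grad G
```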

Layer-Wise Inference in Closed Form

For $\alpha = 1/2$, the optimality conditions yield simultaneous, closed-form updates for all $z^{\pm}_k$:

$$z^{+}_k \leftarrow f_k\left(W_{k-1}\bar{z}_{k-1} + \frac{\alpha \beta_k}{\beta_{k+1}} W_k^\top \big(z^{+}_{k+1} - z^{-}_{k+1}\big)\right)$$

$$z^{-}_k \leftarrow f_k\left(W_{k-1}\bar{z}_{k-1} - \frac{(1-\alpha)\beta_k}{\beta_{k+1}} W_k^\top \big(z^{+}_{k+1} - z^{-}_{k+1}\big)\right)$$

For common loss functions (e.g., least-squares, linearized cross-entropy), the output-layer updates have a closed form as well. Layer states are computed with one sequential forward sweep (means) and one backward sweep (differences); all $z^{\pm}$ converge exactly to the block-wise optimizer, eliminating the need for iterative or dual-phase inference.
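The two sweeps are straightforward to implement. The sketch below is a minimal NumPy illustration assuming $\alpha = 1/2$, a uniform nudging scale $\beta_k = \beta$, tanh hidden layers, and a linear least-squares output whose compartments are nudged symmetrically so that $\delta_L = z_L - y$; these output and activation choices are illustrative assumptions, not the paper's exact prescription.

```python
import numpy as np

def dp_gradients(Ws, x, y, beta=1e-4):
    """One forward sweep (means) and one backward sweep (differences).

    Assumes tanh hidden layers, a linear output layer, alpha = 1/2,
    and uniform beta_k = beta; returns the local gradients delta_k z_bar^T.
    """
    # Forward sweep: mean states z_bar (a standard forward pass).
    z_bar = [x]
    for W in Ws[:-1]:
        z_bar.append(np.tanh(W @ z_bar[-1]))
    z_bar.append(Ws[-1] @ z_bar[-1])          # linear output layer

    # Output compartments: symmetric nudge so that delta_L = z_L - y.
    err = z_bar[-1] - y
    z_plus = z_bar[-1] - 0.5 * beta * err
    z_minus = z_bar[-1] + 0.5 * beta * err

    # Backward sweep: closed-form compartment updates, local Hebbian gradients.
    grads = [None] * len(Ws)
    for k in range(len(Ws), 0, -1):
        delta_k = (z_minus - z_plus) / beta               # error encoding
        grads[k - 1] = np.outer(delta_k, z_bar[k - 1])    # delta_k z_bar^T
        if k > 1:
            fb = Ws[k - 1].T @ (z_plus - z_minus)         # feedback difference
            pre = Ws[k - 2] @ z_bar[k - 2]
            z_plus = np.tanh(pre + 0.5 * fb)              # alpha = 1/2
            z_minus = np.tanh(pre - 0.5 * fb)
    return grads
```

For small $\beta$ the backward sweep acts as a central difference through each activation, so the returned gradients agree with back-propagation up to $O(\beta^2)$.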

Weight Update Rules

Gradients can be computed locally:

$$\Delta W_{k-1} = -\eta\, \delta_k\, \bar{z}_{k-1}^{\top}$$

where the error signal $\delta_k$ is the normalized difference of the sub-states. This is a purely local Hebbian plasticity rule: synaptic change depends only on presynaptic and postsynaptic local variables.
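In code, the rule is a single outer product per connection matrix; the values below are illustrative placeholders.

```python
import numpy as np

# Local Hebbian update Delta W = -eta * delta_k * z_bar_{k-1}^T:
# only this layer's error encoding and the presynaptic activity appear.
eta = 0.1
delta_k = np.array([0.2, -0.1])          # postsynaptic error encoding
z_bar_prev = np.array([0.5, 0.3, -0.4])  # presynaptic activity encoding
W = np.zeros((2, 3))
W += -eta * np.outer(delta_k, z_bar_prev)
```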

2. Comparative Analysis with Back-Propagation, CHL, and EP

  • BP: forward + backward phases; global updates requiring non-local weight transport; exact gradients; 2 passes, $O(L)$.
  • CHL, EP: 2 inference phases; updates from phase differences; fixed-point gradients; iterative and slow.
  • DP (this work): 1 phase, 2 sweeps; dyadic compartment updates; gradients from local activity differences; 2 passes, $O(L)$.

DP matches BP in both runtime and test accuracy across standard benchmarks. On MNIST, VGG-16/CIFAR-10, VGG-16/CIFAR-100, and ImageNet32x32, test accuracy and epoch times are statistically indistinguishable between BP and DP, while CHL/EP require orders of magnitude more inference iterations and demonstrate significantly reduced accuracy (Høier et al., 2023).

3. Strengths, Limitations, and Extensions

Strengths

  • Single forward and backward sweeps, both in closed form, enable digital runtimes equivalent to BP.
  • Update mechanism is purely local, supporting Hebb-like plasticity and improved biological plausibility.
  • No requirement for explicit phase switching as in CHL/EP; both error and activity are encoded in neuron state.
  • Generalizes to any activation function defined via Fenchel–Young conjugacy.

Limitations

  • Preservation of biological plausibility for completely asymmetric weights requires either symmetric feedback or learned feedback propagation (e.g., Kolen–Pollack mechanism).
  • Sensitivity to the nudging parameter $\beta_L$; mis-setting can destabilize updates.
  • For non-smooth activations, the fixed-point inference is only approximate (subgradient-based).

Potential Extensions

  • Hardware-efficient analog or neuromorphic implementations based on dyadic, compartmental neuron circuits.
  • Quantized or spiking versions, with $\beta_k$ controlling error discretization.
  • Hybrid DP variants employing learned feedback for full asymmetry.
  • Application to recurrent or time-convolutional architectures via reinterpretation of time as stacked layers.

4. Dual Propagation in Dual-Polarization Fiber Optics

In long-haul coherent fiber-optic systems, dual-polarization (DP) refers to encoding information on both polarization states, with each state comprising in-phase and quadrature components: yielding a four-dimensional (4D) modulation space. The term "dual propagation" is often encountered in this context, describing the concurrent transmission of independent signals along orthogonally polarized channels.

For analyzing nonlinear interference (NLI), the formulation of a closed-form perturbative model for DP-4D formats is essential. Liang et al. derived such a model for any DP-4D format with independent symbols, yielding expressions for NLI power generated by self-channel interference (SCI), cross-channel interference (XCI), and multiple-channel interference (MCI) (Liang et al., 2023).

The net NLI after $N_s$ spans can be expressed as:

$$\sigma^2_{ss} = \tilde{\eta}_{ss}\, N_s^{1+\epsilon} P^3,$$

with the signal–noise interaction term included as:

$$\sigma^2_{\mathrm{NLI}} = \sigma^2_{ss} + \sigma^2_{sn}, \qquad \sigma^2_{sn} = 3\xi\,\tilde{\eta}_{ss}\, \sigma^2_{\mathrm{ASE}} P^2$$

where all required moments and triple-integral terms are specified in a unified frequency-domain framework. This closed-form model achieves prediction accuracy within 0.15 dB of split-step Fourier simulations across all tested DP-4D formats, fiber types, and channel counts.
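As a numerical illustration of the scaling law, the snippet below evaluates $\sigma^2_{ss}$ and $\sigma^2_{sn}$; the coefficient $\tilde{\eta}_{ss}$, coherence exponent $\epsilon$, factor $\xi$, and ASE variance are placeholder values, not coefficients derived from the model.

```python
# Closed-form NLI scaling: sigma2_ss = eta_ss * N_s^(1+eps) * P^3 and
# sigma2_sn = 3 * xi * eta_ss * sigma2_ase * P^2. All parameter values
# here are illustrative placeholders, not fitted model coefficients.
def nli_power(P, n_spans, eta_ss=1.2e-3, eps=0.05, xi=1.0, sigma2_ase=1e-4):
    sigma2_ss = eta_ss * n_spans ** (1 + eps) * P ** 3   # signal-signal
    sigma2_sn = 3 * xi * eta_ss * sigma2_ase * P ** 2    # signal-noise beating
    return sigma2_ss + sigma2_sn

# Cubic power dependence: doubling P multiplies the signal-signal term by 8.
ratio = nli_power(2.0, 10, sigma2_ase=0.0) / nli_power(1.0, 10, sigma2_ase=0.0)
# ratio == 8.0
```

The $N_s^{1+\epsilon}$ factor captures partially coherent NLI accumulation across spans, with $\epsilon = 0$ recovering purely incoherent (linear-in-spans) growth.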

5. Empirical Performance and Validation

For energy-based neural networks, empirical results demonstrate that dual propagation matches back-propagation in both accuracy and runtime across multiple computer vision datasets. For example:

  • MNIST MLP: BP test accuracy $98.45\% \pm 0.04$, DP $98.43\% \pm 0.03$; gradient angle between BP and DP $< 11.5^\circ$ throughout training.
  • VGG-16/CIFAR-10: BP $92.26\% \pm 0.23$, DP $92.30\% \pm 0.11$; epoch time $\sim 3.5$ s.
  • VGG-16/ImageNet32x32: BP $41.28\% \pm 0.19$, DP $41.48\% \pm 0.19$; epoch time $\sim 61$ s.

In all cases, DP achieves parity with BP, whereas EP and CHL trail by significant margins in both accuracy and computational cost (Høier et al., 2023).

For fiber optics, model predictions for NLI (including all three nonlinear interaction types and signal–noise beating) match split-step simulations to within $0.15$ dB in SNR and $4\%$ in reach (Liang et al., 2023).

6. Significance and Impact

Dual propagation, as formalized in energy-based networks, is a major step toward reconciling computational efficiency with biological plausibility. By achieving BP-level performance without non-local weight transport or temporal phase alternation, DP offers new avenues for hardware-efficient, robust training regimes and aligns more closely with compartmental neuronal architectures.

In coherent fiber communications, accurate closed-form modeling of NLI in the DP-4D regime underpins system design for maximizing spectral efficiency and reach. The generality and precision of the dual-polarization NLI model support advanced modulation formats and enable more reliable network planning in diverse fiber environments.

Both interpretations of dual propagation are defined by rigorous mathematical constructs, provable computational advantages, and thorough empirical validation on canonical benchmarks in their respective domains (Høier et al., 2023, Liang et al., 2023).
