Bidirectional Liquid Neural Networks (BiLNN)
- BiLNN is a continuous-time recurrent architecture that fuses learnable ODE dynamics with bidirectional propagation to enforce both Dirichlet and asymptotic boundary conditions.
- It maps complex optical potential parameterizations to nuclear scattering wave functions, achieving sub‑percent error across diverse energies and nuclear species.
- The differentiable design supports gradient‑based optimization and uncertainty quantification, serving as a robust surrogate in nuclear data evaluation.
A Bidirectional Liquid Neural Network (BiLNN) is a class of continuous-time recurrent neural architectures designed for the differentiable emulation of physical boundary-value problems, exemplified by its application to global nucleon-nucleus optical model calculations. BiLNNs synthesize liquid (continuous-time ODE-driven) recurrence and bidirectional propagation to provide a mapping from complex optical potential parameterizations to scattering wave functions, while satisfying physical boundary conditions and preserving analytical differentiability. The architecture enables gradient-based optimization and uncertainty quantification in nuclear modeling, producing observables with sub-percent error and demonstrating transferability across a broad parameter space, including extrapolation to untrained nuclear species (Lei, 27 Dec 2025).
1. Architectural Foundations and Relationship to Liquid/Reservoir Computing
The BiLNN architecture generalizes reservoir computing by employing learnable continuous-time ordinary differential equation (ODE) dynamics and by enforcing bidirectional recurrence tailored to physics boundary-value problems. In contrast to discrete-time gated recurrent architectures (LSTM/GRU), the BiLNN hidden state evolves according to a first-order ODE, $d\mathbf{h}/d\rho = -\boldsymbol{\lambda}\odot(\mathbf{h}-\mathbf{g})$, where $\boldsymbol{\lambda}$ is a learned leak gate and $\mathbf{g}$ is a learned candidate drive. This ODE form admits a closed-form solution over each step $\Delta\rho$, namely $\mathbf{h}(\rho+\Delta\rho)=\mathbf{g}+(\mathbf{h}(\rho)-\mathbf{g})\,e^{-\boldsymbol{\lambda}\Delta\rho}$, mitigating vanishing/exploding gradient pathologies on long sequences.
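The closed-form relaxation step can be made concrete with a minimal sketch. The gating networks that produce the leak and drive are omitted here; `liquid_step` is a hypothetical, fixed-parameter instance of a closed-form liquid update, not the paper's implementation:

```python
import math

def liquid_step(h, lam, g, dstep):
    """One closed-form liquid-cell update: exactly solves
    dh/drho = -lam * (h - g) over a step of size dstep, so each
    component of h relaxes toward the candidate drive g at rate lam."""
    return [gi + (hi - gi) * math.exp(-li * dstep)
            for hi, li, gi in zip(h, lam, g)]

# toy state with hypothetical fixed gates and drive
h = [1.0, -0.5]
lam = [2.0, 0.5]   # positive leak gates => stable exponential decay
g = [0.0, 0.0]     # candidate drive (relaxation target)
for _ in range(1000):
    h = liquid_step(h, lam, g, 0.01)
# the state has relaxed toward g; the per-step factor exp(-lam * dstep)
# is bounded by 1, which is the mechanism behind the gradient stability
```

Because each step interpolates exponentially between the current state and the drive, long products of step Jacobians stay bounded, unlike unconstrained discrete recurrences.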
Bidirectionality is enforced by running two parallel liquid layers: one propagates forward from (origin) to (asymptotic region), while the other propagates backward from to . At each spatial position, the hidden states from both passes are merged, ensuring explicit conditioning on both Dirichlet and asymptotic boundary conditions. This design is particularly well-suited to radial Schrödinger problems, where both boundary behaviors are formally required.
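The scan-and-merge pattern described above can be sketched generically. The per-step transition `step_fn` and the toy transition below are illustrative placeholders, not the paper's cell; the point is the re-alignment of the backward pass so both hidden states at index `i` refer to the same grid point:

```python
def scan(features, h0, step_fn):
    """Run a recurrent pass over the feature sequence, collecting states."""
    hs, h = [], h0
    for f in features:
        h = step_fn(h, f)
        hs.append(h)
    return hs

def bidirectional(features, h0f, h0b, step_fn):
    """Forward pass (origin -> asymptotic region) and backward pass
    (asymptotic region -> origin), re-aligned and concatenated per point."""
    fwd = scan(features, h0f, step_fn)
    bwd = scan(list(reversed(features)), h0b, step_fn)[::-1]
    return [hf + hb for hf, hb in zip(fwd, bwd)]  # list concatenation

# toy 1-dimensional transition for demonstration only
step = lambda h, f: [0.5 * h[0] + f]
merged = bidirectional([1.0, 2.0, 3.0], [0.0], [0.0], step)
```

Each merged state carries information from both boundaries, which is what lets the decoder satisfy the Dirichlet condition at the origin and the asymptotic condition simultaneously.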
2. Mathematical Formulation and Internal Dynamics
Spatial coordinates $r$ are mapped to a dimensionless phase-space form $\rho = kr$, where $k$ is the wave number. The network operates on discretized values $\rho_i$ ($i = 1, \dots, N_\rho$). At each $\rho_i$, the forward and backward hidden states, $\mathbf{h}_i^{f}$ and $\mathbf{h}_i^{b}$, are updated via the closed-form liquid relaxation $\mathbf{h}_{i+1}^{f} = \mathbf{g}_i + (\mathbf{h}_i^{f} - \mathbf{g}_i)\,e^{-\boldsymbol{\lambda}_i\Delta\rho}$, where the leak gate $\boldsymbol{\lambda}_i$ and candidate drive $\mathbf{g}_i$ are computed from the encoded features and the current hidden state, with an analogous update for $\mathbf{h}_i^{b}$ propagating in the reverse direction. The concatenated hidden state $\mathbf{h}_i = [\mathbf{h}_i^{f}; \mathbf{h}_i^{b}]$ feeds into a fully connected combiner and decoder, yielding real and imaginary wave function components, $[\operatorname{Re}\psi(\rho_i), \operatorname{Im}\psi(\rho_i)] = W_d\,\sigma(W_c\mathbf{h}_i + \mathbf{b}_c) + \mathbf{b}_d$, where $W_c$, $W_d$ are learned weights and $\mathbf{b}_c$, $\mathbf{b}_d$ are biases. All operations are differentiable by construction.
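The combiner/decoder stage can be sketched as a small feed-forward network mapping the concatenated hidden state to the two wave-function components. All shapes, weights, and the ReLU nonlinearity below are illustrative placeholders:

```python
def dense(x, W, b):
    """Affine layer: y_j = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(wij * xi for wij, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]

def decode(h_cat, Wc, bc, Wd, bd):
    """Hypothetical combiner + decoder: concatenated hidden state ->
    nonlinear combiner -> linear decoder -> (Re psi, Im psi)."""
    z = [max(0.0, v) for v in dense(h_cat, Wc, bc)]  # combiner with ReLU
    re_psi, im_psi = dense(z, Wd, bd)                # two outputs: Re, Im
    return re_psi, im_psi

# tiny illustrative weights
Wc, bc = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
Wd, bd = [[2.0, 0.0], [0.0, 3.0]], [0.5, -0.25]
re_psi, im_psi = decode([1.0, -2.0], Wc, bc, Wd, bd)
```

Since every operation is an affine map or an elementwise nonlinearity, gradients with respect to both inputs and weights follow by the chain rule, preserving the end-to-end differentiability claimed above.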
3. Feature Encoding and Physics-Informed Inputs
Each spatial point is associated with a nine-dimensional feature vector whose components include:
- the real and imaginary parts of the local optical potential (scaled by the projectile energy),
- the Sommerfeld parameter,
- the accumulated semiclassical phase,
- a semiclassical absorption factor,
- the partial wave and the target mass, each normalized to its maximum.
All features are pre-normalized and processed through a two-layer encoder MLP (ReLU activations), yielding the high-dimensional representation used in the liquid layers.
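A minimal sketch of the two-layer ReLU encoder; the hidden widths and weights are hypothetical, standing in for the learned parameters that lift the normalized feature vector to the liquid-layer input:

```python
def dense(x, W, b):
    """Affine layer: y_j = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(wij * xi for wij, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]

def relu(v):
    return [max(0.0, x) for x in v]

def encoder(x, W1, b1, W2, b2):
    """Two-layer ReLU encoder MLP applied pointwise to each feature vector."""
    return relu(dense(relu(dense(x, W1, b1)), W2, b2))

# toy 3-feature input and illustrative weights
x = [2.0, -1.0, 0.5]
W1, b1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0], [2.0, 0.0]], [0.1, 0.0]
enc = encoder(x, W1, b1, W2, b2)
```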
4. Network Training, Parameterization, and Differentiability
The BiLNN as implemented employs liquid neurons per direction, exploiting approximately 50% sparse connectivity, with a total parameter count near . Training utilizes approximately Numerov-computed solutions spanning 12 nuclei (), , and MeV for both protons and neutrons, discretized to spatial points per wave function.
The objective function is the mean-squared error over all spatial points and samples, $\mathcal{L} = \frac{1}{N_{\mathrm{samp}}N_\rho}\sum_{n=1}^{N_{\mathrm{samp}}}\sum_{i=1}^{N_\rho}\big|\psi_n^{\mathrm{pred}}(\rho_i) - \psi_n^{\mathrm{Numerov}}(\rho_i)\big|^2$, optimized using AdamW for 500 epochs with weight-decay regularization. Training converges within 3–4 hours on a single GPU. The completed model acts as a fully analytic, differentiable surrogate mapping optical-potential inputs to wave functions, supporting the automatic gradient computation essential for downstream optimization and uncertainty propagation.
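The loss reduces to a plain mean-squared error accumulated over samples and spatial points. A minimal reference implementation, representing each wave function as a list of (Re, Im) pairs:

```python
def mse_loss(pred, target):
    """Mean-squared wave-function error: for each sample and spatial point,
    accumulate the squared complex residual |psi_pred - psi_target|^2
    (real part squared + imaginary part squared), then average."""
    total, n = 0.0, 0
    for p_wf, t_wf in zip(pred, target):            # loop over samples
        for (pr, pi), (tr, ti) in zip(p_wf, t_wf):  # loop over grid points
            total += (pr - tr) ** 2 + (pi - ti) ** 2
            n += 1
    return total / n

# toy check: one sample, two grid points, unit residual at each
loss = mse_loss([[(1.0, 0.0), (0.0, 1.0)]],
                [[(0.0, 0.0), (0.0, 0.0)]])
```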
5. Phase-Space Coordinate Normalization and Generalization Principle
BiLNN’s phase-space normalization, $\rho = kr$, ensures that the oscillatory structure induced by the de Broglie wavelength is mapped to a universal period in $\rho$-space ($\Delta\rho = 2\pi$ per oscillation at all energies). This eliminates the need for the network to learn a priori a continuum of energy-dependent wavelengths, simplifying the learning task and enabling a single model to generalize across the full training energy range, over which the wavelength varies substantially in $r$-space.
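The normalization is easy to verify numerically: one oscillation of a free wave $\sin(kr)$ has the energy-dependent period $2\pi/k$ in $r$-space, but always spans $2\pi$ after the change of coordinate, whatever the wave number:

```python
import math

def to_phase_space(r, k):
    """Dimensionless phase-space coordinate rho = k * r."""
    return k * r

# one de Broglie oscillation per wave number: period 2*pi/k in r-space,
# exactly 2*pi in rho-space for every k
periods = [to_phase_space(2 * math.pi / k, k) for k in (0.1, 1.0, 5.0)]
```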
6. Accuracy, Physical Observables, and Generalization Performance
On held-out data, BiLNN achieves a root-mean-square relative wave function error of overall. Error rates by partial wave are approximately over –$16$, rising to at and . Across targets, errors remain within $0.8$– for and up to for the heaviest nuclei (). For projectile energies , errors are , dropping to for .
Physical observables computed from the predicted wave functions include elastic $S$-matrix elements, as well as elastic scattering cross sections for both protons and neutrons. The model reliably recovers diffraction minima spanning four orders of magnitude in cross section, with a root-mean-square cross-section error at the sub-percent level.
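Extracting an $S$-matrix element amounts to matching the numerical wave function to its asymptotic form. The sketch below is deliberately simplified (neutral projectile, $\ell = 0$, no Coulomb functions; the charged-particle case would match to Coulomb wave functions instead) and assumes the asymptotic form $u(r) = N\,(e^{-ikr} - S\,e^{ikr})$ with an unknown normalization $N$:

```python
import cmath

def extract_s_neutral_swave(u1, u2, r1, r2, k):
    """Match u(r) = N * (exp(-i k r) - S * exp(+i k r)) at two asymptotic
    radii and solve the 2x2 linear system for (N, N*S); return S.
    Simplified sketch: neutral projectile, s-wave, no Coulomb."""
    a1, b1 = cmath.exp(-1j * k * r1), cmath.exp(1j * k * r1)
    a2, b2 = cmath.exp(-1j * k * r2), cmath.exp(1j * k * r2)
    det = a1 * (-b2) - (-b1) * a2        # system [[a1,-b1],[a2,-b2]]
    n_norm = (u1 * (-b2) - (-b1) * u2) / det   # Cramer's rule for N
    n_times_s = (a1 * u2 - u1 * a2) / det      # Cramer's rule for N*S
    return n_times_s / n_norm

# round trip with a hypothetical S and normalization
S_true, N, k, r1, r2 = 0.5 + 0.2j, 1.3, 1.0, 10.0, 10.7
u1 = N * (cmath.exp(-1j * k * r1) - S_true * cmath.exp(1j * k * r1))
u2 = N * (cmath.exp(-1j * k * r2) - S_true * cmath.exp(1j * k * r2))
S_rec = extract_s_neutral_swave(u1, u2, r1, r2, k)
```

Because the matching is a linear solve on the network's output, it composes with the surrogate's differentiability, so gradients of $S$-matrix elements and cross sections with respect to potential parameters remain available.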
Crucially, BiLNN successfully extrapolates to nuclei excluded from training (e.g., Mg, Cu, W), maintaining wave-function errors and observable fidelity comparable to in-sample targets. An ablation replacing the bidirectional recurrence with a forward-only liquid layer markedly increases wave function error, especially for high partial waves and near the domain boundary, substantiating the architectural necessity of bidirectionality for enforcing the dual boundary conditions.
7. Applications and Significance
BiLNN affords a differentiable, physics-informed surrogate for nuclear wave function computation, facilitating gradient-based optical-model parameter optimization and uncertainty quantification. Its design, grounded in phase-space normalization and bidirectional, ODE-driven recurrence, generalizes across a broad spectrum of projectile energies, partial waves, and nuclear targets. The demonstrated extrapolation performance suggests BiLNN has internalized the smooth dependence typical of global optical potentials (e.g., KD02), rather than simply memorizing specific cases. These properties make BiLNN a compelling candidate for integration into modern nuclear data evaluation pipelines, where rapid, differentiable, and physically accurate surrogate models are increasingly vital (Lei, 27 Dec 2025).