
Bidirectional Liquid Neural Networks (BiLNN)

Updated 3 January 2026
  • BiLNN is a continuous-time recurrent architecture that fuses learnable ODE dynamics with bidirectional propagation to enforce both Dirichlet and asymptotic boundary conditions.
  • It maps complex optical potential parameterizations to nuclear scattering wave functions, achieving percent-level root-mean-square error across diverse energies and nuclear species.
  • The differentiable design supports gradient‑based optimization and uncertainty quantification, serving as a robust surrogate in nuclear data evaluation.

A Bidirectional Liquid Neural Network (BiLNN) is a class of continuous-time recurrent neural architectures designed for the differentiable emulation of physical boundary-value problems, exemplified by its application to global nucleon-nucleus optical model calculations. BiLNNs synthesize liquid (continuous-time ODE-driven) recurrence and bidirectional propagation to provide a mapping from complex optical potential parameterizations to scattering wave functions, while satisfying physical boundary conditions and preserving analytical differentiability. The architecture enables gradient-based optimization and uncertainty quantification in nuclear modeling, producing observables with percent-level error and demonstrating transferability across a broad parameter space, including extrapolation to untrained nuclear species (Lei, 27 Dec 2025).

1. Architectural Foundations and Relationship to Liquid/Reservoir Computing

The BiLNN architecture generalizes reservoir computing by employing learnable continuous-time ordinary differential equation (ODE) dynamics and by enforcing bidirectional recurrence tailored to physics boundary-value problems. In contrast to discrete-time gated recurrent networks (LSTM/GRU), the BiLNN hidden state $h(r)$ evolves according to a first-order ODE: $$\frac{d h(r)}{d r} = -g(r) \odot h(r) + [1 - g(r)] \odot c(r),$$ where $g(r)$ is a learned leak gate and $c(r)$ is a learned candidate drive. This ODE form admits a closed-form solution over each step $\Delta r$, mitigating vanishing/exploding gradient pathologies on long sequences.
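The closed-form step can be sketched in a few lines of NumPy. This is an illustration, not the paper's implementation; the weight shapes and function names are assumptions:

```python
import numpy as np

def liquid_step(h_prev, x, W_g, b_g, W_c, b_c, d_rho):
    """One closed-form liquid-cell update (illustrative sketch).

    The gate g and candidate c are held fixed over the step, so the
    state relaxes exponentially toward c with per-neuron rate g,
    avoiding unstable explicit time stepping.
    """
    z = np.concatenate([h_prev, x])             # [h_{n-1}; x_n]
    g = 1.0 / (1.0 + np.exp(-(W_g @ z + b_g)))  # leak gate, sigmoid -> (0, 1)
    c = np.tanh(W_c @ z + b_c)                  # candidate drive
    decay = np.exp(-g * d_rho)                  # exact exponential decay
    return decay * h_prev + (1.0 - decay) * c
```

With `d_rho = 0` the state is unchanged; as `d_rho` grows, the state relaxes toward the candidate `c`, which is the behavior that keeps gradients well-conditioned on long sequences.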

Bidirectionality is enforced by running two parallel liquid layers: one propagates forward from $r = 0$ (origin) to $r_{\text{max}}$ (asymptotic region), while the other propagates backward from $r_{\text{max}}$ to $r = 0$. At each spatial position, the hidden states from both passes are merged, ensuring explicit conditioning on both Dirichlet and asymptotic boundary conditions. This design is particularly well-suited to radial Schrödinger problems, where both boundary behaviors are formally required.

2. Mathematical Formulation and Internal Dynamics

Spatial coordinates are mapped to a dimensionless phase-space form, $\rho = k r$, where $k$ is the wave number. The network operates on $N$ discretized values $0 < \rho_1 < \cdots < \rho_N$ ($\rho_N = k r_{\text{max}}$). At each $\rho_n$, the forward and backward hidden states, $h^{(f)}_n, h^{(b)}_n \in \mathbb{R}^D$, are updated according to:

$$\begin{aligned} g^{(f)}_n &= \sigma(W_g [h^{(f)}_{n-1}; x_n] + b_g) \in \mathbb{R}^D, \\ c^{(f)}_n &= \tanh(W_c [h^{(f)}_{n-1}; x_n] + b_c) \in \mathbb{R}^D, \\ h^{(f)}_n &= \exp(-g^{(f)}_n \Delta \rho) \odot h^{(f)}_{n-1} + [1 - \exp(-g^{(f)}_n \Delta \rho)] \odot c^{(f)}_n, \end{aligned}$$

with an analogous update for $h^{(b)}_n$ propagating in the reverse direction. The concatenated hidden state $h_n = [h^{(f)}_n; h^{(b)}_n] \in \mathbb{R}^{2D}$ feeds into a fully connected combiner and decoder, yielding the real and imaginary wave function components

$$y_n = [\psi_R(\rho_n);\, \psi_I(\rho_n)] = V \cdot \mathrm{ReLU}(U h_n + d) + e,$$

where $U$, $V$ are learned weights, and $d$, $e$ are biases. All operations are differentiable by construction.
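Putting the update and decoder equations together, the full bidirectional pass can be sketched as follows. All shapes, initializations, and names here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bilnn_forward(X, params, d_rho):
    """Bidirectional liquid pass over a feature sequence X of shape (N, F).

    Returns y of shape (N, 2): predicted [psi_R, psi_I] at each grid point.
    Illustrative sketch under assumed parameter shapes.
    """
    W_g, b_g, W_c, b_c, U, d, V, e = params
    D = b_g.shape[0]

    def one_pass(seq):
        h = np.zeros(D)
        out = []
        for x in seq:
            z = np.concatenate([h, x])
            g = sigmoid(W_g @ z + b_g)        # leak gate
            c = np.tanh(W_c @ z + b_c)        # candidate drive
            decay = np.exp(-g * d_rho)        # closed-form ODE step
            h = decay * h + (1.0 - decay) * c
            out.append(h)
        return np.stack(out)

    hf = one_pass(X)                          # forward: rho_1 -> rho_N
    hb = one_pass(X[::-1])[::-1]              # backward: rho_N -> rho_1
    H = np.concatenate([hf, hb], axis=1)      # merged states, shape (N, 2D)
    hidden = np.maximum(H @ U.T + d, 0.0)     # combiner with ReLU
    return hidden @ V.T + e                   # decoder: (N, 2)
```

The backward pass is run over the reversed sequence and then re-reversed, so that at each position the merged state carries information from both boundaries.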

3. Feature Encoding and Physics-Informed Inputs

Each spatial point $r_n$ is associated with a nine-dimensional feature vector: $$x_n = \left( \rho_n / \rho_{\max},\; V_R(r_n)/E,\; W(r_n)/E,\; \eta/\eta_{\max},\; \sin\varphi_{\mathrm{WKB}}(r_n),\; \cos\varphi_{\mathrm{WKB}}(r_n),\; D(r_n),\; l/l_{\max},\; A/A_{\max} \right)^\top,$$ where:

  • $V_R(r), W(r)$ are the real and imaginary parts of the local optical potential (scaled by the projectile energy $E$),
  • $\eta$ is the Sommerfeld parameter,
  • $\varphi_{\mathrm{WKB}}(r)$ is the accumulated semiclassical phase $\int_0^r k_{\text{local}}(r')\, dr'$,
  • $D(r) = \exp[-\int_0^r \operatorname{Im} k_{\text{local}}(r')\, dr']$ is a semiclassical absorption factor,
  • $l$ is the partial wave, and $A$ the target mass, normalized to their maxima.

All features are pre-normalized and processed through a two-layer encoder MLP (ReLU activations), yielding the high-dimensional representation used in the liquid layers.
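The feature construction above can be sketched as follows. The simple cumulative-sum quadrature and all argument names are assumptions for illustration:

```python
import numpy as np

def wkb_phase_and_absorption(r_grid, k_local):
    """Accumulated WKB phase and semiclassical absorption factor (sketch).

    Approximates phi(r) = int_0^r Re k_local dr' and
    D(r) = exp(-int_0^r Im k_local dr') by cumulative sums.
    """
    dr = np.diff(r_grid, prepend=r_grid[0])   # step sizes (first step is 0)
    phi = np.cumsum(np.real(k_local) * dr)
    D_abs = np.exp(-np.cumsum(np.imag(k_local) * dr))
    return phi, D_abs

def features(rho, rho_max, V_R, W, E, eta, eta_max,
             phi_wkb, D_abs, l, l_max, A, A_max):
    """Nine-dimensional physics-informed feature vector for one grid point."""
    return np.array([
        rho / rho_max,       # normalized phase-space coordinate
        V_R / E,             # real optical potential over projectile energy
        W / E,               # imaginary optical potential over energy
        eta / eta_max,       # normalized Sommerfeld parameter
        np.sin(phi_wkb),     # accumulated WKB phase, sine component
        np.cos(phi_wkb),     # accumulated WKB phase, cosine component
        D_abs,               # semiclassical absorption factor
        l / l_max,           # normalized partial wave
        A / A_max,           # normalized target mass
    ])
```

Encoding the WKB phase through its sine and cosine keeps the feature bounded and periodic, matching the oscillatory structure the network must represent.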

4. Network Training, Parameterization, and Differentiability

The BiLNN as implemented employs $D = 128$ liquid neurons per direction, exploiting approximately 50% sparse connectivity, with a total parameter count near $1.2 \times 10^6$. Training utilizes approximately 74,400 Numerov-computed solutions spanning 12 nuclei ($12 \leq A \leq 208$), $l = 0 \ldots 30$, and $E \in [1, 200]$ MeV for both protons and neutrons, discretized to $T = 100$ spatial points per wave function.

The objective function is the mean-squared error over all spatial points and samples: $$\mathcal{L} = \frac{1}{N T} \sum_{i=1}^{N} \sum_{n=1}^{T} \left[ \left(\psi_R^{\rm pred}(\rho_n) - \psi_R^{\rm true}(\rho_n)\right)^2 + \left(\psi_I^{\rm pred}(\rho_n) - \psi_I^{\rm true}(\rho_n)\right)^2 \right],$$ optimized using AdamW for 500 epochs with weight-decay regularization. Training converges within 3–4 hours on a single GPU. The completed model acts as a fully analytic, differentiable surrogate, $(\text{optical potential},\, E,\, l,\, A) \mapsto (u_l(r)_{\mathrm{Re}},\, u_l(r)_{\mathrm{Im}})$, supporting the automatic gradient computation essential for downstream optimization and uncertainty propagation.
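The loss reduces to a mean-squared error over the stacked real and imaginary channels, which can be written compactly (a sketch; the array layout is an assumption):

```python
import numpy as np

def wavefunction_mse(psi_pred, psi_true):
    """MSE over samples, grid points, and (Re, Im) channels.

    psi_pred, psi_true: arrays of shape (N_samples, T_points, 2),
    where the last axis holds [psi_R, psi_I].
    """
    # Sum the squared error of the two channels, then average over N*T.
    return np.mean(np.sum((psi_pred - psi_true) ** 2, axis=-1))
```

Averaging over samples and grid points while summing over the two channels reproduces the $\frac{1}{NT}\sum\sum[\,\cdot\,]$ structure of the formula above.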

5. Phase-Space Coordinate Normalization and Generalization Principle

BiLNN's phase-space normalization, $\rho = k r$, ensures that the oscillatory structure induced by the de Broglie wavelength $\lambda = 2\pi/k$ is mapped to a universal period in $\rho$-space ($2\pi$ for all $E$). This eliminates the need for the network to learn a priori a continuum of energy-dependent wavelengths, simplifying the learning task and enabling a single model to generalize from $E = 1\,\text{MeV}$ to $200\,\text{MeV}$, a $4.5\times$ variation in $\lambda$ if operating in $r$-space.
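A quick numeric check of this principle: the free oscillation $\sin(kr)$ has a $k$-dependent period in $r$-space but a universal period of $2\pi$ in $\rho$-space. The grid and wave numbers below are illustrative:

```python
import numpy as np

def period(grid, signal):
    """Distance between the first two discrete maxima of an oscillation."""
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]]
    return grid[peaks[1]] - grid[peaks[0]]

r = np.linspace(0.0, 60.0, 60001)            # radial grid (illustrative)
for k in (0.5, 2.0):                         # two very different wave numbers
    wave = np.sin(k * r)
    p_r = period(r, wave)                    # k-dependent period in r-space
    p_rho = period(k * r, wave)              # universal period in rho-space
    print(k, p_r, p_rho)
```

In $r$-space the measured period varies as $2\pi/k$, while in $\rho$-space it is $2\pi$ for every $k$, so a single network never has to model the energy dependence of the wavelength.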

6. Accuracy, Physical Observables, and Generalization Performance

On held-out data, BiLNN achieves a root-mean-square relative wave function error of $1.2\%$ overall. Errors by partial wave $l$ are approximately $1.0\%$ over $l = 8$–$16$, rising to $1.8\%$ at $l \approx 0$ and $l \approx 30$. Across targets, errors remain within $0.8$–$1.3\%$ for $A \leq 120$ and reach up to $1.8\%$ for the heaviest nuclei ($A = 197, 208$). For projectile energies $E < 20\,\text{MeV}$, errors are $\sim 4\%$, dropping to $\sim 1\%$ for $E > 100\,\text{MeV}$.

Physical observables computed from the predicted wave functions include elastic $S$-matrix elements, $$f_l = -\frac{1}{E} \int_0^\infty F_l(\eta, kr)\, V(r)\, \psi_l(r)\, dr, \qquad S_l = 1 + 2 i k f_l,$$ as well as elastic scattering cross sections, $d\sigma/d\sigma_{\text{Ruth}}$ for protons and $d\sigma/d\Omega$ for neutrons. The model reliably recovers diffraction minima spanning four orders of magnitude in cross section, with a root-mean-square cross-section error of $\mathcal{O}(1\text{–}2\%)$.
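Given predicted wave functions on the grid, the $S$-matrix element follows from a simple quadrature of the integral above. This sketch assumes the Coulomb function samples $F_l(\eta, kr)$ are supplied by an external routine:

```python
import numpy as np

def s_matrix_element(r, F_l, V, psi_l, E, k):
    """Elastic S-matrix element from grid samples (illustrative sketch).

    Implements f_l = -(1/E) * int F_l(eta, kr) V(r) psi_l(r) dr by the
    trapezoid rule, then S_l = 1 + 2 i k f_l.
    """
    y = F_l * V * psi_l                                     # sampled integrand
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r))  # trapezoid rule
    f_l = -integral / E
    return 1.0 + 2j * k * f_l
```

Because the surrogate wave function enters linearly, gradients of $S_l$ with respect to potential parameters flow straight through this quadrature, which is what enables the gradient-based optimization discussed below.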

Crucially, BiLNN successfully extrapolates to nuclei excluded from training (e.g., $^{24}$Mg, $^{63}$Cu, $^{184}$W), maintaining wave function errors $\lesssim 1.5\%$ and observable fidelity comparable to in-sample targets. An ablation replacing bidirectional recurrence with a forward-only liquid layer increases the wave function error from $1.2\%$ to $1.4\%$ (approximately $14\%$ relative degradation), especially for high $l$ and near the domain boundary, substantiating the architectural necessity of bidirectionality for enforcing the dual boundary conditions.

7. Applications and Significance

BiLNN affords a differentiable, physics-informed surrogate for nuclear wave function computation, facilitating gradient-based optical-model parameter optimization and uncertainty quantification. Its design, grounded in phase-space normalization and bidirectional, ODE-driven recurrence, generalizes across a broad spectrum of projectile energies, partial waves, and nuclear targets. The demonstrated extrapolation performance suggests BiLNN has internalized the smooth $A, Z$ dependence typical of global optical potentials (e.g., KD02), rather than simply memorizing specific cases. These properties make BiLNN a compelling candidate for integration into modern nuclear data evaluation pipelines, where rapid, differentiable, and physically accurate surrogate models are increasingly vital (Lei, 27 Dec 2025).
