
Bidirectional Liquid Neural Networks (BiLNN)

Updated 3 January 2026
  • BiLNN is a continuous-time recurrent architecture that fuses learnable ODE dynamics with bidirectional propagation to enforce both Dirichlet and asymptotic boundary conditions.
  • It maps complex optical potential parameterizations to nuclear scattering wave functions, achieving percent-level root-mean-square error across diverse energies and nuclear species.
  • The differentiable design supports gradient‑based optimization and uncertainty quantification, serving as a robust surrogate in nuclear data evaluation.

A Bidirectional Liquid Neural Network (BiLNN) is a class of continuous-time recurrent neural architectures designed for the differentiable emulation of physical boundary-value problems, exemplified by its application to global nucleon-nucleus optical model calculations. BiLNNs synthesize liquid (continuous-time ODE-driven) recurrence and bidirectional propagation to provide a mapping from complex optical potential parameterizations to scattering wave functions, while satisfying physical boundary conditions and preserving analytical differentiability. The architecture enables gradient-based optimization and uncertainty quantification in nuclear modeling, producing observables with percent-level error and demonstrating transferability across a broad parameter space, including extrapolation to untrained nuclear species (Lei, 27 Dec 2025).

1. Architectural Foundations and Relationship to Liquid/Reservoir Computing

The BiLNN architecture generalizes reservoir computing by employing learnable continuous-time ordinary differential equation (ODE) dynamics and by enforcing bidirectional recurrence tailored to physics boundary-value problems. In contrast to discrete-time gated recurrent networks (LSTM/GRU), the BiLNN hidden state $h(r)$ evolves according to a first-order ODE: $$\frac{d h(r)}{d r} = -g(r) \odot h(r) + [1 - g(r)] \odot c(r),$$ where $g(r)$ is a learned leak gate and $c(r)$ is a learned candidate drive. This ODE form admits a closed-form solution over each step $\Delta r$, mitigating vanishing/exploding gradient pathologies on long sequences.
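The closed-form step can be sketched in a few lines of NumPy. This is an illustration, not the paper's implementation; the weight shapes and function names are assumptions:

```python
import numpy as np

def liquid_step(h_prev, x, W_g, b_g, W_c, b_c, d_rho):
    """One closed-form liquid-cell update (illustrative sketch).

    The gate g and candidate c are held fixed over the step, so the
    state relaxes exponentially toward c with per-neuron rate g,
    avoiding unstable explicit time stepping.
    """
    z = np.concatenate([h_prev, x])             # [h_{n-1}; x_n]
    g = 1.0 / (1.0 + np.exp(-(W_g @ z + b_g)))  # leak gate, sigmoid -> (0, 1)
    c = np.tanh(W_c @ z + b_c)                  # candidate drive
    decay = np.exp(-g * d_rho)                  # exact exponential decay
    return decay * h_prev + (1.0 - decay) * c
```

With `d_rho = 0` the state is unchanged; as `d_rho` grows, the state relaxes toward the candidate `c`, which is the behavior that keeps gradients well-conditioned on long sequences.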

Bidirectionality is enforced by running two parallel liquid layers: one propagates forward from $r = 0$ (origin) to $r_{\text{max}}$ (asymptotic region), while the other propagates backward from $r_{\text{max}}$ to $r = 0$. At each spatial position, the hidden states from both passes are merged, ensuring explicit conditioning on both Dirichlet and asymptotic boundary conditions. This design is particularly well-suited to radial Schrödinger problems, where both boundary behaviors are formally required.

2. Mathematical Formulation and Internal Dynamics

Spatial coordinates are mapped to a dimensionless phase-space form, $\rho = k r$, where $k$ is the wave number. The network operates on $N$ discretized values $0 < \rho_1 < \cdots < \rho_N$ ($\rho_N = k r_{\text{max}}$). At each $\rho_n$, the forward and backward hidden states, $h^{(f)}_n, h^{(b)}_n \in \mathbb{R}^D$, are updated according to:

$$\begin{aligned} g^{(f)}_n &= \sigma(W_g [h^{(f)}_{n-1}; x_n] + b_g) \in \mathbb{R}^D, \\ c^{(f)}_n &= \tanh(W_c [h^{(f)}_{n-1}; x_n] + b_c) \in \mathbb{R}^D, \\ h^{(f)}_n &= \exp(-g^{(f)}_n \Delta \rho) \odot h^{(f)}_{n-1} + [1 - \exp(-g^{(f)}_n \Delta \rho)] \odot c^{(f)}_n, \end{aligned}$$

with an analogous update for $h^{(b)}_n$ propagating in the reverse direction. The concatenated hidden state $h_n = [h^{(f)}_n; h^{(b)}_n] \in \mathbb{R}^{2D}$ feeds into a fully connected combiner and decoder, yielding the real and imaginary wave function components

$$y_n = [\psi_R(\rho_n);\, \psi_I(\rho_n)] = V \cdot \mathrm{ReLU}(U h_n + d) + e,$$

where $U$, $V$ are learned weights, and $d$, $e$ are biases. All operations are differentiable by construction.
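Putting the update and decoder equations together, the full bidirectional pass can be sketched as follows. All shapes, initializations, and names here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bilnn_forward(X, params, d_rho):
    """Bidirectional liquid pass over a feature sequence X of shape (N, F).

    Returns y of shape (N, 2): predicted [psi_R, psi_I] at each grid point.
    Illustrative sketch under assumed parameter shapes.
    """
    W_g, b_g, W_c, b_c, U, d, V, e = params
    D = b_g.shape[0]

    def one_pass(seq):
        h = np.zeros(D)
        out = []
        for x in seq:
            z = np.concatenate([h, x])
            g = sigmoid(W_g @ z + b_g)        # leak gate
            c = np.tanh(W_c @ z + b_c)        # candidate drive
            decay = np.exp(-g * d_rho)        # closed-form ODE step
            h = decay * h + (1.0 - decay) * c
            out.append(h)
        return np.stack(out)

    hf = one_pass(X)                          # forward: rho_1 -> rho_N
    hb = one_pass(X[::-1])[::-1]              # backward: rho_N -> rho_1
    H = np.concatenate([hf, hb], axis=1)      # merged states, shape (N, 2D)
    hidden = np.maximum(H @ U.T + d, 0.0)     # combiner with ReLU
    return hidden @ V.T + e                   # decoder: (N, 2)
```

The backward pass is run over the reversed sequence and then re-reversed, so that at each position the merged state carries information from both boundaries.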

3. Feature Encoding and Physics-Informed Inputs

Each spatial point $r_n$ is associated with a nine-dimensional feature vector: $$x_n = \left( \rho_n / \rho_{\max},\; V_R(r_n)/E,\; W(r_n)/E,\; \eta/\eta_{\max},\; \sin\varphi_{\mathrm{WKB}}(r_n),\; \cos\varphi_{\mathrm{WKB}}(r_n),\; D(r_n),\; l/l_{\max},\; A/A_{\max} \right)^\top,$$ where:

  • $V_R(r), W(r)$ are the real and imaginary parts of the local optical potential (scaled by the projectile energy $E$),
  • $\eta$ is the Sommerfeld parameter,
  • $\varphi_{\mathrm{WKB}}(r)$ is the accumulated semiclassical phase $\int_0^r k_{\text{local}}(r')\, dr'$,
  • $D(r) = \exp[-\int_0^r \operatorname{Im} k_{\text{local}}(r')\, dr']$ is a semiclassical absorption factor,
  • $l$ is the partial wave, and $A$ the target mass, normalized to their maxima.

All features are pre-normalized and processed through a two-layer encoder MLP (ReLU activations), yielding the high-dimensional representation used in the liquid layers.
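The feature construction above can be sketched as follows. The simple cumulative-sum quadrature and all argument names are assumptions for illustration:

```python
import numpy as np

def wkb_phase_and_absorption(r_grid, k_local):
    """Accumulated WKB phase and semiclassical absorption factor (sketch).

    Approximates phi(r) = int_0^r Re k_local dr' and
    D(r) = exp(-int_0^r Im k_local dr') by cumulative sums.
    """
    dr = np.diff(r_grid, prepend=r_grid[0])   # step sizes (first step is 0)
    phi = np.cumsum(np.real(k_local) * dr)
    D_abs = np.exp(-np.cumsum(np.imag(k_local) * dr))
    return phi, D_abs

def features(rho, rho_max, V_R, W, E, eta, eta_max,
             phi_wkb, D_abs, l, l_max, A, A_max):
    """Nine-dimensional physics-informed feature vector for one grid point."""
    return np.array([
        rho / rho_max,       # normalized phase-space coordinate
        V_R / E,             # real optical potential over projectile energy
        W / E,               # imaginary optical potential over energy
        eta / eta_max,       # normalized Sommerfeld parameter
        np.sin(phi_wkb),     # accumulated WKB phase, sine component
        np.cos(phi_wkb),     # accumulated WKB phase, cosine component
        D_abs,               # semiclassical absorption factor
        l / l_max,           # normalized partial wave
        A / A_max,           # normalized target mass
    ])
```

Encoding the WKB phase through its sine and cosine keeps the feature bounded and periodic, matching the oscillatory structure the network must represent.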

4. Network Training, Parameterization, and Differentiability

The BiLNN as implemented employs $D = 128$ liquid neurons per direction, exploiting approximately 50% sparse connectivity, with a total parameter count near $1.2 \times 10^6$. Training utilizes approximately 74,400 Numerov-computed solutions spanning 12 nuclei ($12 \leq A \leq 208$), $l = 0 \ldots 30$, and $E \in [1, 200]$ MeV for both protons and neutrons, discretized to $T = 100$ spatial points per wave function.

The objective function is the mean-squared error over all spatial points and samples: $$\mathcal{L} = \frac{1}{N T} \sum_{i=1}^{N} \sum_{n=1}^{T} \left[ \left(\psi_R^{\rm pred}(\rho_n) - \psi_R^{\rm true}(\rho_n)\right)^2 + \left(\psi_I^{\rm pred}(\rho_n) - \psi_I^{\rm true}(\rho_n)\right)^2 \right],$$ optimized using AdamW for 500 epochs with weight-decay regularization. Training converges within 3–4 hours on a single GPU. The completed model acts as a fully analytic, differentiable surrogate, $(\text{optical potential},\, E,\, l,\, A) \mapsto (u_l(r)_{\mathrm{Re}},\, u_l(r)_{\mathrm{Im}})$, supporting the automatic gradient computation essential for downstream optimization and uncertainty propagation.
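The loss reduces to a mean-squared error over the stacked real and imaginary channels, which can be written compactly (a sketch; the array layout is an assumption):

```python
import numpy as np

def wavefunction_mse(psi_pred, psi_true):
    """MSE over samples, grid points, and (Re, Im) channels.

    psi_pred, psi_true: arrays of shape (N_samples, T_points, 2),
    where the last axis holds [psi_R, psi_I].
    """
    # Sum the squared error of the two channels, then average over N*T.
    return np.mean(np.sum((psi_pred - psi_true) ** 2, axis=-1))
```

Averaging over samples and grid points while summing over the two channels reproduces the $\frac{1}{NT}\sum\sum[\,\cdot\,]$ structure of the formula above.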

5. Phase-Space Coordinate Normalization and Generalization Principle

BiLNN's phase-space normalization, $\rho = k r$, ensures that the oscillatory structure induced by the de Broglie wavelength $\lambda = 2\pi/k$ is mapped to a universal period in $\rho$-space ($2\pi$ for all $E$). This eliminates the need for the network to learn a priori a continuum of energy-dependent wavelengths, simplifying the learning task and enabling a single model to generalize from $E = 1\,\text{MeV}$ to $200\,\text{MeV}$, a $4.5\times$ variation in $\lambda$ if operating in $r$-space.
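A quick numeric check of this principle: the free oscillation $\sin(kr)$ has a $k$-dependent period in $r$-space but a universal period of $2\pi$ in $\rho$-space. The grid and wave numbers below are illustrative:

```python
import numpy as np

def period(grid, signal):
    """Distance between the first two discrete maxima of an oscillation."""
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]]
    return grid[peaks[1]] - grid[peaks[0]]

r = np.linspace(0.0, 60.0, 60001)            # radial grid (illustrative)
for k in (0.5, 2.0):                         # two very different wave numbers
    wave = np.sin(k * r)
    p_r = period(r, wave)                    # k-dependent period in r-space
    p_rho = period(k * r, wave)              # universal period in rho-space
    print(k, p_r, p_rho)
```

In $r$-space the measured period varies as $2\pi/k$, while in $\rho$-space it is $2\pi$ for every $k$, so a single network never has to model the energy dependence of the wavelength.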

6. Accuracy, Physical Observables, and Generalization Performance

On held-out data, BiLNN achieves a root-mean-square relative wave function error of $1.2\%$ overall. Errors by partial wave $l$ are approximately $1.0\%$ over $l = 8$–$16$, rising to $1.8\%$ at $l \approx 0$ and $l \approx 30$. Across targets, errors remain within $0.8$–$1.3\%$ for $A \leq 120$ and reach up to $1.8\%$ for the heaviest nuclei ($A = 197, 208$). For projectile energies $E < 20\,\text{MeV}$, errors are $\sim 4\%$, dropping to $\sim 1\%$ for $E > 100\,\text{MeV}$.

Physical observables computed from the predicted wave functions include elastic $S$-matrix elements, $$f_l = -\frac{1}{E} \int_0^\infty F_l(\eta, kr)\, V(r)\, \psi_l(r)\, dr, \qquad S_l = 1 + 2 i k f_l,$$ as well as elastic scattering cross sections, $d\sigma/d\sigma_{\text{Ruth}}$ for protons and $d\sigma/d\Omega$ for neutrons. The model reliably recovers diffraction minima spanning four orders of magnitude in cross section, with a root-mean-square cross-section error of $\mathcal{O}(1\text{–}2\%)$.
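Given predicted wave functions on the grid, the $S$-matrix element follows from a simple quadrature of the integral above. This sketch assumes the Coulomb function samples $F_l(\eta, kr)$ are supplied by an external routine:

```python
import numpy as np

def s_matrix_element(r, F_l, V, psi_l, E, k):
    """Elastic S-matrix element from grid samples (illustrative sketch).

    Implements f_l = -(1/E) * int F_l(eta, kr) V(r) psi_l(r) dr by the
    trapezoid rule, then S_l = 1 + 2 i k f_l.
    """
    y = F_l * V * psi_l                                     # sampled integrand
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r))  # trapezoid rule
    f_l = -integral / E
    return 1.0 + 2j * k * f_l
```

Because the surrogate wave function enters linearly, gradients of $S_l$ with respect to potential parameters flow straight through this quadrature, which is what enables the gradient-based optimization discussed below.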

Crucially, BiLNN successfully extrapolates to nuclei excluded from training (e.g., $^{24}$Mg, $^{63}$Cu, $^{184}$W), maintaining wave function errors $\lesssim 1.5\%$ and observable fidelity comparable to in-sample targets. An ablation replacing bidirectional recurrence with a forward-only liquid layer increases the wave function error from $1.2\%$ to $1.4\%$ (approximately $14\%$ relative degradation), especially for high $l$ and near the domain boundary, substantiating the architectural necessity of bidirectionality for enforcing the dual boundary conditions.

7. Applications and Significance

BiLNN affords a differentiable, physics-informed surrogate for nuclear wave function computation, facilitating gradient-based optical-model parameter optimization and uncertainty quantification. Its design, grounded in phase-space normalization and bidirectional, ODE-driven recurrence, generalizes across a broad spectrum of projectile energies, partial waves, and nuclear targets. The demonstrated extrapolation performance suggests BiLNN has internalized the smooth $A, Z$ dependence typical of global optical potentials (e.g., KD02), rather than simply memorizing specific cases. These properties make BiLNN a compelling candidate for integration into modern nuclear data evaluation pipelines, where rapid, differentiable, and physically accurate surrogate models are increasingly vital (Lei, 27 Dec 2025).
