Symplectomorphism Networks

Updated 22 May 2026
  • Symplectomorphism Networks are neural architectures that incorporate symplectic constraints to preserve the geometric structure of Hamiltonian dynamical systems.
  • They leverage compositional shear maps and structured neural modules to enforce invariants and achieve universal approximation of symplectic flows.
  • Empirical evaluations show these networks outperform conventional methods in maintaining energy conservation and long-term stability in diverse dynamical systems.

A symplectomorphism network is a neural network architecture constructed to exactly preserve the canonical symplectic structure of Hamiltonian dynamical systems. By embedding symplectic priors and constraints directly into the architecture, these networks serve as data-driven surrogates or integrators for both separable and nonseparable Hamiltonian systems, exhibiting long-term stability, conservation of geometric invariants, and robust generalization properties. The main classes of symplectomorphism networks include SympNets, Nonseparable Symplectic Neural Networks (NSSNNs), and more recent geometric variants such as Symplectic Gyroceptrons, all of which are developed to respect the symplectic form under flow and thus inherit structural conservation properties that are fundamental in Hamiltonian mechanics (Xiong et al., 2020, Tapley, 2024, Duruisseaux et al., 2022, Jin et al., 2020).

1. Mathematical Foundations and Symplectic Structure

Hamiltonian systems evolving on phase space $\mathbb{R}^{2n}$ are governed by a symplectic form $\omega = \sum_{i=1}^n dp_i \wedge dq_i$, or in matrix notation $\omega(u,v) = u^T J v$, with $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$. The equations of motion take the canonical form $\dot z = J \nabla H(z)$ for $z = (p, q) \in \mathbb{R}^{2n}$. The time-$h$ flow map $\phi^H_h$ of a Hamiltonian $H$ is a symplectomorphism: it exactly preserves the symplectic form, i.e., $(D\phi^H_h(z))^T J \, D\phi^H_h(z) = J$ for all $z \in \mathbb{R}^{2n}$.

Preservation of $\omega$ is crucial for the long-term qualitative fidelity of numerical or learned solutions. Standard neural networks lack this structure, which leads to secular drift in energy or geometry. Symplectomorphism networks are designed so that each network map is a symplectomorphism, ensuring that $\omega$ is preserved throughout training and deployment (Tapley, 2024, Jin et al., 2020, Xiong et al., 2020).
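The symplectic condition $M^T J M = J$ can be checked numerically for a linear map. The following is a minimal sketch, assuming the harmonic oscillator $H = (p^2 + q^2)/2$ as a test case (an illustrative choice, not taken from the cited papers); the helper names `canonical_J` and `is_symplectic` are hypothetical. It contrasts the exact flow, which is a rotation, with an explicit Euler step, which is not symplectic.

```python
import numpy as np

def canonical_J(n):
    """Canonical matrix J = [[0, I_n], [-I_n, 0]] for z = (p, q)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def is_symplectic(M, tol=1e-10):
    """Return True if M^T J M = J up to the given tolerance."""
    J = canonical_J(M.shape[0] // 2)
    return bool(np.max(np.abs(M.T @ J @ M - J)) < tol)

h = 0.1
J = canonical_J(1)

# Exact time-h flow of z' = J z (H = |z|^2 / 2): exp(hJ) = cos(h) I + sin(h) J.
exact_flow = np.cos(h) * np.eye(2) + np.sin(h) * J

# Explicit Euler step z + h J z; here M^T J M = (1 + h^2) J, so it fails.
euler_step = np.eye(2) + h * J

print(is_symplectic(exact_flow))
print(is_symplectic(euler_step))
```

The Euler step scales phase-space area by $1 + h^2$ at every step, which is the kind of structural error that accumulates into secular drift.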

2. Symplectomorphism Network Architectures

2.1 Compositional Shear-Based Models (SympNets and P-SympNets)

SympNets and their polynomial (P-SympNet) variants use compositions of exactly symplectic "shear" maps, where each layer is implemented as the time-$h$ flow of a parameterized Hamiltonian whose flow is computable in closed form.

A network is then the composition of finitely many such flows. The construction ensures that each layer, and thus the full network, is symplectic.
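To illustrate why such a flow can be available in closed form, the following derivation assumes the layer Hamiltonian is a ridge function $H(z) = f(w^T z)$. The scalar function $f$ and the direction vector $w$ are illustrative symbols, and this is one common choice rather than necessarily the exact parameterization of the cited works.

```latex
% Assumption: H(z) = f(w^T z) with f scalar and w a fixed vector in R^{2n}.
\begin{aligned}
\nabla H(z) &= f'(w^T z)\, w, \\
\dot z &= J \nabla H(z) = f'(w^T z)\, J w, \\
\frac{d}{dt}\,(w^T z) &= f'(w^T z)\, w^T J w = 0
  \quad \text{(since } J^T = -J\text{)}, \\
\phi^H_h(z) &= z + h\, f'(w^T z)\, J w, \\
D\phi^H_h(z) &= I + h\, f''(w^T z)\, J w w^T .
\end{aligned}
```

Because $w^T z$ is constant along the flow, the vector field is constant on each trajectory and the flow is a shear. Expanding $(D\phi^H_h)^T J \, D\phi^H_h$ and using $w^T J w = 0$ again shows that the Jacobian satisfies the symplectic condition for every $h$.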

P-SympNets restrict each basis Hamiltonian to be a polynomial ridge function, that is, a univariate polynomial applied to a single linear projection of the state. This allows exact representation of all linear symplectic maps with a bounded number of quadratic shear layers, and universal approximation of arbitrary polynomial (and more general) symplectic flows (Tapley, 2024).
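A polynomial ridge layer can be sketched in a few lines. The code below assumes the form $H(z) = \sum_k c_k (w^T z)^k$ with flow $z + h f'(w^T z) J w$; the names `ridge_shear`, `coeffs` and `w`, and all numerical values, are hypothetical. It checks by finite differences that a two-layer composition has a symplectic Jacobian, and that a purely quadratic layer reduces to an exactly symplectic matrix.

```python
import numpy as np

def canonical_J(n):
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def ridge_shear(z, w, coeffs, h):
    """Time-h flow of H(z) = sum_k coeffs[k] * (w @ z)**k.

    Since w @ J @ w = 0, s = w @ z is constant along the flow, so the
    exact flow is z + h * f'(s) * J @ w.
    """
    n = z.size // 2
    s = w @ z
    dfds = sum(k * c * s ** (k - 1) for k, c in enumerate(coeffs) if k > 0)
    return z + h * dfds * (canonical_J(n) @ w)

def jacobian_fd(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z (columns = d f / d z_i)."""
    cols = []
    for i in range(z.size):
        e = np.zeros_like(z)
        e[i] = eps
        cols.append((f(z + e) - f(z - e)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
n, h = 2, 0.1
J = canonical_J(n)
z = rng.standard_normal(2 * n)
w1 = 0.5 * rng.standard_normal(2 * n)
w2 = 0.5 * rng.standard_normal(2 * n)
cubic = [0.0, 0.5, -0.3, 0.1]          # illustrative polynomial coefficients

def network(z):
    """Composition of two polynomial ridge shears."""
    return ridge_shear(ridge_shear(z, w1, cubic, h), w2, cubic, h)

M = jacobian_fd(network, z)
symplectic_defect = np.max(np.abs(M.T @ J @ M - J))

# A quadratic layer H = a * (w @ z)**2 gives the linear map I + 2 a h J w w^T.
a = 0.7
L = np.eye(2 * n) + 2 * a * h * J @ np.outer(w1, w1)
linear_defect = np.max(np.abs(L.T @ J @ L - J))
```

Here `symplectic_defect` is limited only by finite-difference error, while `linear_defect` is at round-off level because the quadratic layer is symplectic in exact arithmetic.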

2.2 Structured Neural Map Compositions (LA-SympNet, G-SympNet)

SympNets as introduced in (Jin et al., 2020) employ alternating blocks of analytically symplectic linear maps ("up" and "low") and symplectic nonlinear activations or gradient modules. These are essentially unitriangular symplectic blocks, each preserving the canonical two-form. The LA-SympNet alternates linear and nonlinear blocks, whereas G-SympNet uses direct gradient modules to approximate Hamiltonian gradients.

A minimal structure:

  • Linear block: a unitriangular block matrix $M$ whose off-diagonal block is symmetric, constructed so that $M^T J M = J$.
  • Nonlinear activation: a shear that updates one half of the phase-space variables using an elementwise activation of the other half, again constructed to be symplectic.

These blocks compose into a deep map that remains symplectic by construction (Jin et al., 2020).
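The two block types can be sketched as follows. This is a simplified illustration that assumes symmetric off-diagonal blocks of the form $A + A^T$, a `tanh` activation, and no bias terms; the function names and parameter shapes are hypothetical, and this is not the reference implementation.

```python
import numpy as np

def linear_up(p, q, A):
    """Unitriangular linear module: p <- p + (A + A^T) q."""
    return p + (A + A.T) @ q, q

def linear_low(p, q, A):
    """Unitriangular linear module: q <- q + (A + A^T) p."""
    return p, q + (A + A.T) @ p

def activation_up(p, q, a):
    """Activation module: p <- p + a * tanh(q), elementwise."""
    return p + a * np.tanh(q), q

def activation_low(p, q, a):
    """Activation module: q <- q + a * tanh(p), elementwise."""
    return p, q + a * np.tanh(p)

def la_sympnet(z, params):
    """Alternate linear and activation modules on z = (p, q)."""
    n = z.size // 2
    p, q = z[:n], z[n:]
    for A_up, A_low, a_up, a_low in params:
        p, q = linear_up(p, q, A_up)
        p, q = linear_low(p, q, A_low)
        p, q = activation_up(p, q, a_up)
        p, q = activation_low(p, q, a_low)
    return np.concatenate([p, q])

def jacobian_fd(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z."""
    cols = []
    for i in range(z.size):
        e = np.zeros_like(z)
        e[i] = eps
        cols.append((f(z + e) - f(z - e)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
n, depth = 2, 3
params = [(0.3 * rng.standard_normal((n, n)), 0.3 * rng.standard_normal((n, n)),
           0.3 * rng.standard_normal(n), 0.3 * rng.standard_normal(n))
          for _ in range(depth)]
z0 = rng.standard_normal(2 * n)

I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])
M = jacobian_fd(lambda z: la_sympnet(z, params), z0)
symplectic_defect = np.max(np.abs(M.T @ J @ M - J))
```

Each module has a block-triangular Jacobian with identity diagonal blocks and a symmetric off-diagonal block, which is why no constraint on the parameter values is needed.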

2.3 Nonseparable Hamiltonian Networks (NSSNN)

The NSSNN framework targets nonseparable systems, in which the kinetic and potential energy terms are inherently coupled. The base Hamiltonian model is a fully connected feed-forward network with 6 layers (width 64, sigmoid activations). The system's state is augmented with auxiliary variables and advanced by a composition of three second-order symmetric splitting maps, each a symplectomorphism. The three maps are implemented explicitly via automatic differentiation, and their interleaved composition ensures global preservation of the symplectic structure (Xiong et al., 2020).
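The augmented splitting idea can be sketched with a hand-written Hamiltonian in place of the neural network. The code below assumes the standard extended phase-space construction with state $(q, p, x, y)$, sub-Hamiltonians $H(q, y)$, $H(x, p)$ and a quadratic binding term with strength `omega`, and the sign convention $\dot q = \partial H / \partial p$, $\dot p = -\partial H / \partial q$. The test Hamiltonian, the value of `omega` and all function names are illustrative assumptions, and analytic gradients replace automatic differentiation.

```python
import numpy as np

# Hypothetical nonseparable Hamiltonian H(q, p) = (q^2 + 1)(p^2 + 1) / 2.
def dH_dq(q, p):
    return q * (p ** 2 + 1.0)

def dH_dp(q, p):
    return p * (q ** 2 + 1.0)

def flow_A(s, d):
    """Exact flow of H(q, y): q and y are frozen, p and x are sheared."""
    q, p, x, y = s
    return np.array([q, p - d * dH_dq(q, y), x + d * dH_dp(q, y), y])

def flow_B(s, d):
    """Exact flow of H(x, p): x and p are frozen, q and y are sheared."""
    q, p, x, y = s
    return np.array([q + d * dH_dp(x, p), p, x, y - d * dH_dq(x, p)])

def flow_C(s, d, omega):
    """Exact flow of omega * ((q - x)^2 + (p - y)^2) / 2: a rotation of
    the differences (q - x, p - y) by angle 2 * omega * d."""
    q, p, x, y = s
    c, sn = np.cos(2 * omega * d), np.sin(2 * omega * d)
    u, v = q - x, p - y
    u_new, v_new = c * u + sn * v, -sn * u + c * v
    return 0.5 * np.array([q + x + u_new, p + y + v_new,
                           q + x - u_new, p + y - v_new])

def step(s, d, omega=10.0):
    """Symmetric second-order composition A/2, B/2, C, B/2, A/2."""
    s = flow_A(s, d / 2)
    s = flow_B(s, d / 2)
    s = flow_C(s, d, omega)
    s = flow_B(s, d / 2)
    return flow_A(s, d / 2)

def jacobian_fd(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z."""
    cols = []
    for i in range(z.size):
        e = np.zeros_like(z)
        e[i] = eps
        cols.append((f(z + e) - f(z - e)) / (2 * eps))
    return np.stack(cols, axis=1)

# Symplectic matrix of the extended space for the ordering (q, p, x, y).
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
Omega = np.block([[J2, np.zeros((2, 2))], [np.zeros((2, 2)), J2]])

s0 = np.array([0.5, -0.3, 0.5, -0.3])   # auxiliary copy starts at (q, p)
M = jacobian_fd(lambda s: step(s, 0.05), s0)
symplectic_defect = np.max(np.abs(M.T @ Omega @ M - Omega))
```

Each sub-flow is an exact Hamiltonian flow in the extended space, so their composition is symplectic there regardless of how accurately $H$ itself has been learned.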

2.4 Symplectic Gyroceptrons for Nearly-Periodic Maps

Symplectic gyroceptrons approximate nearly-periodic, parameter-dependent symplectic maps on presymplectic manifolds. The architecture factors the learned symplectic diffeomorphism into component maps, including two that are compositions of (near-identity) Hénon layers, each being an explicit symplectomorphism. This structure enables the preservation of $U(1)$ symmetries and discrete-time adiabatic invariants, which is critical for long-time stability in nearly-integrable systems (Duruisseaux et al., 2022).
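A Hénon layer can be illustrated with the Hénon-like map $(x, y) \mapsto (y + \eta, -x + \nabla V(y))$ iterated four times, the construction commonly used for HénonNets. The sketch below assumes the simple potential $V(y) = c \sum_i \log\cosh(y_i)$, so that $\nabla V(y) = c \tanh(y)$; this potential, the scale `c` and the shift `eta` are illustrative assumptions rather than the parameterization of the cited work.

```python
import numpy as np

def henon_map(x, y, c, eta):
    """Hénon-like map (x, y) -> (y + eta, -x + grad V(y)), grad V = c * tanh."""
    return y + eta, -x + c * np.tanh(y)

def henon_layer(z, c, eta):
    """Four-fold iterate of the Hénon-like map on z = (x, y)."""
    n = z.size // 2
    x, y = z[:n], z[n:]
    for _ in range(4):
        x, y = henon_map(x, y, c, eta)
    return np.concatenate([x, y])

def jacobian_fd(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z."""
    cols = []
    for i in range(z.size):
        e = np.zeros_like(z)
        e[i] = eps
        cols.append((f(z + e) - f(z - e)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
n = 2
z0 = rng.standard_normal(2 * n)
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])

# With V = 0 and eta = 0 the map is a quarter turn, so four iterates
# return every point to itself.
identity_error = np.max(np.abs(henon_layer(z0, 0.0, np.zeros(n)) - z0))

# Small c and eta give a near-identity layer.
near_identity = np.linalg.norm(henon_layer(z0, 1e-3, 1e-3 * np.ones(n)) - z0)

# A generic layer is still symplectic.
eta = 0.2 * np.ones(n)
M = jacobian_fd(lambda z: henon_layer(z, 0.5, eta), z0)
symplectic_defect = np.max(np.abs(M.T @ J @ M - J))
```

The four-fold iteration is what makes a layer with small parameters close to the identity, while symplecticity holds for any parameter values.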

3. Universal Approximation Theory and Representation Properties

SympNets, P-SympNets, and related architectures are proven to be universal approximators for the space of symplectic diffeomorphisms on compact sets, provided the span of basis Hamiltonians is dense. Explicitly, for any target symplectomorphism and any $\varepsilon > 0$, there exist layer parameters such that the composite map approximates the true flow uniformly to accuracy $\varepsilon$ (Tapley, 2024, Jin et al., 2020).

For linear systems with quadratic Hamiltonians, P-SympNets can represent any linear symplectic map as a product of a bounded number of quadratic layers, and fewer layers suffice when the map has a special form determined by a symmetric matrix. This has been proven analytically from classical symplectic matrix factorization results (Jin–Lin–Xiao). A plausible implication is highly efficient exact surrogates for large-scale linear Hamiltonian systems (Tapley, 2024).

4. Structure-Preserving Training and Optimization

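In all symplectomorphism networks, the symplectic property is enforced structurally at the layer level; no projection or Jacobian regularizer is required. Loss functions are standard regression objectives, typically the mean squared error (MSE) between the predicted flow and ground-truth integrator outputs. Writing $\Phi_\theta$ for the network map and $z_i$, $i = 1, \dots, N$, for the training states, this reads:

$$\mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left\| \Phi_\theta(z_i) - \phi^H_h(z_i) \right\|^2$$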

In NSSNN, the loss sums prediction errors over both the original and auxiliary variables, with the binding term that couples them acting as a stabilizer (Xiong et al., 2020). Symplectic gyroceptrons similarly use a standard output loss, as their architecture ensures all formal symmetries are preserved by design (Duruisseaux et al., 2022). Empirically, optimization proceeds with Adam or similar optimizers, and hyperparameters are consistent with conventional deep learning practice.
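The point that training needs no symplectic penalty can be shown with a toy example. The sketch below fits a single shear layer with one scalar parameter to synthetic data using a plain MSE loss and hand-written gradient descent. The layer form, the ground-truth value `a_true`, the learning rate and the use of gradient descent instead of Adam are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(p, q, a):
    """Shear p <- p + a * tanh(q); symplectic for every real a."""
    return p + a * np.tanh(q), q

# Synthetic training pairs from a hypothetical ground-truth parameter.
a_true = 0.3
p = rng.standard_normal(256)
q = rng.standard_normal(256)
p_target, q_target = layer(p, q, a_true)

a, lr = 0.0, 0.5
losses = []
for _ in range(200):
    p_pred, q_pred = layer(p, q, a)
    residual = p_pred - p_target                 # q is unchanged by this layer
    losses.append(np.mean(residual ** 2))        # plain MSE, no extra penalty
    grad = np.mean(2.0 * residual * np.tanh(q))  # d(loss) / d(a)
    a -= lr * grad
```

Every iterate of `a` defines an exactly symplectic map, so the optimizer only has to reduce the regression error; structure preservation never competes with the data fit.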

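A non-vanishing gradient property holds for compositions of symplectic layers: the Jacobian of a symplectomorphism is a symplectic matrix, whose singular values occur in reciprocal pairs $(\sigma, 1/\sigma)$, so its determinant is 1 and its largest singular value is at least 1. Deep SympNet architectures therefore avoid gradient collapse as depth increases (Tapley, 2024).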
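The spectral structure behind this property can be checked numerically. The sketch below builds a symplectic matrix as a product of random unitriangular shears (an illustrative construction with hypothetical names) and inspects its singular values: they occur in reciprocal pairs, so the largest is at least 1 and the smallest is at most 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])

def random_shear(upper):
    """Unitriangular symplectic matrix with a symmetric off-diagonal block."""
    A = 0.3 * rng.standard_normal((n, n))
    S = A + A.T
    return np.block([[I, S], [Z, I]]) if upper else np.block([[I, Z], [S, I]])

# Product of alternating upper and lower shears, mimicking a layer Jacobian.
M = np.eye(2 * n)
for k in range(4):
    M = random_shear(upper=(k % 2 == 0)) @ M

sigma = np.linalg.svd(M, compute_uv=False)   # sorted in descending order
pair_products = sigma * sigma[::-1]          # sigma_i * sigma_{2n+1-i}

print(np.round(pair_products, 8))
print(sigma[0] >= 1.0, sigma[-1] <= 1.0)
```

Since the backpropagated gradient is multiplied by the transposed Jacobian, a spectral norm of at least 1 means there is always a direction along which the gradient is not attenuated.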

5. Performance, Empirical Evaluation, and Practical Use Cases

Extensive evaluations demonstrate the advantages of symplectomorphism networks:

  • Separable and Nonseparable Systems: NSSNNs achieve the lowest long-term trajectory and energy errors across separable systems (pendulum, Lotka–Volterra, harmonic spring) and nonseparable systems (Hénon–Heiles, Fourier-truncated nonlinear Schrödinger), and remain robust with 5–40% training noise. Notably, they separate vortex flows where HNN and naive NeuralODE baselines fail (Xiong et al., 2020).
  • Universal Surrogates: SympNets reach a lower MSE than alternatives while using equal or fewer parameters, are effective for both regular and irregular data, and successfully approximate high-dimensional and chaotic regimes (Fermi–Pasta–Ulam, double pendulum, three-body) (Tapley, 2024, Jin et al., 2020).
  • Efficiency and Scalability: Symplectic gyroceptrons enable surrogate models that operate over long timescales with negligible drift of adiabatic invariants, significantly accelerating the simulation of slow–fast systems relative to classical RK4 in some multiscale benchmarks (Duruisseaux et al., 2022).
  • Symbolic Regression: P-SympNets, combined with backward error analysis, recover symbolic forms of polynomial Hamiltonians with low coefficient mean absolute error for moderate polynomial degrees (Tapley, 2024).

Summary of Selected Empirical Results:

System / Metric | Network Type | Example Error / Property
Double pendulum, MSE | LA-SympNet | Low MSE
N-body vortex, trajectory | NSSNN | Faithful long-term separation, no collapse
Charged oscillator | Gyroceptron | Small adiabatic invariant error
Linear, high-dim | P-SympNet | Machine precision recovery

6. Invariants, Symmetries, and Long-Time Properties

Enforcing symplecticity guarantees preservation of geometric invariants such as energy and adiabatic invariants up to the order of the discretization or architecture-induced error. For nearly-periodic systems, the symplectic gyroceptron architecture admits a formal $U(1)$ rotational symmetry to all orders in the perturbation parameter, and by a formal Noether's theorem construction it yields discrete-time adiabatic invariants that show no secular drift over extremely long integration times (Duruisseaux et al., 2022). In backward error analysis, the true motion of a learned symplectomorphism network is governed by a modified Hamiltonian that agrees with the original up to computable higher-order terms, making explicit the source and order of any drift (Tapley, 2024).
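The modified-Hamiltonian picture can be made concrete on a classical example. The sketch below assumes the harmonic oscillator $H = (p^2 + q^2)/2$ with the convention $\dot q = p$, $\dot p = -q$, and uses the symplectic Euler map as a stand-in for a learned symplectic layer. For this map the modified energy $\tilde H = H - \tfrac{h}{2} p q$ is conserved exactly, while $H$ itself oscillates within a bounded range.

```python
import numpy as np

def symplectic_euler(p, q, h):
    """Update p with the old q, then q with the new p (a symplectic map)."""
    p_new = p - h * q
    q_new = q + h * p_new
    return p_new, q_new

def energy(p, q):
    return 0.5 * (p ** 2 + q ** 2)

def modified_energy(p, q, h):
    """Exactly conserved quantity of the map above: H - (h / 2) * p * q."""
    return energy(p, q) - 0.5 * h * p * q

h, steps = 0.1, 10000
p, q = 0.0, 1.0
H_values, H_mod_values = [], []
for _ in range(steps):
    H_values.append(energy(p, q))
    H_mod_values.append(modified_energy(p, q, h))
    p, q = symplectic_euler(p, q, h)

drift_H = np.max(np.abs(np.array(H_values) - H_values[0]))
drift_H_mod = np.max(np.abs(np.array(H_mod_values) - H_mod_values[0]))
print(drift_H, drift_H_mod)
```

The energy error stays of order $h$ for all times instead of growing, and the difference between $H$ and $\tilde H$ identifies where that bounded error comes from.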

7. Extensions, Limitations, and Outlook

Symplectomorphism networks have been extended to accommodate variable time steps and sparse or irregular data by parameterizing the flow with respect to step size, and can recover vector fields directly from learned maps. All current architectures are a priori symplectic; no post hoc symplectification is required. Some limitations remain for systems with significant non-Hamiltonian perturbations or in dissipative/noncanonical phase space settings, but the universal approximation results and empirical scalability suggest broad applicability in geometric machine learning, large-scale surrogate modeling, and symbolic regression of dynamical systems (Tapley, 2024, Jin et al., 2020, Duruisseaux et al., 2022, Xiong et al., 2020).
