
SE(3)-Equivariant Tensor Field Networks

Updated 26 February 2026
  • SE(3)-Equivariant Tensor Field Networks are geometric deep learning models that enforce equivariance to 3D rotations and translations via tensor representations.
  • Within dual-scale flow matching frameworks, they decompose molecular structures into coarse-grained and all-atom representations, improving computational efficiency and accuracy.
  • The architecture integrates spherical harmonics and learned tensor operations to maintain symmetry compliance, enabling high-fidelity 3D data generation for applications in molecular physics and computer vision.

An SE(3)-equivariant Tensor Field Network (TFN) is a geometric deep learning architecture designed to model and generate 3D data while guaranteeing equivariance with respect to the SE(3) group of rigid-body transformations, comprising all 3D rotations (SO(3)) and translations (ℝ³). Such networks leverage the TFN paradigm to ensure that operations on coordinates (and associated tensors) commute with global Euclidean transformations, which is essential for molecular physics, chemistry, 3D computer vision, and related applications where canonical orientation or placement in space is arbitrary.

1. Geometric Equivariance and Tensor Field Networks

SE(3) equivariance is the property whereby, for a neural operator f acting on input data x transformed as x' = Rx + τ (where R ∈ SO(3), τ ∈ ℝ³), the output transforms as f(x') = R f(x). TFNs enforce this by representing features as geometric tensors (e.g., scalars, vectors, higher-order tensors) and using learned operations (including kernel convolutions and spherical harmonics) that are strictly equivariant under SE(3).

In practice, nodes in the underlying molecular or point-cloud graph carry both invariant features (e.g., atom type, bond context) and equivariant features (coordinates, velocities). Edges encode pairwise relationships such as bond type or spatial distance.
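The equivariance constraint can be verified numerically for any candidate architecture. The following sketch (illustrative, not from the paper) builds a toy vector field from pairwise coordinate differences, a construction that is SE(3)-equivariant by design, and checks f(Rx + τ) = R f(x) for a random rotation and translation:

```python
import numpy as np

def pairwise_vector_field(x):
    """Toy SE(3)-equivariant vector field: each point is assigned the sum of
    unit vectors pointing toward every other point. Built from pairwise
    differences, so translations cancel and rotations commute with the output."""
    diff = x[None, :, :] - x[:, None, :]           # (N, N, 3) pairwise differences
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)
    dist[dist == 0] = 1.0                          # avoid division by zero on the diagonal
    return (diff / dist).sum(axis=1)               # (N, 3)

def random_rotation(rng):
    """Random rotation matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))                       # fix column signs
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1                              # ensure det = +1 (proper rotation)
    return q

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
R, tau = random_rotation(rng), rng.standard_normal(3)

lhs = pairwise_vector_field(x @ R.T + tau)         # f(Rx + τ)
rhs = pairwise_vector_field(x) @ R.T               # R f(x)
assert np.allclose(lhs, rhs, atol=1e-10)
```

The same numerical check applies to any claimed-equivariant layer; a TFN passes it by construction rather than by approximation.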

2. Dual-Scale Flow Matching Framework

TFNs are central components in dual-scale flow matching frameworks for generative modeling of 3D structures, most notably in the generation of molecular clusters (Subramanian et al., 2024). The state space is decomposed into a coarse-grained (CG) representation, with M beads, and an all-atom (AA) representation, with N atoms. This two-stage approach exploits hierarchies in molecular structure:

  • The CG flow:
    • c ∈ ℝ^{M×3} (bead coordinates)
    • Coarse potential U^CG(c)
    • Objective: Learn v_θ^CG(c, t) via flow matching from a simple prior to p₁^CG(c) ∝ exp(−β U^CG(c))
  • The AA flow:
    • x ∈ ℝ^{N×3} (atom coordinates)
    • Full potential U^AA(x)
    • Conditional on the generated CG configuration c₁, learn v_φ^AA(x, t | c₁) to sample p₁^AA(x) ∝ exp(−β U^AA(x))

Both flows are modeled by SE(3)-equivariant TFNs, ensuring physical symmetry at both levels.

3. Mathematical Formulation and Loss Functions

The flow-matching objective learns the velocity field of a deterministic transport process along a continuous path from a simple prior to the data distribution. The supervised regression target at each time t is the known velocity between initial and target configurations:

ℒ_FM = E_{t, z₀, z₁} ‖ v_θ(z_t, t) − (z₁ − z₀) ‖²

where z_t = (1 − t) z₀ + t z₁.
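Under the linear interpolation path, this loss is straightforward to estimate by Monte Carlo. A minimal NumPy sketch (function names are hypothetical; in the paper the learned field would be a TFN, not a closed-form expression):

```python
import numpy as np

def flow_matching_loss(v_theta, z0, z1, t):
    """Monte Carlo estimate of L_FM = E || v_theta(z_t, t) - (z1 - z0) ||^2
    for the linear interpolation path z_t = (1 - t) z0 + t z1."""
    z_t = (1.0 - t)[:, None, None] * z0 + t[:, None, None] * z1
    target = z1 - z0                               # known velocity along the linear path
    residual = v_theta(z_t, t) - target
    return np.mean(np.sum(residual ** 2, axis=(1, 2)))

rng = np.random.default_rng(1)
B, N = 8, 4                                        # batch size, number of particles
z0 = rng.standard_normal((B, N, 3))                # samples from the simple prior
z1 = rng.standard_normal((B, N, 3))                # samples from the target distribution
t = rng.uniform(size=B)

# The loss vanishes exactly for the oracle field that returns z1 - z0.
oracle = lambda z_t, t: z1 - z0
assert flow_matching_loss(oracle, z0, z1, t) < 1e-12
```

The oracle check illustrates why the regression target is well-posed: along a linear path the conditional velocity is constant in time.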

For dual-scale flows:

  • CG flow: ℒ_CG as above, applied to bead coordinates
  • AA flow: ℒ_AA, conditioned on c₁

The flows are solved via explicit ODE integration; TFN architectures parameterize the time-dependent velocity fields.
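A minimal explicit Euler integrator for such a velocity field might look like the sketch below (illustrative; `v` stands in for the trained TFN). As a sanity check, the constant field v = z₁ − z₀, the regression target for one training pair, transports z₀ exactly to z₁ in unit time:

```python
import numpy as np

def integrate_flow(v, z0, n_steps=100):
    """Explicit Euler integration of dz/dt = v(z, t) from t = 0 to t = 1."""
    z, dt = z0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        z = z + dt * v(z, k * dt)
    return z

rng = np.random.default_rng(2)
z0 = rng.standard_normal((4, 3))
z1 = rng.standard_normal((4, 3))

# Constant velocity field: Euler integration reaches z1 exactly at t = 1.
z_final = integrate_flow(lambda z, t: z1 - z0, z0)
assert np.allclose(z_final, z1, atol=1e-8)
```

In practice higher-order or adaptive ODE solvers can be substituted; the step count on the CG system dominates the overall cost.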

4. Architectural Characteristics and Equivariance Enforcement

The SE(3)-equivariant TFN backbone operates on molecular or point-cloud graphs with the following essentials:

  • Nodes: Carry both 3D coordinates (equivariant) and feature vectors (invariant, e.g., atom type, aromaticity).
  • Edges: Encode relationships such as bond type (invariant).
  • Layers: Employ learned spherical harmonics and tensor products to ensure that for any rigid transformation (R, τ), the network satisfies

v_θ(Rc + τ, t) = R v_θ(c, t)

and, in the AA stage, v_φ(Rx + τ, t | Rc₁ + τ) = R v_φ(x, t | c₁).

  • Alternatives: Benchmarked backbones include E(3)-GNN and Attentive FP; the TFN backbone achieves the lowest (best) Jensen–Shannon divergence to reference distributions, i.e., the most physically realistic samples.
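The conditional equivariance constraint of the AA stage can likewise be checked numerically. The sketch below (illustrative, not the paper's architecture) uses a toy conditional field built from differences between atom coordinates and the bead centroid, which satisfies the constraint by construction:

```python
import numpy as np

def conditional_field(x, c):
    """Toy AA-stage field: each atom is pulled toward the mean bead position.
    Built from differences x_i - mean(c), so it satisfies
    v(Rx + tau | Rc + tau) = R v(x | c)."""
    return c.mean(axis=0, keepdims=True) - x       # (N, 3)

def random_rotation(rng):
    """Random rotation matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1                              # ensure det = +1
    return q

rng = np.random.default_rng(3)
x = rng.standard_normal((12, 3))                   # N = 12 atom coordinates
c = rng.standard_normal((4, 3))                    # M = 4 bead coordinates
R, tau = random_rotation(rng), rng.standard_normal(3)

lhs = conditional_field(x @ R.T + tau, c @ R.T + tau)  # v(Rx + τ | Rc + τ)
rhs = conditional_field(x, c) @ R.T                    # R v(x | c)
assert np.allclose(lhs, rhs, atol=1e-10)
```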

5. Training and Inference Pipeline

Training proceeds in two decoupled stages:

  1. CG flow training: Minimize LCG\mathcal{L}_{\text{CG}} via stochastic sampling of CG configurations.
  2. AA flow training: Minimize LAA\mathcal{L}_{\text{AA}}, conditioning on ground-truth bead coordinates. This separation allows for efficient learning and wall-clock speedups: most ODE integration steps are performed on the much smaller CG system.

At inference, a CG sample is first drawn and integrated via the CG flow. The resulting bead coordinates condition the AA flow, generating the full atomistic sample efficiently and equivariantly.
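The two-stage inference procedure above can be sketched as follows (NumPy, with placeholder velocity fields standing in for the trained TFNs; all function names are hypothetical):

```python
import numpy as np

def integrate(v, z0, n_steps=50):
    """Explicit Euler integration of dz/dt = v(z, t) on [0, 1]."""
    z, dt = z0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        z = z + dt * v(z, k * dt)
    return z

def sample_dual_scale(v_cg, v_aa, M, N, rng, cg_steps=50, aa_steps=10):
    """Two-stage inference: integrate the CG flow on M beads, then the AA flow
    on N atoms conditioned on the final bead coordinates c1. Most integration
    steps run on the much smaller CG system."""
    c1 = integrate(v_cg, rng.standard_normal((M, 3)), cg_steps)
    x1 = integrate(lambda x, t: v_aa(x, t, c1),
                   rng.standard_normal((N, 3)), aa_steps)
    return c1, x1

# Placeholder velocity fields standing in for the trained TFNs.
v_cg = lambda c, t: -c                             # contracts beads toward the origin
v_aa = lambda x, t, c1: c1.mean(axis=0) - x        # pulls atoms toward the bead centroid

rng = np.random.default_rng(4)
c1, x1 = sample_dual_scale(v_cg, v_aa, M=10, N=30, rng=rng)
assert c1.shape == (10, 3) and x1.shape == (30, 3)
```

The `cg_steps`/`aa_steps` split mirrors the paper's allocation of most ODE steps to the coarse level.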

6. Empirical Results and Computational Advantages

Dual-scale SE(3)-equivariant TFN flow matching achieves substantial gains in both fidelity and computational cost over single-scale or non-equivariant alternatives (Subramanian et al., 2024):

Method                     Bond JSD ↓   Angle JSD ↓   Time/step (s) ↓
Single-scale (Gaussian)    0.6563       0.6316        0.2949
Single-scale (Harmonic)    0.6298       0.6066        0.3039
Dual-scale (30:10 split)   0.5472       0.4610        0.0496

Increasing the proportion of CG steps further decreases inference time with negligible fidelity loss. This efficiency arises from performing most ODE integration at the coarse level (M ≪ N), while SE(3)-equivariance preserves physically correct generation.
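From the table above, the per-step speedup of the dual-scale model over the single-scale Gaussian baseline works out to roughly 5.9×:

```python
# Per-step inference times from the table above (seconds).
single_scale = 0.2949   # Single-scale (Gaussian)
dual_scale = 0.0496     # Dual-scale (30:10 split)

speedup = single_scale / dual_scale
print(f"~{speedup:.1f}x faster per step")  # prints "~5.9x faster per step"
```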

7. Relevance and Context within SE(3)-Equivariant Modeling

SE(3)-equivariant TFNs have become a standard backbone for generative and discriminative geometric learning tasks where the data has no canonical orientation or position. The dual-scale framework, as operationalized for molecular sampling, enables accurate, efficient simulations that single-scale or non-equivariant methods cannot match. Direct enforcement of SE(3) equivariance via TFN layers guarantees indistinguishability under global rigid motions, a strict requirement in molecular, physical, and some 3D perception applications. TFN-equipped flow matching methods set a strong standard for generative 3D modeling in these domains (Subramanian et al., 2024).
