LLoCa-Transformer: Exact Lorentz Equivariance
- LLoCa-Transformer is a neural architecture that provides exact Lorentz equivariance by canonicalizing features into learned local reference frames.
- It integrates standard neural layers with equivariant frame prediction and tensorial message passing to efficiently handle space-time tensor data.
- Empirical results demonstrate improved performance and computational efficiency on jet tagging and QFT amplitude regression benchmarks.
The LLoCa-Transformer is a generic neural architecture designed to endow any backbone—such as transformers and graph neural networks—with exact Lorentz equivariance. Built on the Lorentz Local Canonicalization (LLoCa) framework, it operates by learning Lorentz-equivariant local reference frames for each entity (e.g., particle) within the input, canonicalizing features into these local frames, and then leveraging standard neural layers. This approach enables seamless propagation of space-time tensorial information while eliminating the architectural constraints of previous Lorentz-equivariant models, and achieves state-of-the-art accuracy and computational efficiency on challenging high-energy physics benchmarks (Spinner et al., 26 May 2025, Favaro et al., 20 Aug 2025).
1. Theoretical Foundations of LLoCa-Transformer
Lorentz symmetry underpins fundamental interactions in high-energy physics, where observed data such as four-momenta transform under the proper orthochronous Lorentz group $\mathrm{SO}^+(1,3)$. Traditional Lorentz-equivariant neural networks deploy bespoke convolution or message-passing layers, severely limiting architectural flexibility. LLoCa overcomes this limitation by decoupling equivariance from architectural constraints.
The central construct is the prediction, for each input object $i$, of an equivariant local reference frame $L_i \in \mathrm{SO}^+(1,3)$ such that under a global Lorentz transformation $\Lambda$,
$$L_i \;\longrightarrow\; L_i \Lambda^{-1},$$
and any subsequent canonicalization of a feature $x_i$ into the local frame transforms as
$$L_i x_i \;\longrightarrow\; \left(L_i \Lambda^{-1}\right)\left(\Lambda x_i\right) = L_i x_i,$$
rendering $L_i x_i$ exactly invariant under $\Lambda$.
By expressing all physics features in these locally canonicalized variables before feeding them through arbitrary neural backbone layers, the output can finally be de-canonicalized to restore the correct equivariant transformation law. This construction guarantees exact equivariance at negligible overhead and removes the requirement for special Lorentz layers (Spinner et al., 26 May 2025).
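A minimal numerical sketch (with NumPy, and with hypothetical hand-picked frames rather than a learned Frames-Net) illustrates why this canonicalize/de-canonicalize pattern yields exact equivariance:

```python
import numpy as np

# Minkowski metric, signature (+, -, -, -)
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_z(rapidity):
    """Pure Lorentz boost along the z-axis (an element of SO+(1,3))."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    B = np.eye(4)
    B[0, 0] = B[3, 3] = ch
    B[0, 3] = B[3, 0] = sh
    return B

def rot_z(angle):
    """Spatial rotation about the z-axis, embedded in SO+(1,3)."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.eye(4)
    R[1, 1], R[1, 2], R[2, 1], R[2, 2] = c, -s, s, c
    return R

x = np.array([10.0, 1.0, 2.0, 3.0])      # a four-momentum (E, px, py, pz)
L = rot_z(0.3) @ boost_z(0.7)            # hypothetical local frame assigned to this particle
Lam = boost_z(1.2) @ rot_z(-0.5)         # a global Lorentz transformation of the event

# Canonicalized features are invariant: the frame absorbs the global transformation.
assert np.allclose(L @ x, (L @ np.linalg.inv(Lam)) @ (Lam @ x))

# De-canonicalized outputs are equivariant: a backbone acting on invariants
# (here a trivial rescaling) yields outputs that transform with Lambda.
y_canon = 2.0 * (L @ x)
y = np.linalg.inv(L) @ y_canon
y_transformed = np.linalg.inv(L @ np.linalg.inv(Lam)) @ y_canon
assert np.allclose(y_transformed, Lam @ y)
```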
2. Architecture and Algorithmic Structure
The LLoCa-transformer comprises four operational components:
- Equivariant Frame Prediction: For particles $i$ with input features (typically four-momenta $p_j$ and optional scalars), a Frames-Net predicts three four-vectors per object,
$$v_i^{(k)} = \sum_j c_{ij}^{(k)}\, p_j, \qquad k = 1, 2, 3.$$
Here, the coefficients $c_{ij}^{(k)}$ are produced by a small MLP on Lorentz scalars such as $\langle p_i, p_j \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the Minkowski product. From $v_i^{(1)}, v_i^{(2)}, v_i^{(3)}$, the frame $L_i$ is constructed via a polar decomposition $L_i = R_i B_i$: $B_i$ is a boost built from $v_i^{(1)}$, and Gram–Schmidt orthonormalization using $v_i^{(2)}$ and $v_i^{(3)}$ fixes the rotation $R_i$, yielding $L_i \in \mathrm{SO}^+(1,3)$ (see the frame-construction sketch after this list).
- Canonicalization: For any four-vector feature $x_i$, canonicalization is performed as
$$\tilde{x}_i = L_i\, x_i.$$
For tensorial objects, the appropriate group representation $\rho$ is used,
$$\tilde{x}_i = \rho(L_i)\, x_i.$$
This ensures all features processed by the transformer are Lorentz-invariant.
- Standard Transformer Stack on Canonicalized Features: The canonicalized inputs enter an unmodified transformer or other neural backbone, with linear query/key/value projections, multi-head self-attention, feed-forward layers, residual connections, and normalization layers. Equivariance is maintained throughout, because all operations act on Lorentz-invariant quantities.
- Tensorial Message Passing: Attention and message aggregation across local frames are conducted via the inter-frame transformation
$$\Lambda_{ij} = L_i\, L_j^{-1},$$
which re-expresses a feature canonicalized in frame $j$ in the receiver's frame $i$ as $\rho(\Lambda_{ij})\,\tilde{x}_j$. Here, all contractions use the Minkowski product. The general tensorial message-passing update is
$$\tilde{x}_i' = \psi\Big(\tilde{x}_i,\; \sum_j \phi\big(\tilde{x}_i,\, \rho(\Lambda_{ij})\,\tilde{x}_j\big)\Big)$$
for arbitrary maps $\phi, \psi$ and a permutation-invariant sum over $j$ (a toy sketch follows after this list).
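The following sketch shows one plausible realization of the frame construction from three predicted four-vectors; the boost and Gram–Schmidt conventions, as well as the function names, are illustrative assumptions rather than the reference implementation, and the coefficient-predicting MLP is omitted.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric, signature (+, -, -, -)

def mink(a, b):
    """Minkowski product a^mu eta_{mu nu} b^nu."""
    return a @ ETA @ b

def boost_to_rest(v):
    """Pure boost B with B @ v proportional to (1, 0, 0, 0); v must be timelike and future-pointing."""
    u = v / np.sqrt(mink(v, v))          # unit timelike vector (gamma, gamma * beta)
    gamma, gb = u[0], u[1:]
    B = np.eye(4)
    B[0, 0] = gamma
    B[0, 1:] = B[1:, 0] = -gb
    B[1:, 1:] += np.outer(gb, gb) / (1.0 + gamma)
    return B

def frame_from_vectors(v1, v2, v3):
    """Assemble a local frame L = R @ B from three predicted four-vectors:
    B boosts into the rest frame of v1, and Gram-Schmidt on the boosted spatial
    parts of v2 and v3 fixes the rotation R."""
    B = boost_to_rest(v1)
    a, b = (B @ v2)[1:], (B @ v3)[1:]    # spatial parts in the rest frame of v1
    e1 = a / np.linalg.norm(a)
    b = b - (b @ e1) * e1                # Gram-Schmidt step
    e2 = b / np.linalg.norm(b)
    e3 = np.cross(e1, e2)                # right-handed third axis
    R = np.eye(4)
    R[1:, 1:] = np.stack([e1, e2, e3])   # rotation aligning (e1, e2, e3) with (x, y, z)
    return R @ B

# Sanity check: the result is a proper Lorentz transformation, L^T eta L = eta.
v1 = np.array([5.0, 1.0, 0.0, 2.0])
v2 = np.array([1.0, 2.0, 0.0, 0.0])
v3 = np.array([0.0, 0.0, 3.0, 1.0])
L = frame_from_vectors(v1, v2, v3)
assert np.allclose(L.T @ ETA @ L, ETA)
```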
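And a toy sketch of the inter-frame transport used in tensorial message passing, restricted to the four-vector representation and with the maps $\phi$, $\psi$ collapsed to a bare sum; the function names are illustrative, not the paper's API.

```python
import numpy as np

def interframe(L_i, L_j):
    """Transformation Lambda_ij = L_i L_j^{-1}, carrying a feature canonicalized
    in frame j into frame i."""
    return L_i @ np.linalg.inv(L_j)

def vector_message_passing(frames, vec_feats):
    """Toy aggregation for four-vector features: every neighbor's canonicalized
    feature is re-expressed in the receiver's frame before a permutation-invariant sum.

    frames:    (N, 4, 4) local frames L_i
    vec_feats: (N, 4) canonicalized four-vector features
    """
    N = len(frames)
    out = np.zeros_like(vec_feats)
    for i in range(N):
        for j in range(N):
            out[i] += interframe(frames[i], frames[j]) @ vec_feats[j]
    return out
```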
3. Data Augmentation and the Local/Global Frame Dichotomy
By fixing $L_i = \Lambda_{\text{glob}}$ (a random global Lorentz transformation) for all objects $i$ within an event, canonicalization reduces to traditional Lorentz data augmentation: each event is preprocessed by a single global transform before being processed by an ordinary non-equivariant network. Thus, standard augmentation becomes a particular instance of LLoCa, while learning a distinct $L_i$ for each object yields exact equivariance. This perspective unifies data augmentation and equivariant preprocessing (Spinner et al., 26 May 2025).
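As a minimal self-contained illustration (the boost helper and toy data are assumptions for the example), fixing one shared frame per event reproduces event-level augmentation:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_z_boost(rapidity):
    """A pure boost along the z-axis, used here as the shared global transformation."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    B = np.eye(4)
    B[0, 0] = B[3, 3] = ch
    B[0, 3] = B[3, 0] = sh
    return B

def canonicalize_event(momenta, frames):
    """Canonicalize each particle's four-momentum into its own frame: (N, 4), (N, 4, 4) -> (N, 4)."""
    return np.einsum('nab,nb->na', frames, momenta)

momenta = rng.normal(size=(8, 4))                    # toy event with 8 particles
Lam_glob = random_z_boost(rng.normal())              # one shared global Lorentz transformation
frames_aug = np.broadcast_to(Lam_glob, (8, 4, 4))    # LLoCa with identical frames for every particle

# With a shared frame, canonicalization is just a global transform of the whole event,
# i.e. ordinary Lorentz data augmentation; distinct per-particle frames give exact equivariance.
assert np.allclose(canonicalize_event(momenta, frames_aug), momenta @ Lam_glob.T)
```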
4. Empirical Performance and Ablations
LLoCa-transformers provide substantial performance gains on multiple LHC-relevant tasks:
| Task | Baseline | LLoCa-Transformer | Specialized Lorentz-GNN (L-GATr) |
|---|---|---|---|
| Jet tagging (JetClass) | 85.5% (AUC 0.9867) | 86.4% (AUC 0.9882) | Similar AUC, but 4× slower |
| QFT amplitude regression | higher MSE (vanilla) | lowest MSE | higher MSE than LLoCa-Transformer |
Notable ablations and findings include:
- Tensorial message passing (combining scalar and vector features) is critical; all-scalar attention results in >30× higher regression MSE.
- Using the Minkowski metric within attention rather than a Euclidean metric halves the MSE.
- Frames-Net capacity requirements are low; small MLPs with dropout suffice.
- For large datasets, exact equivariance outperforms even optimized data augmentation; for very small samples, data augmentation may marginally outperform due to inductive bias.
Computationally, LLoCa-transformers achieve state-of-the-art results with only a modest overhead in FLOPs and training time relative to the non-equivariant backbone, yet remain several times faster and require several times fewer forward FLOPs than previous state-of-the-art Lorentz-equivariant architectures (Spinner et al., 26 May 2025, Favaro et al., 20 Aug 2025).
5. Symmetry Breaking and Subgroup Equivariance
Real-world high-energy experiments often feature only partial Lorentz invariance; event-level selection and detector design typically preserve only a subgroup of the full Lorentz group. LLoCa enables explicit control over which symmetries are enforced, both architecturally (by fixing vectors in the frame prediction network) and at the input level (by providing reference vectors or explicit coordinates).
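A minimal sketch of the input-level mechanism, with hypothetical reference vectors (the lab time axis and the beam axis) standing in for whatever subgroup a given analysis preserves:

```python
import numpy as np

# Hypothetical reference four-vectors: the lab time axis breaks boosts,
# the beam axis breaks rotations out of the transverse plane.
TIME_AXIS = np.array([1.0, 0.0, 0.0, 0.0])
BEAM_AXIS = np.array([0.0, 0.0, 0.0, 1.0])

def add_symmetry_breaking_inputs(momenta, references=(TIME_AXIS, BEAM_AXIS)):
    """Append fixed reference four-vectors to the particle list of an event.

    Downstream, the Frames-Net treats them like ordinary inputs, so the model is
    equivariant only under Lorentz transformations that leave the references
    invariant (e.g. rotations about the beam axis when both are supplied)."""
    return np.concatenate([momenta, np.stack(references)], axis=0)
```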
Empirical results show that for tasks like event generation, restricting equivariance to such a subgroup suffices; for jet tagging, optimal performance is obtained only when the symmetry is broken down to this subgroup. The LLoCa framework allows these breakings to be specified explicitly or learned by the network, making it suitable for rigorous studies of symmetry in practical collider analysis (Favaro et al., 20 Aug 2025).
6. Comparative Perspective and Applications
LLoCa-Transformer provides a universal mechanism for obtaining exact Lorentz equivariance in neural architectures for high-energy physics. Its broad compatibility ensures that any backbone (transformer, ParticleNet-style network, or graph network) can be "lifted" to Lorentz equivariance with only minor architectural modifications.
Key applications include:
- Jet tagging with large simulated and experimental datasets, achieving improvements in classification accuracy, AUC, and speed.
- Quantum field theory amplitude regression, outperforming all previously published equivariant GNNs by a substantial margin.
- End-to-end event generation in collider data, allowing training objectives expressed in the correct symmetry frame and facilitating fair comparisons across symmetry-breaking choices.
The ability to propagate higher-order tensorial features and to recover or exceed specialized architectures’ accuracy at a fraction of computational cost positions LLoCa-Transformer as a foundational tool in modern machine learning for collider and astroparticle physics (Spinner et al., 26 May 2025, Favaro et al., 20 Aug 2025).