AutoEFT: Automated EFT Operator Generation

Updated 17 December 2025
  • AutoEFT is a fully automated framework that constructs complete, minimal on-shell operator bases for effective field theories by integrating group-theory and tensor-algebra algorithms.
  • It systematically removes redundancies using equations of motion, integration-by-parts, and Fierz/Schouten identities to ensure only unique Lorentz and gauge-invariant interactions are included.
  • The framework integrates with transformer-based models to expedite the generation and verification of candidate operators for applications in SMEFT, BSM, and gravitational theories.

AutoEFT is a fully automated framework for constructing complete, minimal on-shell operator bases for effective field theories (EFTs), given arbitrary field content and symmetry groups. It augments classical group- and tensor-theory algorithms with codified removal of all operator redundancies from equations of motion (EoM), integration-by-parts (IBP) identities, Fierz and Schouten-type relations, and field relabeling symmetries. AutoEFT operates on the on-shell Hilbert space, systematically enumerating all permitted Lorentz and gauge-invariant interactions at a specified operator dimension, and outputs the result in human- and machine-readable form, streamlining EFT basis construction for the Standard Model, its extensions, and beyond (Schaaf, 2023, Harlander et al., 2023).

1. Mathematical Foundations and Redundancy Removal

AutoEFT builds operator bases directly, rather than over-generating candidate operators and eliminating redundancies post hoc. From the outset, only Lorentz- and gauge-invariant tensor structures not reducible by canonical identities are included.

  • Equations of Motion (EoM): Operators proportional to the classical equations of motion, $\delta S/\delta\Phi = 0$, are physically redundant on shell and are removed by construction. For gauge fields, any structure proportional to the field equation $\partial^\mu F_{\mu\nu} - J_\nu = 0$ is excluded.
  • Integration by Parts (IBP): Total derivatives, $\int d^4x\,\partial_\mu\mathcal{O}^\mu(x) = 0$, do not affect $S$-matrix elements and are pruned by restricting the placement of derivatives in operator generation.
  • Fierz and Schouten Identities: Bilinear and multilinear spinor products related by Fierz transformations are collapsed to canonical representatives. For example:

$$(\bar\psi_1\gamma^\mu\psi_2)\,(\bar\psi_3\gamma_\mu\psi_4) = -2\,(\bar\psi_1\psi_4)\,(\bar\psi_3\psi_2) + \dots$$

Similarly, SU($N$) group-theoretical redundancies (e.g., the Schouten identity for SU(2), $\epsilon^{ij}\epsilon^{kl}+\epsilon^{ik}\epsilon^{lj}+\epsilon^{il}\epsilon^{jk}=0$) are enforced to avoid duplicate invariants (Schaaf, 2023).

  • Repeated-Field Redundancy: For operators with multiple identical fields, permutation symmetries (field relabeling) are quotiented out after generating a "super-basis" treating fields as distinct.
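
The SU(2) Schouten identity quoted above can be checked by brute force over all index values. The following is a minimal sketch, not AutoEFT code; the names `EPS` and `schouten_sum` are illustrative only.

```python
from itertools import product

# Two-dimensional Levi-Civita symbol with EPS[0][1] = +1.
EPS = [[0, 1], [-1, 0]]


def schouten_sum(i, j, k, l):
    """Left-hand side of the SU(2) Schouten identity for fixed index values."""
    return (EPS[i][j] * EPS[k][l]
            + EPS[i][k] * EPS[l][j]
            + EPS[i][l] * EPS[j][k])


# The identity holds if the sum vanishes for all 16 index assignments.
violations = [idx for idx in product(range(2), repeat=4)
              if schouten_sum(*idx) != 0]
print("Schouten identity holds:", not violations)
```

The sum is antisymmetric in the three indices $j$, $k$, $l$, each of which takes only two values, so it must vanish. In practice this means that of the three ways to pair four SU(2) doublets with two epsilon tensors, only two are linearly independent.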

2. Core Algorithm and Implementation

The central AutoEFT workflow follows a deterministic, pruning-centric approach:

  1. Model Parsing: The user supplies a YAML model file specifying the field content (spin, Lorentz/gauge reps, hypercharges, number of generations) and symmetry groups (arbitrary SU($N$), U(1), local/global).
  2. Basis Generation:
    • Enumerate all multi-sets of fields and derivatives such that the total mass dimension matches the target.
    • For each such "family," build all Lorentz-invariant tensors (e.g., via Young tableau and epsilon/trace contractions) and internal symmetry tensors.
  3. Redundancy Elimination:
    • Prune structures forbidden by EoM/IBP at the time of monomial creation.
    • Apply Fierz/Schouten reductions for spinor and group-theoretical cases.
    • Implement symmetric group decompositions to identify and mod out field permutation symmetries (using precomputed $S_n$ generators for $n \leq 9$).
  4. Export and Formatting: The resulting operator basis is written in YAML, JSON, and LaTeX forms, including explicit index contractions and permutation symmetries (Harlander et al., 2023).

Pseudocode summary:

def AutoEFT_Basis(d, model):
    # Pseudocode: the capitalized helpers stand for the steps described above.
    fields, symmetry_groups = ParseModel(model)

    # Step 1: collect field/derivative families of total mass dimension d.
    families = []
    for F in Multisets(fields):                      # multisets with sum(dim) <= d
        for n_deriv in DerivativeAssignments(F, d):  # sum(dim) + #derivatives == d
            if EoM_forbidden_pattern(F, n_deriv):
                continue
            if IBP_total_derivative(F, n_deriv):
                continue
            families.append((F, n_deriv))

    # Step 2: build and contract invariant tensors family by family.
    invariants = []
    for fam in families:
        T_Lor = FindLorentzTensors(fam)
        T_Int = FindGroupInvariants(fam, symmetry_groups)
        invariants.extend(contract(L, I) for L in T_Lor for I in T_Int)

    # Step 3: mod out permutations of identical fields.
    return QuotientByFieldPermutations(invariants)
(Schaaf, 2023, Harlander et al., 2023)
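
The first enumeration step, listing field multisets and derivative counts that reach the target mass dimension, can be illustrated with a small runnable sketch. It is a toy under stated assumptions, not the AutoEFT implementation: the field content `FIELD_DIMS` and the function `enumerate_families` are hypothetical, only canonical mass dimensions are counted, and no Lorentz or gauge invariance is checked, so many of the listed families would yield no operators.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Hypothetical toy field content: name -> canonical mass dimension in 4D.
FIELD_DIMS = {
    "phi": Fraction(1),     # scalar
    "psi": Fraction(3, 2),  # Weyl fermion
    "F": Fraction(2),       # field-strength tensor
}


def enumerate_families(target_dim, field_dims=FIELD_DIMS, min_fields=3):
    """Return (field multiset, number of derivatives) pairs with total
    mass dimension equal to target_dim."""
    target = Fraction(target_dim)
    max_fields = int(target / min(field_dims.values()))
    families = []
    for n in range(min_fields, max_fields + 1):
        for multiset in combinations_with_replacement(sorted(field_dims), n):
            remainder = target - sum(field_dims[f] for f in multiset)
            # Each derivative adds one unit of mass dimension, so the
            # leftover dimension must be a non-negative integer.
            if remainder >= 0 and remainder.denominator == 1:
                families.append((multiset, int(remainder)))
    return families


for fields, n_deriv in enumerate_families(5):
    print(fields, "derivatives:", n_deriv)
```

The default `min_fields=3` mirrors the restriction to operators with at least three fields. Exact rational arithmetic via `Fraction` avoids floating-point issues with the half-integer fermion dimension.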

3. Supported Theories, Input/Output, and Extensions

AutoEFT supports an extensive range of EFTs as defined by user-supplied fields and symmetry groups:

  • Fields: Scalars, Weyl and Dirac spinors, and gauge bosons (Lorentz representations up to spin 2, including Weyl and Riemann tensors for gravity).
  • Symmetries: SU($N$), U(1) factors, local or global; extensions to SO($N$), Sp($N$) possible via invariant tensor implementation.
  • Sample Input File: Basic YAML with fields and symmetries:
    name: SMEFT
    symmetries:
      u1_groups: {U1_Y: {}}
      sun_groups: {SU3_C: {N: 3}, SU2_L: {N: 2}}
    fields:
      QL: {representations: {Lorentz: [1/2, 0], SU3_C: [1], SU2_L: [1]}, generations: 3}
      H: {representations: {Lorentz: [0], SU2_L: [1]}, tex: H}
      # Additional fields as needed
  • Outputs: Per-dimension operator catalog (YAML/JSON), LaTeX for each operator, machine-readable basis structure. For example, the dimension-5 Weinberg operator for neutrino mass is output in both YAML and LaTeX:

$$\mathcal{O}_\nu = \epsilon^{\alpha\beta}\,\epsilon^{ik}\,\epsilon^{jl}\,(L^{w}_{i\alpha}\,L^{x}_{j\beta})\,H_k\,H_l$$

(Harlander et al., 2023)

  • Extensibility: Adding new fields, representations, or symmetry groups requires only modification of the model file. AutoEFT has been used for the Standard Model Effective Field Theory (SMEFT), SMEFT plus gravity (GRSMEFT), Minimal Flavor Violation (MFV) extensions, QED, QCD, and numerous BSM cases.
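
When an operator contains identical fields, the number of independent flavor components depends on the permutation symmetry of its generation indices. The sketch below counts components for the two simplest cases (fully symmetric and fully antisymmetric) by picking one representative per permutation orbit. It is illustrative only: `independent_components` is a hypothetical helper, and mixed symmetries, which require the full $S_n$ decomposition, are not covered.

```python
from itertools import product


def independent_components(n_generations, n_identical, symmetry="symmetric"):
    """Count independent entries of a flavor tensor that is fully symmetric
    or fully antisymmetric in n_identical generation indices."""
    representatives = set()
    for idx in product(range(n_generations), repeat=n_identical):
        if symmetry == "antisymmetric" and len(set(idx)) < n_identical:
            continue  # entries with a repeated index vanish
        representatives.add(tuple(sorted(idx)))  # one per permutation orbit
    return len(representatives)


print(independent_components(3, 2, "symmetric"))      # 6
print(independent_components(3, 2, "antisymmetric"))  # 3
```

For the Weinberg operator with three generations, the coefficient is symmetric in the two lepton generation indices $w$ and $x$, which gives six independent complex components.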

4. Algorithmic Performance and Scaling

  • Scalability: Operator enumeration grows exponentially with mass dimension $d$. SMEFT at $d=10$ yields $\mathcal{O}(10^4)$ operators, and $d=12$ approaches $\mathcal{O}(10^5)$. The practical limit is set by computer memory and disk rather than algorithmic overhead.
  • Computation Time: Bases up to $d=10$ are computed within hours on a multi-core CPU; $d=12$ can require weeks to months. Efficient on-the-fly pruning of EoM, IBP, and Fierz redundancies prevents the combinatorial explosion characteristic of "pre-basis $\to$ reduction" methods.
  • Limits: Generation is limited to on-shell operators; evanescent and gauge-variant counterterms are not included. For $n>9$ identical fields, explicit $S_n$ generators must be provided. Each operator must involve at least three fields (Harlander et al., 2023).

| Mass dimension $d$ | Operators (SMEFT, $n=3$, est.) | Compute time |
|---|---|---|
| 6 | $\mathcal{O}(10^3)$ | Minutes (desktop) |
| 10 | $\mathcal{O}(10^4)$ | Hours to a day (multicore) |
| 12 | $\mathcal{O}(10^5)$ | Weeks to months (high memory) |
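
To make the exponential growth concrete, the order-of-magnitude counts quoted above can be turned into an implied growth factor per unit of mass dimension. This is simple arithmetic on the estimates in this section, not a measurement; the helper name `growth_per_unit_dimension` is illustrative.

```python
# Order-of-magnitude operator counts quoted above (SMEFT estimates).
counts = {6: 1e3, 10: 1e4, 12: 1e5}


def growth_per_unit_dimension(d_low, d_high, table=counts):
    """Geometric-mean growth factor per unit of mass dimension."""
    return (table[d_high] / table[d_low]) ** (1.0 / (d_high - d_low))


for d_low, d_high in [(6, 10), (10, 12), (6, 12)]:
    factor = growth_per_unit_dimension(d_low, d_high)
    print(f"d={d_low} -> d={d_high}: x{factor:.2f} per unit of d")
```

On these estimates the operator count roughly doubles with each unit of mass dimension between $d=6$ and $d=12$, with faster growth at the upper end.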

5. Integration with Machine Learning: Transformer-Based AutoEFT

Recent work has demonstrated the use of LLMs based on transformers to automate Lagrangian generation given arbitrary field content, using tokenized object representations that encode spin, group representations, and charge assignments (Koay et al., 16 Jan 2025). In this context, AutoEFT serves as the redundancy-removal "oracle" in a hybrid pipeline:

  • Data Pipeline: Autogenerate EFT interaction terms up to a fixed operator dimension using classical AutoEFT, translating field content to token sequences.
  • Transformer Model: BART-style encoder/decoder (12 layers each, 16 heads, hidden dim 1024) trained on tens of $10^5$ generated Lagrangians, achieving $>90\%$ sequence-level accuracy for up to six fields.
  • Embedding Analysis: Input embedding vectors cluster according to spin, gauge representation, and charge. Conjugation is internally encoded as a vector direction in embedding space, indicating the model internalizes crucial group-theoretical invariants.
  • Hybrid Pipeline: User-provided field content is tokenized, and the trained model outputs candidate invariant terms, which are then algebraically processed by AutoEFT to remove redundant operators. This supports generalization and rapid basis construction, with LaTeX/Wolfram/Sympy exports (Koay et al., 16 Jan 2025).
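
The tokenization step can be pictured with a short sketch that flattens each field's quantum numbers into a token sequence. The token vocabulary, the `Field` container, and both functions are hypothetical and chosen only for illustration; they are not the encoding used by Koay et al. The hypercharges follow the common convention $Y(Q_L)=1/6$, $Y(H)=1/2$.

```python
from fractions import Fraction
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    spin: Fraction          # 0, 1/2, 1, ...
    su3_dim: int            # dimension of the SU(3) representation
    su2_dim: int            # dimension of the SU(2) representation
    hypercharge: Fraction
    conjugate: bool = False


def tokenize(field):
    """Flatten one field into a list of tokens (hypothetical vocabulary)."""
    tokens = ["FIELD", f"SPIN_{field.spin}", f"SU3_{field.su3_dim}",
              f"SU2_{field.su2_dim}", f"U1_{field.hypercharge}"]
    if field.conjugate:
        tokens.append("DAGGER")
    return tokens


def tokenize_content(fields):
    """Concatenate per-field tokens between start and end markers."""
    sequence = ["<s>"]
    for field in fields:
        sequence.extend(tokenize(field))
    sequence.append("</s>")
    return sequence


higgs = Field("H", Fraction(0), 1, 2, Fraction(1, 2))
quark = Field("QL", Fraction(1, 2), 3, 2, Fraction(1, 6))
print(tokenize_content([quark, higgs]))
```

Keeping spin, representation, and charge as separate tokens is one way an embedding could come to cluster along those attributes; marking conjugation with a dedicated token is likewise an assumption of this sketch.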

6. Applications, Limitations, and Directions

  • Applications: AutoEFT is deployed for operator basis construction in SMEFT, BSM scenarios, gravity extensions, and flavor theories. Canonical operator bases (e.g., Warsaw, SILH) can be selected post-generation. Algorithmic stratification enables calculations relevant to LHC, flavor factories, and gravitational effective field theory.
  • Limitations: The operator count and memory scale exponentially with operator mass dimension and number of flavors. Only on-shell, gauge-invariant operators with $n \geq 3$ fields are generated; evanescent operators and off-shell counterterms are excluded. Custom group extensions (e.g., SO($N$), Sp($N$)) require additional coding.
  • Potential Extensions: Planned developments include support for alternative symmetry groups, direct interface to one-loop matching algorithms, inclusion of spurion insertions for higher-order MFV, and translation between operator bases. Extension of the framework to generate evanescent and gauge-variant structures for full renormalization is an open target (Harlander et al., 2023).

7. Summary

AutoEFT provides a unified, redundancy-avoiding, on-shell operator generation framework for effective field theories. By integrating algebraic and group-theoretical logic in both classical (symbolic) and machine learning (transformer) architectures, it automates a previously labor-intensive workflow fundamental to BSM, flavor, and gravitational model building. Operator bases for the SMEFT and its extensions, including gravity, are routinely constructed to high mass dimension in hours to weeks on modern hardware. Its open-source implementation and extensibility to new models and symmetries make AutoEFT a standard computational tool for the EFT physics community (Schaaf, 2023, Harlander et al., 2023, Koay et al., 16 Jan 2025).
