Equivariant MLPs: Principles & Applications

Updated 13 May 2026

Equivariant MLPs are neural architectures that enforce symmetry by ensuring f(g · x) = g · f(x) for all group elements.
They use representation theory and nullspace methods to compute bases of equivariant weight matrices and, with suitable hidden representations, are universal approximators of equivariant functions.
These models are applied in vision, molecular modeling, and graph networks, where the built-in symmetry offers computational efficiency and better generalization from limited data.

Equivariant Multi-Layer Perceptrons (MLPs) are neural architectures in which each layer, including linear maps and nonlinear activations, is constructed to respect equivariance with respect to a specified group action. Given a group $G$ with action on both the input and output spaces, an MLP is $G$-equivariant if, for all $g \in G$ and $x$ in the input space, $f(g \cdot x) = g \cdot f(x)$. Such architectures are central in domains where symmetries (spatial, combinatorial, or algebraic) constrain or structure the problem, including vision, physical sciences, permutation-invariant or equivariant structures, and molecular modeling. Equivariant MLPs encompass and generalize convolutional neural networks, steerable networks, DeepSets, graph neural networks, and certain tensor field networks.

1. Mathematical Definition and Equivariance Criteria

Let $G$ be a finite or Lie group, and let the input and output spaces, $\mathbb{V}$ and $\mathbb{W}$, carry linear representations $D_{\mathrm{in}}: G \to \mathrm{GL}(n)$ and $D_{\mathrm{out}}: G \to \mathrm{GL}(m)$. A function $f:\mathbb{V}\rightarrow\mathbb{W}$ is $G$-equivariant if

$$f(D_{\mathrm{in}}(g)\,x) = D_{\mathrm{out}}(g)\,f(x) \quad \text{for all } g \in G,\ x \in \mathbb{V}.$$

A linear map $W: \mathbb{V} \to \mathbb{W}$ is equivariant if $W D_{\mathrm{in}}(g) = D_{\mathrm{out}}(g)\,W$ for all $g \in G$ (the intertwining operator condition). Nonlinearities $\sigma$ must further satisfy $\sigma(D(g)\,x) = D(g)\,\sigma(x)$ for all $g \in G$, where $D$ denotes the representation carried by the layer's features. In practice, for permutation groups, coordinatewise activations (e.g., ReLU) and biases constant on orbits suffice; for more general groups, biases must inhabit the trivial isotypical component and nonlinearities may require more sophisticated equivariant design (Lim et al., 2022).
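As a minimal sketch of these criteria for a permutation group, the snippet below checks the intertwining condition and full layer equivariance numerically. The two-parameter weight, the values of `a` and `b`, and the helper name `layer` are illustrative assumptions, not taken from any cited implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Illustrative permutation-equivariant weight: W = a*I + b*(all-ones matrix)
a, b = 0.7, -0.3
W = a * np.eye(n) + b * np.ones((n, n))
bias = 0.1 * np.ones(n)  # constant on the single orbit of coordinates

def layer(x):
    """Equivariant linear map, orbit-constant bias, coordinatewise ReLU."""
    return np.maximum(W @ x + bias, 0.0)

# A random permutation matrix P plays the role of D(g)
P = np.eye(n)[rng.permutation(n)]
x = rng.normal(size=n)

intertwines = np.allclose(W @ P, P @ W)                 # W D(g) = D(g) W
equivariant = np.allclose(layer(P @ x), P @ layer(x))   # f(g x) = g f(x)
print(intertwines, equivariant)
```

Replacing `W` with a generic random matrix would be expected to break both checks, which is a quick way to see that the constraint is doing real work.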

2. Universality and Representation Theory

The universality of equivariant MLPs is guaranteed as follows: let $G$ act regularly or diagonally on a hidden layer. Then, for any continuous $G$-equivariant map $f: K \to \mathbb{W}$ (for compact $K \subset \mathbb{V}$) and $\epsilon > 0$, there exists a two-layer $G$-equivariant MLP $\hat{f}$ approximating $f$ uniformly on $K$ to within $\epsilon$,

$$\hat{f}(x) = W_2\,\sigma(W_1 x + b),$$

with hidden-layer constraints $W_1 D_{\mathrm{in}}(g) = D_h(g)\,W_1$, $W_2 D_h(g) = D_{\mathrm{out}}(g)\,W_2$, and $D_h(g)\,b = b$ for all $g \in G$. Here, $D_h$ is the regular or product representation on the hidden layer, and universality follows from Reynolds symmetrization and explicit basis construction. For Abelian $G$, e.g., cyclic groups, a regular hidden layer (each channel of width $|G|$) suffices; for non-Abelian $G$, universality is achievable with a higher-order product representation, whose order is bounded in terms of the $G$-set on which the group acts (Ravanbakhsh, 2020).
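Reynolds symmetrization can be illustrated on weight matrices: averaging an arbitrary matrix over the group yields one that commutes with every group element. The sketch below does this for the cyclic group acting by shifts. It is an illustration under that assumption rather than the construction used in the cited proof, and the function name `reynolds` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Regular representation of the cyclic group Z_n: powers of the shift matrix S
S = np.roll(np.eye(n), 1, axis=0)
group = [np.linalg.matrix_power(S, k) for k in range(n)]

def reynolds(W):
    """Average g^{-1} W g over the group (g^{-1} = g.T for permutation matrices)."""
    return sum(g.T @ W @ g for g in group) / len(group)

W_raw = rng.normal(size=(n, n))
W_eq = reynolds(W_raw)

# The symmetrized matrix commutes with every group element
commutes = all(np.allclose(W_eq @ g, g @ W_eq) for g in group)
# Commuting with all cyclic shifts forces a circulant structure:
# each row is the previous row rolled by one position
is_circulant = all(
    np.allclose(np.roll(W_eq[i], 1), W_eq[(i + 1) % n]) for i in range(n)
)
# Symmetrizing an already equivariant matrix leaves it unchanged
idempotent = np.allclose(reynolds(W_eq), W_eq)
print(commutes, is_circulant, idempotent)
```

The averaging operator is a projection onto the equivariant subspace, which is why applying it twice changes nothing.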

3. Construction Methods and Efficient Algorithms

For arbitrary finite or matrix groups, equivariant MLPs are constructed by reducing the intertwining constraints to a finite set of linear equations involving representation matrices for group generators (for discrete groups) or Lie algebra generators (for Lie groups). The solution space, spanned by a basis of equivariant weight matrices, is computed efficiently using nullspace methods (Krylov-style, block decomposition by irreducible components). Each equivariant layer is then parameterized as a linear combination of basis elements. The approach applies to any group for which representations are supplied, including O(n), SO(n), Sp(n), SU(n), the Lorentz group, and combinatorial groups like the Rubik’s cube group (Finzi et al., 2021, Pearce-Crump, 2023).
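For a connected Lie group, the constraint reduces to commuting with the Lie algebra generators. The sketch below assumes SO(3) acting on vectors in its standard three-dimensional representation and uses a dense SVD rather than the Krylov-style solvers mentioned above. The expected outcome is a one-dimensional solution space spanned by the identity.

```python
import numpy as np

# Generators of the Lie algebra so(3) in the vector representation
Lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
Ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
Lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

I3 = np.eye(3)
# Row-major vec identity: vec(A W B) = kron(A, B.T) vec(W),
# so A W - W A = 0 becomes (kron(A, I) - kron(I, A.T)) vec(W) = 0
C = np.vstack([np.kron(A, I3) - np.kron(I3, A.T) for A in (Lx, Ly, Lz)])

_, s, Vt = np.linalg.svd(C)
rank = int(np.sum(s > 1e-10))
null = Vt[rank:]                 # rows span the nullspace of C
W = null[0].reshape(3, 3)        # the single equivariant basis matrix

print(null.shape[0])                          # number of basis elements
print(np.allclose(W, W[0, 0] * np.eye(3)))    # proportional to the identity
```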

A summary algorithm:

Encode equivariance constraints for all generators into a linear system.
Compute a basis for the nullspace.
Parameterize the equivariant layer over this basis.

This confers major computational advantages. For example, for O(n)-equivariant maps between order-$k$ and order-$l$ tensor spaces, the basis is combinatorially small (indexed by Brauer diagrams), and fast multiplication schemes (factorized over permutations and Kronecker structures) substantially reduce matrix-vector multiplication cost relative to the naive dense product, which costs $O(n^{k+l})$ (Pearce-Crump, 2023).
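The three steps of the summary algorithm can be sketched for a small finite group. This example assumes the symmetric group $S_4$ acting on $\mathbb{R}^4$ by permutation matrices and solves the system with a dense SVD. The function name `equivariant_basis` and the tolerance are illustrative choices, and the dense solve stands in for the more scalable nullspace methods described above.

```python
import numpy as np

def equivariant_basis(gens_in, gens_out, tol=1e-10):
    """Basis of {W : D_out(g) W = W D_in(g)} over all generators, via SVD."""
    n, m = gens_in[0].shape[0], gens_out[0].shape[0]
    # Step 1: encode the constraints with the row-major vec identity
    # vec(A W B) = kron(A, B.T) vec(W)
    C = np.vstack([
        np.kron(Dout, np.eye(n)) - np.kron(np.eye(m), Din.T)
        for Din, Dout in zip(gens_in, gens_out)
    ])
    # Step 2: nullspace = right singular vectors with zero singular value
    _, s, Vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    return Vt[rank:].reshape(-1, m, n)

n = 4
cycle = np.roll(np.eye(n), 1, axis=0)   # the n-cycle
swap = np.eye(n)[[1, 0, 2, 3]]          # an adjacent transposition
gens = [cycle, swap]                    # together these generate S_4

basis = equivariant_basis(gens, gens)
print(basis.shape)  # (2, 4, 4)

# Step 3: parameterize the layer as a linear combination of basis elements
rng = np.random.default_rng(0)
theta = rng.normal(size=len(basis))
W = np.tensordot(theta, basis, axes=1)

# Sanity check on a group element that is not one of the generators
P = np.eye(n)[rng.permutation(n)]
print(np.allclose(W @ P, P @ W))
```

Only `theta` would be trained in such a layer; the basis is computed once per pair of representations and then reused.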

4. Nonlinearities, Layer Composition, and Expressivity

Pointwise nonlinearities such as ReLU or sigmoid are equivariant when applied coordinatewise, provided the group acts by permuting coordinates, as with permutation groups, discrete translations, and regular representations of finite groups. For representations that mix coordinates, as is typical for non-Abelian or continuous groups, scalar- or norm-based nonlinearities, as well as tensor product and gating mechanisms, are required. In E(3)-equivariant architectures, spherical grid projections combined with standard MLPs per grid point, followed by projection back to irreps, yield efficient, exactly equivariant MLP-style nonlinearities. This avoids the complexity of full tensor product nonlinearities, enabling state-of-the-art molecular models (e.g., Facet). In such designs, normalization and bias addition are handled separately per irrep type to ensure equivariance (Miklaucic et al., 10 Sep 2025).
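The difference between coordinatewise and norm-based nonlinearities can be checked numerically for rotations of 3D vectors. The sketch below is a generic norm-gated activation, not the grid-based scheme of the cited work, and the helper names `random_rotation` and `norm_gate` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
    Q = Q * np.sign(np.diag(R))   # fix the sign ambiguity of each column
    if np.linalg.det(Q) < 0:      # force determinant +1
        Q[:, 0] = -Q[:, 0]
    return Q

def norm_gate(v):
    """Scale each vector by a sigmoid of its (rotation-invariant) norm."""
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / (1.0 + np.exp(-norms))

rot = random_rotation(rng)
v = rng.normal(size=(8, 3))   # 8 vector-valued channels, one vector per row

# Rotating then gating equals gating then rotating
gate_ok = np.allclose(norm_gate(v @ rot.T), norm_gate(v) @ rot.T)
# Coordinatewise ReLU does not commute with a generic rotation
relu_ok = np.allclose(np.maximum(v @ rot.T, 0), np.maximum(v, 0) @ rot.T)
print(gate_ok, relu_ok)
```

The gate works because it multiplies each vector by a scalar that depends only on an invariant, so the direction of the vector is never touched.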

Layer stacking, with alternation of equivariant linear and nonlinear blocks, preserves $G$-equivariance. Feature channels in each layer decompose by tensor order or irrep, and mixing across these blocks (e.g., in tensor-polynomial architectures like G-RepsNet) is implemented with block-diagonal matmuls and invariant-based nonlinear mixing (Basu et al., 2024).

5. Concrete Examples and Applications

Several canonical cases exemplify equivariant MLPs:

Convolutional layers (CNNs): For $G = \mathbb{Z}_n$ (cyclic translation), equivariant MLPs with circulant kernels are universal approximators of translation-equivariant functions.
DeepSets/Graph Networks: For permutation groups $S_n$, equivariant linear maps are constrained to two-parameter forms, $W = \lambda I + \gamma\,\mathbf{1}\mathbf{1}^\top$, corresponding to DeepSets.
Physical sciences: EMLPs for continuous groups such as O(n), SO(n), and E(3) capture symmetries in dynamical systems, Hamiltonian flows, and molecular modeling, delivering invariance properties such as conservation laws (Finzi et al., 2021, Miklaucic et al., 10 Sep 2025).
General matrix groups: For non-trivial representations (Lorentz group, Rubik's cube group), EMLP and G-RepsNet frameworks implement equivariant MLPs at scale, with empirical gains in data efficiency and model accuracy (Basu et al., 2024, Finzi et al., 2021).
Spherical harmonics and tensor fields: In materials modeling, equivariant MLPs are implemented with irreducible-decomposition and grid-based or tensor product nonlinearities for efficient E(3) symmetry enforcement (Miklaucic et al., 10 Sep 2025).
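The convolutional case in the list above can be made concrete: a circulant weight matrix is equivariant to cyclic shifts and acts as a circular convolution. The sketch below assumes a one-dimensional signal of length 8 and a random kernel; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
kernel = rng.normal(size=n)

# Circulant weight matrix: W[i, j] = kernel[(i - j) mod n]
idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
W = kernel[idx]

x = rng.normal(size=n)
shift = 3

# Equivariance: shifting the input shifts the output by the same amount
shift_equivariant = np.allclose(W @ np.roll(x, shift), np.roll(W @ x, shift))

# The same linear map is a circular convolution, computed here via the FFT
conv = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real
matches_conv = np.allclose(W @ x, conv)
print(shift_equivariant, matches_conv)
```

The layer has `n` free parameters instead of `n * n`, which is the weight sharing that the equivariance constraint imposes.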

6. Scalability, Parameter Efficiency, and Practical Considerations

EMLPs (Equivariant Multi-Layer Perceptrons) are universal for continuous $G$-equivariant maps (on compact domains), but solving for the full basis can become computationally infeasible for deep or high-order layers due to basis growth. G-RepsNet addresses this by using a block-diagonal parameterization over tensor polynomials, requiring only a small, fixed number of low tensor orders, and achieves comparable accuracy with substantially improved runtime and memory efficiency (Basu et al., 2024). For large $n$ and low tensor order, methods based on combinatorial diagrammatics (Brauer, partition diagrams) scale very efficiently (Pearce-Crump, 2023).

The per-layer parameter count in G-RepsNet is determined by the block size and base dimension, whereas in EMLP it is determined by the size of the full equivariant basis. Empirically, G-RepsNet and Facet architectures achieve competitive or superior generalization and data efficiency at reduced hardware cost.
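The basis growth discussed above can be made tangible by counting diagrams. The sketch below relies on two standard combinatorial facts, stated here as assumptions: set partitions of $k+l$ points index the $S_n$-equivariant maps between order-$k$ and order-$l$ tensors when $n \ge k+l$, and Brauer diagrams (perfect matchings of $k+l$ points) span the O(n)-equivariant maps and are linearly independent for sufficiently large $n$. The function names are hypothetical.

```python
from math import comb

def bell(m):
    """Number of set partitions of m points (Bell number)."""
    B = [1]
    for i in range(m):
        B.append(sum(comb(i, k) * B[k] for k in range(i + 1)))
    return B[m]

def brauer_count(k, l):
    """Number of perfect matchings of k + l points: (k+l-1)!! if even, else 0."""
    m = k + l
    if m % 2:
        return 0
    out = 1
    for j in range(m - 1, 0, -2):
        out *= j
    return out

def dense_count(n, k, l):
    """Entries of an unconstrained matrix between the two tensor spaces."""
    return n ** (k + l)

n = 10
for k in range(1, 5):
    print(k, bell(2 * k), brauer_count(k, k), dense_count(n, k, k))
```

For `k = l = 1` the partition count is 2, matching the two-parameter DeepSets layer. The counts do not depend on `n`, but they grow quickly with tensor order, which is why high-order layers become expensive to solve.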

Implementation best practices include:

Choosing representations and tensor order to match data and domain symmetry.
Leveraging software libraries (e.g., emlp (Finzi et al., 2021)) for basis computation and layer construction.
Mixing or projecting non-scalar features using invariant polynomials or grid-based nonlinearities, as appropriate for the symmetry.
Exploiting direct sum structure and parallelization for high-dimensional tensor features.

7. Extensions: Projective Equivariance and Advanced Generalizations

In some settings, linear equivariance is too restrictive, and projective equivariance—equivariance up to a scalar factor—is required. Projectively equivariant MLPs are constructed by lifting projective representations to linear representations on a covering group and imposing "twisted" equivariance constraints parameterized by characters. This yields the most general possible linear (projectively) equivariant architectures subject to stacking and nonlinearities that commute with the relevant character twist (Bökman et al., 2022). The overhead from charge-decomposed feature spaces is modest when the character group is small.

A plausible implication is that projectively equivariant MLPs expand expressivity to data with phase, sign, or class-dependent symmetry, subsuming standard equivariant MLPs when the character group is trivial.
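A character-twisted constraint of the kind described above can be explored with the same nullspace approach. The sketch below assumes $S_3$ acting on $\mathbb{R}^3$ by permutations and compares the trivial character with the sign character, solving $W D(g) = \chi(g)\,D(g)\,W$ on generators. It illustrates the general idea only and is not the construction of the cited paper; the function name `twisted_basis` is hypothetical.

```python
import numpy as np

def twisted_basis(gens, chars, tol=1e-10):
    """Basis of {W : W D(g) = chi(g) D(g) W} over all generators g."""
    n = gens[0].shape[0]
    # Row-major vec identity: vec(A W B) = kron(A, B.T) vec(W)
    C = np.vstack([
        np.kron(np.eye(n), D.T) - chi * np.kron(D, np.eye(n))
        for D, chi in zip(gens, chars)
    ])
    _, s, Vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    return Vt[rank:].reshape(-1, n, n)

cycle = np.roll(np.eye(3), 1, axis=0)   # 3-cycle, sign +1
swap = np.eye(3)[[1, 0, 2]]             # transposition, sign -1
gens = [cycle, swap]

trivial = twisted_basis(gens, [1.0, 1.0])    # ordinary equivariance
sign = twisted_basis(gens, [1.0, -1.0])      # equivariance twisted by the sign
print(len(trivial), len(sign))  # 2 1

W = sign[0]
print(np.allclose(W @ swap, -swap @ W))
```

With the trivial character the ordinary two-dimensional equivariant space is recovered, consistent with the remark that standard equivariant MLPs are the special case of a trivial character group.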

References: (Ravanbakhsh, 2020, Lim et al., 2022, Finzi et al., 2021, Basu et al., 2024, Miklaucic et al., 10 Sep 2025, Pearce-Crump, 2023, Bökman et al., 2022)