Group-Equivariant Convolutions

Updated 24 June 2026

Group-equivariant convolutions are convolutional operations that enforce equivariance to transformations such as rotations, reflections, and translations, so that transforming the input transforms the resulting features in a predictable way.
They use filter parameterization and weight sharing across group elements to reduce sample complexity without inflating the number of parameters.
These methods yield improved data efficiency and accuracy, as demonstrated in tasks such as rotated MNIST, CIFAR-10, and various scientific imaging applications.

Group-equivariant convolutions generalize classical convolutional neural networks (CNNs) by building exact equivariance to user-specified transformation groups—such as rotations, reflections, translations, permutations, and scalings—into the architecture at the layer level. Rather than restricting symmetry to translations (as in ordinary CNNs), group-equivariant CNNs (G-CNNs) use convolution-like operations indexed by elements of a symmetry group. This structural modification enables networks to exploit global or local invariance under group actions, reduces sample complexity, and increases expressive power without parameter inflation. The mathematical foundation, implementation frameworks, and performance characteristics of group-equivariant convolutions have been extensively studied since their introduction by Cohen and Welling (Cohen et al., 2016).

1. Mathematical Formalism and Equivariance Property

Let $G$ denote a (finite or Lie) group acting transitively on a domain $X$ (for instance, planar translations, rotations by multiples of $90^\circ$, or the Euclidean group $E(2)$ for images). In standard CNNs, feature maps are functions $f: \mathbb{Z}^2 \to \mathbb{R}^c$, processed by translation-equivariant convolution. In G-CNNs, feature maps are instead functions on the group, $f: G \to \mathbb{R}^c$, where $c$ is the channel count.

The canonical group convolution (for a feature map $f: G \to \mathbb{R}^c$ and a filter $\psi: G \to \mathbb{R}^c$) is defined by

$(f \star \psi)(g) = \sum_{h \in G} \langle f(h), \psi(g^{-1}h) \rangle.$

This operation commutes with the left-regular action $(L_u f)(g) = f(u^{-1}g)$: $L_u(f \star \psi) = (L_u f) \star \psi$. A layer $\Phi$ is called $G$-equivariant if it satisfies $\Phi(L_u f) = L_u \Phi(f)$ for all $u \in G$. This equivariance guarantees that group transformations of an input propagate consistently through the network, preserving the symmetry structure at every layer (Cohen et al., 2016, Cohen et al., 2018, Aronsson, 2021).
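The equivariance property can be checked numerically on a small finite group. The following is a minimal sketch, assuming $G = S_3$ represented as permutation tuples with random feature maps and filters; the helper names (`mul`, `inv`, `group_conv`, `left_translate`) are illustrative and not taken from any cited implementation. It evaluates the sum $(f \star \psi)(g) = \sum_h \langle f(h), \psi(g^{-1}h) \rangle$ directly and compares $L_u(f \star \psi)$ with $(L_u f) \star \psi$ for every $u$, where $(L_u f)(g) = f(u^{-1}g)$.

```python
import itertools
import numpy as np

# Elements of S3 as permutation tuples (an assumed toy group for illustration).
elems = list(itertools.permutations(range(3)))
index = {g: i for i, g in enumerate(elems)}
n = len(elems)

def mul(a, b):
    """Group product as composition: (a * b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    out = [0] * 3
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

def group_conv(f, psi):
    """(f * psi)(g) = sum_h <f(h), psi(g^{-1} h)>; f and psi have shape (|G|, c)."""
    out = np.zeros(n)
    for g in elems:
        g_inv = inv(g)
        for h in elems:
            out[index[g]] += f[index[h]] @ psi[index[mul(g_inv, h)]]
    return out

def left_translate(f, u):
    """(L_u f)(g) = f(u^{-1} g), applied along the group axis."""
    u_inv = inv(u)
    return np.array([f[index[mul(u_inv, g)]] for g in elems])

rng = np.random.default_rng(0)
c = 3  # channel count
f = rng.normal(size=(n, c))
psi = rng.normal(size=(n, c))

# Compare L_u (f * psi) with (L_u f) * psi for every u in G.
equivariant = all(
    np.allclose(left_translate(group_conv(f, psi), u),
                group_conv(left_translate(f, u), psi))
    for u in elems
)
print(equivariant)
```

The check holds for any choice of `f` and `psi`, because re-indexing the sum by $h \mapsto uh$ leaves it unchanged. The brute-force double loop costs $O(|G|^2)$ and is intended only for illustration.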

In group-equivariant architectures, the filter bank $\psi$ is defined on $G$, not just on a local region of the input space. For practical groups like the plane symmetry groups $p4$ (translations and $90^\circ$ rotations) and $p4m$ (plus reflections), the number of rotations and reflections $|H|$ is small ($4$ or $8$), so filters are extended by group action from a canonical template, significantly increasing weight sharing.

A key implementation strategy is to learn a base filter $\psi$ on a reference domain, and generate all $|H|$ variants via group action:

For each $h$ in the stabilizer subgroup $H$ (e.g., the $4$ rotations of $p4$), the transformed filter $L_h \psi$ is precomputed.
The transformed templates are stacked as output channels and convolved using standard (planar) routines.
Subsequent layers operate on feature maps living on $G$, i.e., at each group element there is a full set of channels (Cohen et al., 2016).
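The filter-transformation steps above can be sketched for the first (lifting) layer of a $p4$ network. This is an illustrative NumPy sketch rather than the reference implementation: the function names `correlate_valid` and `p4_lifting_conv` are hypothetical, a single input channel and a single base filter are assumed, and the planar routine is a plain "valid" cross-correlation. The final check confirms that rotating the input by $90^\circ$ rotates every output plane and cyclically shifts the rotation axis.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid(image, kernel):
    """Planar 'valid' cross-correlation of a 2D image with a 2D kernel."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def p4_lifting_conv(image, base_filter):
    """Correlate with the four 90-degree rotations of one base filter.

    Returns an array of shape (4, H', W'): one output plane per rotation.
    """
    return np.stack(
        [correlate_valid(image, np.rot90(base_filter, r)) for r in range(4)]
    )

rng = np.random.default_rng(0)
image = rng.normal(size=(9, 9))
base_filter = rng.normal(size=(3, 3))

out = p4_lifting_conv(image, base_filter)
out_rot = p4_lifting_conv(np.rot90(image), base_filter)

# Rotating the input rotates each plane spatially and shifts the rotation
# index by one: out_rot[r] equals rot90(out[r - 1]).
expected = np.roll(np.rot90(out, axes=(1, 2)), shift=1, axis=0)
lifting_is_equivariant = np.allclose(out_rot, expected)
print(out.shape, lifting_is_equivariant)
```

Only the nine weights of `base_filter` are learnable here; the four rotated copies are derived from it, which is the weight sharing described above.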

The resulting parameter efficiency is substantial: at fixed parameter count, the expressive capacity (i.e., diversity of filters seen at different transformations) increases linearly with the number of transformations $|H|$ (Cohen et al., 2016).

3. Steerable Kernels and Bi-Equivariance Constraints

Beyond regular group convolutions, steerable G-CNNs (for general homogeneous spaces $G/H$) incorporate non-scalar features referred to as "fields" or vector bundles, transforming under representations $\rho$ of a stabilizer subgroup $H$. Kernels $\kappa: G \to \mathrm{Hom}(V_{\text{in}}, V_{\text{out}})$ then satisfy a bi-equivariance constraint: $\kappa(h_2 g h_1) = \rho_{\text{out}}(h_2)\,\kappa(g)\,\rho_{\text{in}}(h_1)$ for all $g \in G$, $h_1 \in H_{\text{in}}$, $h_2 \in H_{\text{out}}$, where the subscripts denote the input and output feature types (Cohen et al., 2018). The parameterization of such kernels is organized via induced representation theory (Mackey theory). For compact groups, harmonic analysis and Clebsch-Gordan decompositions allow the entire constraint to be solved explicitly, as in the Wigner-Eckart theorem (Lang et al., 2020).

Examples include:

$E(2)$-equivariant steerable CNNs with kernels expanded in analytic radial profiles and angular harmonics, satisfying $\kappa(hx) = \rho_{\text{out}}(h)\,\kappa(x)\,\rho_{\text{in}}(h)^{-1}$ for any $h \in H \leq O(2)$ (Weiler et al., 2019).
Spherical and 3D steerable CNNs using spherical harmonics or tensor fields (Lang et al., 2020, Cohen et al., 2018).
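A planar steerability constraint of the form $\kappa(hx) = \rho_{\text{out}}(h)\,\kappa(x)\,\rho_{\text{in}}(h)^{-1}$ can be verified numerically for a simple hand-built kernel. The sketch below assumes a scalar-to-vector kernel with a Gaussian radial profile, $\kappa(x) = e^{-|x|^2/2\sigma^2}\,x$, the trivial representation on the input, and the standard $2 \times 2$ rotation representation on the output. These choices and all function names are illustrative assumptions, not the parameterization used in the cited works.

```python
import numpy as np

def rotation(theta):
    """Standard 2x2 rotation matrix, used as the group element and as rho_out."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rho_in(theta):
    """Trivial representation: scalar input fields do not mix under rotation."""
    return np.eye(1)

def rho_out(theta):
    """Vector output fields rotate with the standard representation."""
    return rotation(theta)

def kappa(x, sigma=1.0):
    """Scalar-to-vector kernel of shape (2, 1): radial profile times position."""
    radial = np.exp(-np.dot(x, x) / (2 * sigma**2))
    return (radial * x).reshape(2, 1)

rng = np.random.default_rng(0)
constraint_holds = True
for _ in range(100):
    x = rng.normal(size=2)
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lhs = kappa(rotation(theta) @ x)
    rhs = rho_out(theta) @ kappa(x) @ np.linalg.inv(rho_in(theta))
    constraint_holds = constraint_holds and np.allclose(lhs, rhs)

print(constraint_holds)
```

The constraint holds here because the radial profile depends only on $|x|$, which rotations preserve, while the angular part $x$ itself transforms as a vector. A generic unconstrained kernel would fail the same check.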

4. Architectural Variations and Unified Frameworks

Group-equivariant convolutions admit several architectural generalizations:

Continuous Groups and Lie Groups: Convolutions over compact or non-compact Lie groups (e.g., $SE(2)$, $SIM(2)$ for scale-rotation, $SE(3)$ for 3D data) are realized via integration using the Haar measure, with filter parameterization via Lie algebra coordinates and MLPs (Qiao et al., 2023).
PDE-based G-CNNs: Layers are treated as steps of a G-invariant partial differential equation on a homogeneous space, combining linear group convolution and nonlinear morphological convolution (dilation/erosion), enabling equivariance to larger symmetry groups and avoiding explicit non-linearities such as ReLU/max-pool (Smets et al., 2020, Diop et al., 10 Feb 2026).
Fourier Domain Equivariance: Frequency-domain group convolutional layers operate on the group Fourier transform, extending global symmetry to spectral neural operators with improved generalization in PDE tasks (Helwig et al., 2023).
Attention Mechanisms: Attentive group-equivariant convolutions implement symmetry-compatible attention by enforcing the joint equivariance of convolution and weighting coefficients (Romero et al., 2020).
Lie Groupoid/Lie Algebroid Equivariance: Category-theoretic extensions realize equivariant neural networks for groupoids and their infinitesimal analogues, connecting convolution to natural transformations in a categorical setting (Astwood, 1 Jun 2026).
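The frequency-domain idea in the list above can be illustrated in its simplest setting. The sketch below assumes the cyclic group $C_n$ with single-channel real signals, where the group Fourier transform reduces to the ordinary discrete Fourier transform; it is not the architecture of the cited spectral operators. The group correlation $(f \star \psi)(g) = \sum_h f(h)\,\psi(g^{-1}h)$, written additively as $\sum_h f(h)\,\psi(h - g)$, becomes a pointwise product of the spectrum of $f$ with the complex conjugate of the spectrum of $\psi$.

```python
import numpy as np

n = 8  # order of the cyclic group C_n (an arbitrary choice for the demo)
rng = np.random.default_rng(0)
f = rng.normal(size=n)
psi = rng.normal(size=n)

# Direct evaluation: g^{-1} h corresponds to (h - g) mod n in C_n.
direct = np.array(
    [sum(f[h] * psi[(h - g) % n] for h in range(n)) for g in range(n)]
)

# Spectral evaluation: pointwise product with the conjugate filter spectrum
# (the conjugate appears because psi is real and the operation is a correlation).
spectral = np.fft.ifft(np.fft.fft(f) * np.conj(np.fft.fft(psi))).real

print(np.allclose(direct, spectral))
```

For non-abelian groups the scalar Fourier coefficients are replaced by matrix-valued ones indexed by irreducible representations, which this abelian example does not cover.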

A general theory classifies all equivariant linear maps between suitable feature field spaces as (generalized) group convolutions with kernels constrained by representation theory (Cohen et al., 2018, Aronsson, 2021).

5. Parameter and Computational Efficiency

Group-convolutions achieve increased expressive power per parameter via explicit weight sharing:

Each G-convolutional filter template is shared across $|H|$ transformations.
In practical $p4$/$p4m$ settings, parameter count is kept constant by reducing the number of channels per group element so that total memory and compute are on par with standard CNNs (Cohen et al., 2016).
Filter redundancy can be further exploited: learned filters are often highly correlated along the group axis, enabling depthwise-separable decompositions (factorizing group and spatial kernels) for further parameter and compute reduction at no loss in equivariance (Lengyel et al., 2021).
Inhomogeneous or hybrid architectures may combine G-convolutions and standard convolutions to trade off expressivity vs. imposed symmetry (Helwig et al., 2023).
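The channel-reduction bookkeeping described above can be made concrete with a small calculation. The sketch assumes that a hidden-layer group-convolution filter carries one $k \times k$ plane per input channel and per group element, so its parameter count scales with $|H|$, and that channels are therefore scaled by roughly $1/\sqrt{|H|}$ to match a planar baseline. The function names and the 64-channel baseline are illustrative assumptions.

```python
import math

def planar_conv_params(c_in, c_out, k):
    """Weights in a standard planar convolution layer (bias ignored)."""
    return c_in * c_out * k * k

def group_conv_params(c_in, c_out, k, group_order):
    """Weights in a hidden group-convolution layer (bias ignored).

    Assumes each filter has one k x k plane per input channel and per
    group element; the transformed copies are derived, not learned.
    """
    return c_in * c_out * group_order * k * k

def matched_channels(c, group_order):
    """Channels per group element giving roughly the planar parameter count."""
    return round(c / math.sqrt(group_order))

baseline = planar_conv_params(64, 64, 3)
for name, order in [("p4", 4), ("p4m", 8)]:
    c = matched_channels(64, order)
    print(name, c, group_conv_params(c, c, 3, order), "vs", baseline)
```

Under these assumptions the $p4$ layer with 32 channels matches the 36864 weights of the 64-channel planar layer exactly, while the $p4m$ layer with 23 channels has 38088 weights, a close but not exact match because $64/\sqrt{8}$ is not an integer.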

6. Empirical Results, Expressivity, and Practical Impact

Group-equivariant convolutions consistently demonstrate improved accuracy and data efficiency, particularly when the target task exhibits symmetry:

Rotated MNIST: $p4$-CNN achieves 2.28% error (vs 5.03% for $\mathbb{Z}^2$-CNN), with further gains from attentive mechanisms or steerable variants (Cohen et al., 2016, Romero et al., 2020).
CIFAR-10: p4m-ResNet44 achieves 4.94% error (vs 5.61% for translation-only baseline) at fixed parameter budgets (Cohen et al., 2016).
PDE and medical imaging: PDE-G-CNNs and SIM(2)-CNNs achieve equal or superior accuracy with an order of magnitude fewer parameters than classical CNNs (Smets et al., 2020, Qiao et al., 2023).
Generative modeling: Group-equivariant GANs yield lower FID and faster convergence, especially in small-data or high-symmetry regimes (Dey et al., 2020).
Koopman operator learning: Explicit group-convolutional structure reduces data and compute requirements for dynamic mode decomposition (Harder et al., 2024).

The improvement is most pronounced in regimes with (i) built-in transformation symmetry, (ii) limited data, and (iii) strict parameter/memory constraints. The main limitation is that equivariant models cannot represent symmetry-breaking tasks as efficiently as standard CNNs.

7. Applications, Limitations, and Future Directions

Applications broadly span rotated and omnidirectional vision, point cloud and molecular geometry processing, PDE surrogate modeling, generative modeling, and dynamical systems identification (Basheer et al., 2024, Cohen et al., 2016, Qiao et al., 2023). Notable limitations include:

High computational cost of steerable/continuous-group layers (especially Clebsch-Gordan expansions in 3D or for higher-order fields).
Discretization artifacts for continuous groups.
Inflexibility or reduced expressivity if the data distribution only approximately exhibits the target symmetry.
Incomplete coverage for non-group symmetries (e.g., diffeomorphisms, category or groupoid actions), for which recent category-equivariant neural networks extend the framework (Astwood, 1 Jun 2026).

Anticipated directions include implicit parametrizations for steerable kernels, neural differential equations on manifolds/Lie groups, and unified category-theoretic generalizations. The continual expansion of group-equivariant convolutional strategies promises more robust, data-efficient, and interpretable neural architectures across geometric and scientific learning domains (Basheer et al., 2024).