Cartan Neural Networks
- Cartan Neural Networks are deep learning models that embed non-compact symmetric space geometry and Lie group structure directly into network layers.
- They replace traditional affine transforms with group-homomorphisms and isometric mappings, ensuring equivariant feature propagation and intrinsic nonlinearity.
- They extend to convolutional, harmonic, and quantum settings, demonstrating competitive performance on tasks like image classification and regression.
Cartan Neural Networks (CaNNs) are a modern class of neural architectures that embed group-theoretic and differential-geometric structures—specifically, the geometry of non-compact symmetric spaces—directly into the construction of neural network layers. These frameworks fundamentally generalize both traditional Euclidean and hyperbolic deep learning, providing an intrinsic, equivariant, and globally interpretable alternative for feature propagation, representation learning, and classification, particularly for hierarchical, structured, or symmetry-laden data.
1. Mathematical Foundations: Symmetric Spaces and Solvable Models
The core structural insight behind Cartan Neural Networks is the observation that certain manifold models—namely non-compact Riemannian symmetric spaces $\mathcal{M} = G/H$, with $G$ a real simple non-compact Lie group and $H$ its maximal compact subgroup—admit a dual realization as both homogeneous spaces (coset manifolds) and solvable Lie groups with explicit coordinates and group law (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025, Fré et al., 18 Dec 2025).
Given the Cartan decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ (or, in the classical $KAK$ context, for a compact group), one constructs the so-called Iwasawa or solvable decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathrm{Solv}(G/H)$, with $\mathrm{Solv}(G/H)$ a subalgebra built from Cartan (diagonal) and nilpotent (triangular) parts. Elements on $\mathcal{M} \simeq \exp\big(\mathrm{Solv}(G/H)\big)$ are then coordinatized via $\mathbb{L}(\Upsilon) = \exp\!\big(\Upsilon^I T_I\big)$, where $T_I$ are the solvable generators and $\Upsilon^I$ are "solvable coordinates" (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025, Fré et al., 18 Dec 2025).
The Riemannian metric $ds^2 = \delta_{IJ}\, e^I \otimes e^J$ (where $e^I$ is the left-invariant vielbein on the solvable group) is positive-definite and $G$-invariant, and is unique up to overall scale. The geometry is always Cartan–Hadamard: simply connected, complete, with non-positive sectional curvature (strictly negative in the rank-one, e.g. hyperbolic, case).
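For the rank-one example $\mathrm{SL}(2,\mathbb{R})/\mathrm{SO}(2)$ (the hyperbolic plane), the solvable coordinatization and its inverse can be sketched numerically as below; the generator normalizations and variable names are illustrative choices rather than the conventions of the cited papers.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative solvable (Iwasawa) generators of sl(2, R):
# one Cartan (diagonal) generator and one nilpotent (strictly triangular) generator.
H_gen = np.array([[0.5, 0.0], [0.0, -0.5]])   # Cartan direction
T_gen = np.array([[0.0, 0.0], [1.0, 0.0]])    # nilpotent direction

def solvable_element(rho, x):
    """Group element L(rho, x) = exp(x T) exp(rho H) of the solvable group of SL(2,R)/SO(2)."""
    return expm(x * T_gen) @ expm(rho * H_gen)

def coset_representative(L):
    """Point on the symmetric space, realized as the symmetric matrix M = L L^T (det M = 1)."""
    return L @ L.T

def solvable_coordinates(M):
    """Pull a point M back to solvable coordinates via Cholesky: M = L L^T, L lower-triangular."""
    L = np.linalg.cholesky(M)
    rho = 2.0 * np.log(L[0, 0])
    x = L[1, 0] / L[0, 0]
    return rho, x

# Round trip: coordinates -> group element -> coset point -> coordinates.
rho0, x0 = 0.7, -1.3
M = coset_representative(solvable_element(rho0, x0))
print(np.isclose(np.linalg.det(M), 1.0))                   # unit determinant
print(np.allclose(solvable_coordinates(M), (rho0, x0)))    # True: coordinates recovered
```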
2. Cartan Layer Construction: Group Homomorphisms and Isometries
Each Cartan layer performs intrinsic feature propagation between symmetric space layers. The construction is as follows:
- Extract solvable coordinates: Pull back a hidden state (viewed as a matrix or a point on the layer's symmetric space) into solvable coordinates via Cholesky decomposition or ordered exponentials, as detailed in (Fré et al., 22 Jul 2025).
- Group-homomorphism (linear map): Perform the group-theoretic analogue of an affine transformation, replacing multiplication by a weight matrix with a Lie-algebra homomorphism $\mu_\ell: \mathrm{Solv}_\ell \to \mathrm{Solv}_{\ell+1}$ between the solvable algebras of consecutive layers. In the hyperbolic ($\mathbb{H}^n$) case this reduces to a learnable linear map acting on the solvable coordinates, whose entries are the learnable parameters (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025).
- Exponentiation/map back: Exponentiate the transformed Lie-algebra element to obtain a point in the next layer's solvable group.
- Isometric fiber transformation (bias and "paint" rotation): Apply a metric-preserving diffeomorphism corresponding to an isometry of the symmetric space, typically realized as a right-action by a fixed group element, or, in solvable form, as a group translation combined with a rotation of the "paint" (off-diagonal) directions (Milanesio et al., 30 May 2025).
Mathematically, a Cartan layer is the composition $\Phi_\ell = \mathcal{I}_\ell \circ \exp_{\ell+1} \circ \mu_\ell \circ \log_\ell$ of the solvable logarithm, the Lie-algebra homomorphism $\mu_\ell$, exponentiation into the next layer's solvable group, and the isometric bias $\mathcal{I}_\ell$; this map is equivariant under the isometry group action and operates intrinsically on the layer manifolds (Fré et al., 22 Jul 2025).
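To make the construction concrete, the sketch below implements one such layer for maps between hyperbolic spaces in the upper half-space (solvable-coordinate) model. The specific parameterization (a learnable matrix acting on the nilpotent coordinates, the identity on the Cartan coordinate, and a right-translation bias) is a simplified assumption chosen so that the map is an exact group homomorphism; it is not taken verbatim from the cited papers.

```python
import numpy as np

def group_mul(p, q):
    """Group law of the solvable (Iwasawa) group of H^{n+1} in upper half-space
    coordinates: an element is a pair (x, y) with x in R^n and y > 0."""
    (x1, y1), (x2, y2) = p, q
    return (x1 + y1 * x2, y1 * y2)

def homomorphism(W, p):
    """Toy solvable-group homomorphism H^{n+1} -> H^{m+1}: a linear map W on the
    nilpotent coordinates and the identity on the Cartan coordinate (this choice
    makes the map an exact homomorphism of the two solvable groups)."""
    x, y = p
    return (W @ x, y)

def cartan_layer(W, bias, p):
    """One Cartan layer: group homomorphism followed by an isometric right-translation bias."""
    return group_mul(homomorphism(W, p), bias)

# Example: a layer mapping H^4 -> H^3 (nilpotent coordinates R^3 -> R^2).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))                        # learnable weights
bias = (rng.normal(size=2), np.exp(rng.normal()))  # learnable bias; Cartan coordinate kept positive
p = (rng.normal(size=3), np.exp(rng.normal()))     # input point in solvable coordinates
print(cartan_layer(W, bias, p))
```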
3. Equivariance, Curvature, and Nonlinearity
A fundamental property is that every operation in a Cartan Neural Network is equivariant with respect to the corresponding Lie group action: data transformed by a group element before passing through a layer results in the corresponding group action applied to the output. This guarantees that learned representations respect the full isometry group of the underlying feature space (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025, Fré et al., 18 Dec 2025).
Crucially, the negative curvature and exponential volume growth of the symmetric space introduce nonlinearity intrinsically, obviating the need for conventional pointwise activations: Cartan directions induce multiplicative (exponential) features, while nilpotent directions yield polynomial features. This geometric nonlinearity enables deep expressivity even in the absence of typical neural activations (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025).
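Under the simplified parameterization used in the layer sketch above, the equivariance property can be checked numerically: transforming the input by a group element $g$ and then applying the layer agrees with applying the layer first and then acting by the homomorphic image of $g$. This is a sanity check under those assumptions, not a proof for the general architecture.

```python
import numpy as np

def mul(p, q):
    """Solvable-group law of H^{n+1} in upper half-space coordinates (pairs (x, y), y > 0)."""
    return (p[0] + p[1] * q[0], p[1] * q[1])

def layer(W, b, p):
    """Toy Cartan layer: homomorphism (W on nilpotent part, identity on Cartan part) + right bias."""
    return mul((W @ p[0], p[1]), b)

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 3))
b = (rng.normal(size=2), np.exp(rng.normal()))
g = (rng.normal(size=3), np.exp(rng.normal()))   # group element acting on the input space
p = (rng.normal(size=3), np.exp(rng.normal()))   # input point

lhs = layer(W, b, mul(g, p))                      # transform the input, then apply the layer
mu_g = (W @ g[0], g[1])                           # homomorphic image of g in the output group
rhs = mul(mu_g, layer(W, b, p))                   # apply the layer, then act by mu(g)
print(np.allclose(lhs[0], rhs[0]) and np.isclose(lhs[1], rhs[1]))   # True
```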
4. Extensions: Convolutional, Harmonic, and Quantum Cartan Networks
Cartan neural network methodology extends naturally to convolutional and harmonic settings, as elaborated in the PGTS (Paint Group and Tits–Satake) framework (Fré et al., 22 Aug 2025). Each layer may be modeled as a (possibly higher-rank) symmetric space $G_\ell/H_\ell$, with inter-layer maps realized as Lie algebra homomorphisms and equivariant vector-bundle morphisms. Associated convolutional kernels are constructed using harmonic functions (eigenfunctions of the Laplacian, Selberg and Siegel theta functions), spinor harmonics, and heat-kernel expansions, respecting both the symmetric space structure and possible discretizations by tessellation (e.g., Fuchsian or Coxeter groups acting on the domain).
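As an illustration of the heat-kernel ingredient only, the sketch below builds a diffusion-weighted filter on $\mathbb{H}^3$ in the upper half-space model, where the heat kernel has the closed form $p_t(d) = (4\pi t)^{-3/2}\,\frac{d}{\sinh d}\,e^{-t - d^2/(4t)}$ as a function of geodesic distance $d$. The point sampling and feature aggregation are illustrative assumptions, not the PGTS construction of the cited paper.

```python
import numpy as np

def hyperbolic_distance(p, q):
    """Geodesic distance in the upper half-space model of H^3; points are (x1, x2, y), y > 0."""
    num = np.sum((p[:2] - q[:2]) ** 2) + (p[2] - q[2]) ** 2
    return np.arccosh(1.0 + num / (2.0 * p[2] * q[2]))

def heat_kernel_h3(d, t):
    """Closed-form heat kernel on H^3 as a function of geodesic distance d and diffusion time t."""
    d = np.maximum(d, 1e-12)                       # avoid 0/0 at d = 0 (the limit is finite)
    return (4 * np.pi * t) ** -1.5 * (d / np.sinh(d)) * np.exp(-t - d ** 2 / (4 * t))

def heat_filter(center, points, features, t=0.5):
    """Diffusion-weighted aggregation of neighbor features around a center point
    (a stand-in for one channel of an intrinsic convolution)."""
    w = np.array([heat_kernel_h3(hyperbolic_distance(center, q), t) for q in points])
    w = w / w.sum()                                # normalized kernel weights
    return w @ features

rng = np.random.default_rng(2)
pts = np.column_stack([rng.normal(size=(8, 2)), np.exp(rng.normal(size=8))])  # 8 points with y > 0
feats = rng.normal(size=(8, 4))                                               # 4 channels per point
center = np.array([0.0, 0.0, 1.0])
print(heat_filter(center, pts, feats))
```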
Furthermore, Cartan-layered neural architectures have been integrated with K-P sub-Riemannian control theory for quantum neural networks, where Cartan's KAK decomposition parameterizes geodesics for time-optimal quantum control, and these geodesics are exactly encoded by finite-depth neural ansätze with Cartan layers (Perrier, 2 Apr 2025). The resulting networks (EQNNs) are globally optimal for classes of Lie-group control problems under mild regularity conditions.
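In the single-qubit case the relevant KAK decomposition reduces to the familiar ZYZ Euler factorization $U = R_z(\alpha)\,R_y(\beta)\,R_z(\gamma)$ of an $\mathrm{SU}(2)$ element. The sketch below extracts these angles for a random unitary and verifies the reconstruction; it illustrates only the decomposition itself, not the sub-Riemannian control or EQNN construction of the cited paper.

```python
import numpy as np
from scipy.linalg import expm

# Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def Rz(a): return expm(-0.5j * a * Z)
def Ry(b): return expm(-0.5j * b * Y)

def kak_su2(U):
    """Euler (ZYZ) angles of U in SU(2): U = Rz(alpha) Ry(beta) Rz(gamma).
    This is the rank-one K A K decomposition with K = exp(span iZ), A = exp(span iY)."""
    beta = 2.0 * np.arctan2(abs(U[1, 0]), abs(U[0, 0]))
    alpha = np.angle(U[1, 0]) - np.angle(U[0, 0])
    gamma = -np.angle(U[1, 0]) - np.angle(U[0, 0])
    return alpha, beta, gamma

# Random SU(2) element: exponential of a traceless anti-Hermitian matrix.
rng = np.random.default_rng(3)
c = rng.normal(size=3)
U = expm(-1j * (c[0] * X + c[1] * Y + c[2] * Z))

alpha, beta, gamma = kak_su2(U)
print(np.allclose(U, Rz(alpha) @ Ry(beta) @ Rz(gamma)))   # True: exact reconstruction
```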
5. Training, Optimization, and Thermodynamic Geometry
Optimization in Cartan Neural Networks is performed by exploiting Riemannian optimization techniques, such as Riemannian SGD or Adam, applied to manifold-valued parameters (solvable-group biases, isometries) alongside ordinary Euclidean gradients for linear maps (Milanesio et al., 30 May 2025). Gradient flow and backpropagation are computed via the chain rule through group laws, isometries, and geometric exp/log maps, all handled by automatic differentiation in modern frameworks.
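A minimal sketch of one such update, for a bias point living on the hyperbolic upper half-space with metric $g = (\mathrm{d}\mathbf{x}^2 + \mathrm{d}y^2)/y^2$, is given below: the Riemannian gradient is the Euclidean gradient rescaled by the inverse metric, and a simple retraction keeps the Cartan coordinate positive. The retraction used here is a first-order stand-in for the exact exponential map and, like the toy loss, is an assumption of this sketch rather than the optimizer of the cited papers.

```python
import numpy as np

def rsgd_step(x, y, grad_x, grad_y, lr=1e-2):
    """One retraction-based Riemannian SGD step for a point (x, y), y > 0, on the
    upper half-space with metric (dx^2 + dy^2) / y^2; grad_x, grad_y are the
    Euclidean gradients of the loss at (x, y)."""
    # Riemannian gradient = inverse metric applied to the Euclidean gradient: rescale by y^2.
    rgrad_x = (y ** 2) * grad_x
    rgrad_y = (y ** 2) * grad_y
    # First-order retraction: additive step in x, multiplicative step in y (keeps y > 0).
    return x - lr * rgrad_x, y * np.exp(-lr * rgrad_y / y)

# Toy usage: pull a manifold-valued bias towards a target by minimizing
# 0.5 * ||p - target||^2 in coordinates (a stand-in loss chosen only for brevity).
target_x, target_y = np.array([1.0, -2.0]), 0.5
x, y = np.zeros(2), 2.0
for _ in range(2000):
    gx, gy = x - target_x, y - target_y      # Euclidean gradient of the toy loss
    x, y = rsgd_step(x, y, gx, gy, lr=5e-2)
print(x, y)    # approaches the target while y remains positive throughout
```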
The geometric setting enables a deep connection to information geometry and thermodynamics. On Kähler non-compact symmetric spaces (those whose isotropy group contains a $\mathrm{U}(1)$ factor), one can define covariant Gibbs probability distributions and explicit partition functions $Z$. The Fisher–Rao information metric and the Ruppeiner–Lychagin thermodynamic metric are literally Hessians of $\log Z$, unifying statistical and thermodynamic geometry (Fré et al., 18 Dec 2025). Partition functions are fully group-invariant, and the minimal number of independent "generalized temperatures" required for the description equals the rank of the symmetric space.
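The statistical half of this statement can be illustrated with a one-parameter toy exponential family: the variance of the sufficient statistic (the Fisher–Rao metric) matches the second derivative of $\log Z$ numerically. The family below is a generic choice for illustration and carries none of the group covariance of the construction in the cited paper.

```python
import numpy as np

# Toy exponential family on [0, 5]: p_theta(s) ∝ exp(theta * T(s)), sufficient statistic T(s) = s^2.
s = np.linspace(0.0, 5.0, 2001)
ds = s[1] - s[0]
T = s ** 2

def integrate(f):
    """Simple quadrature on the fixed grid, used consistently for Z and for expectations."""
    return np.sum(f) * ds

def log_Z(theta):
    """Logarithm of the partition function Z(theta) = ∫ exp(theta * T(s)) ds."""
    return np.log(integrate(np.exp(theta * T)))

def fisher(theta):
    """Fisher information = Var_theta[T], computed from the normalized density."""
    w = np.exp(theta * T)
    p = w / integrate(w)
    mean_T = integrate(p * T)
    return integrate(p * (T - mean_T) ** 2)

theta, h = -0.3, 1e-3
hess_log_Z = (log_Z(theta + h) - 2 * log_Z(theta) + log_Z(theta - h)) / h ** 2
print(np.isclose(fisher(theta), hess_log_Z, rtol=1e-4))   # True: Fisher metric = Hessian of log Z
```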
6. Empirical Results and Applications
Empirical evaluations demonstrate that Cartan networks match or surpass standard Euclidean and Poincaré deep learning benchmarks on a variety of tasks, including synthetic regression and image classification (MNIST, Fashion-MNIST, KMNIST, CIFAR-10). Results show improved radial symmetry handling, increased expressivity with depth (even without nonlinearities), and competitive or superior performance using only intrinsic geometric operations (Milanesio et al., 30 May 2025). The architecture is adaptable to higher-rank and complex-structured spaces, and supports convolutional, harmonic, and sequential generalizations (Fré et al., 22 Aug 2025).
A summary table of key Cartan Neural Network variants and properties:
| Variant | Symmetric Space Layer Type | Key Operations |
|---|---|---|
| Basic Cartan Network | $G/H$, e.g. $\mathbb{H}^n \simeq \mathrm{SO}(1,n)/\mathrm{SO}(n)$ | Group-hom, isometry, exp/log |
| Cartan Convolutional | Higher-rank $G/H$ + Tits–Satake vector bundles | Harmonic analysis, equivariant conv |
| Quantum Cartan Layered | Compact group with $KAK$ decomposition | Cartan geodesics, sub-Riemannian gluing |
| Thermodynamic/Kähler | Kähler coset with $\mathrm{U}(1)$ isotropy factor | Covariant Gibbs, Fisher–Rao metric |
The architecture's geometric interpretability—every operation, parameter, and feature map possessing a precise group-theoretic meaning—opens new directions in explainable AI, geometric deep learning, and physics-informed machine learning.
7. Conceptual Advances and Outlook
Cartan Neural Networks unify non-Euclidean geometry, Lie group representation theory, harmonic analysis, and neural computation. They provide a mathematically principled alternative to ad hoc nonlinearities, leveraging the full power of homogeneous space geometry. Key conceptual breakthroughs include:
- Replacement of affine+activation architecture by group-homomorphism+isometry layers on symmetric spaces (Milanesio et al., 30 May 2025, Fré et al., 22 Jul 2025).
- Equivariance to the full isometry group, guaranteeing symmetry-aware feature propagation and geometric interpretability.
- Intrinsic nonlinearity from negative sectional curvature, leading to expressivity and natural compositionality.
- Extensions to quantum control tasks (Cartan KAK-decomposition), convolutional and harmonic neural modules, and geometric thermodynamic/statistical modeling (Perrier, 2 Apr 2025, Fré et al., 22 Aug 2025, Fré et al., 18 Dec 2025).
The paradigm supports further developments, including high-rank symmetric space architectures, deeper integration with geometric statistics, spectral methods, and broader physical modeling applications across quantum dynamics, data geometry, and beyond.