Cayley Transform Parametrization
- Cayley Transform Parametrization is a rational mapping from Lie algebra elements (skew-symmetric or anti-Hermitian) to Lie groups like SO(n) and SU(n), ensuring automatic orthogonality/unitarity.
- It enables efficient computation in diverse fields such as quantum mechanics, optimization on manifolds, machine learning, and robotics by using unconstrained variable representations.
- Extensions like the scaled and block-skew variants give global representations covering matrices with eigenvalue $-1$, where the classical map is singular, and facilitate robust optimization on the Stiefel manifold and on SE(3).
The Cayley transform parametrization is a classical and versatile technique that provides a rational (non-exponential) map from a matrix Lie algebra (typically skew-symmetric or anti-Hermitian matrices) to matrix Lie groups such as SO(n) and SU(n), and to related matrix manifolds such as the Stiefel manifold. Its algebraic structure, computational efficiency, and use of unconstrained variables have made it fundamental in diverse areas, including quantum mechanics, optimization on manifolds, machine learning, molecular simulation, K-theory, and geometric statistics.
1. Mathematical Foundation of the Cayley Transform Parametrization
The Cayley transform for a matrix Lie group $G$ (e.g., $SO(n)$, $SU(n)$) is a rational map from the associated Lie algebra $\mathfrak{g}$ (typically skew-symmetric or anti-Hermitian matrices) to $G$. The essential form is
$$Q = \mathrm{cay}(A) = (I + A)^{-1}(I - A),$$
where $A$ is skew-symmetric ($A^\top = -A$) or anti-Hermitian ($A^\dagger = -A$), and $I + A$ is invertible. For $\mathfrak{su}(2)$ representations, one takes $A$ proportional to $i\,\hat{n}\cdot\vec{J}$, with $\vec{J}$ the spin generators, $\hat{n}$ a unit axis, and a real Cayley parameter $t$ (Kortryk, 2015).
For real orthogonal matrices $Q$, the Cayley transform maps skew-symmetric $A$ to orthogonal $Q = (I+A)^{-1}(I-A)$, provided $I + A$ is invertible (Biborski, 22 Jan 2026). Conversely, for certain $Q$, especially those with eigenvalue $-1$, a "signature" diagonal $D$ of $\pm 1$ entries is used so that $QD$ lies in the Cayley domain. Then $Q = \mathrm{cay}(A)\,D$ is a global representation for all orthogonal matrices (Biborski, 22 Jan 2026, Helfrich et al., 2017).
A central property is that orthogonality/unitarity is automatic: if $A^\top = -A$, then $Q = (I+A)^{-1}(I-A)$ satisfies
$$Q^\top Q = (I-A)^\top (I+A)^{-\top} (I+A)^{-1} (I-A) = (I+A)(I-A)^{-1}(I+A)^{-1}(I-A) = I,$$
since the factors $(I+A)$ and $(I-A)$ commute. Analogous constructions hold for skew-Hermitian $A$ and the unitary group $U(n)$.
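A minimal numpy sketch of this automatic orthogonality, assuming the convention $Q = (I+A)^{-1}(I-A)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
I = np.eye(n)

# Build a random skew-symmetric matrix A (A^T = -A) from unconstrained entries.
M = rng.standard_normal((n, n))
A = M - M.T

# Classical Cayley transform: Q = (I + A)^{-1} (I - A).
Q = np.linalg.solve(I + A, I - A)

# Orthogonality holds by construction, since (I + A) and (I - A) commute;
# for real skew-symmetric A the image even lies in SO(n) (det Q = +1).
print(np.allclose(Q.T @ Q, I))
print(np.isclose(np.linalg.det(Q), 1.0))
```

No constraint was imposed on `M`; the map itself guarantees feasibility, which is exactly what makes unconstrained optimization over `A` possible.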
2. Extensions: Scaled and Generalized Cayley Parametrizations
The classical Cayley parametrization does not cover matrices with eigenvalue $-1$, because $I + Q$ becomes singular. To resolve this, Kahan and successors introduced a "scaled Cayley transform"
$$W = (I+A)^{-1}(I-A)\,D,$$
where $A$ is skew-symmetric and $D$ is a fixed diagonal of signs $\pm 1$, for $O(n)$ (Helfrich et al., 2017, Maduranga et al., 2018). In the complex/unitary case, $A$ becomes skew-Hermitian and $D$ a diagonal of unit-modulus phases, allowing a differentiable parameterization of $U(n)$ with $A$ and $D$ as unconstrained variables (Maduranga et al., 2018).
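The scaled variant can be sketched in a few lines of numpy (the convention $W = (I+A)^{-1}(I-A)D$ with a $\pm 1$ sign diagonal is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
I = np.eye(n)

# Unconstrained parameters: skew-symmetric A and a fixed sign diagonal D.
M = rng.standard_normal((n, n))
A = M - M.T
D = np.diag([1.0, -1.0, 1.0, -1.0])  # Kahan-style signs, chosen freely

# Scaled Cayley transform: W = (I + A)^{-1} (I - A) D.
W = np.linalg.solve(I + A, I - A) @ D

# W is orthogonal; at A = 0 the map returns D itself -- an orthogonal matrix
# with eigenvalue -1 that the unscaled Cayley map can never reach.
W0 = np.linalg.solve(I + 0 * A, I - 0 * A) @ D
print(np.allclose(W.T @ W, I), np.allclose(W0, D))
```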
For rectangular orthogonal matrices and the Stiefel manifold $\mathrm{St}(n,p)$, the Cayley transform is generalized via block-skew coordinates. Given a "center" $S \in O(n)$, one uses the map
$$U = S\,\mathrm{cay}(V)\,I_{n\times p}, \qquad V = \begin{pmatrix} A & -B^\top \\ B & 0 \end{pmatrix},$$
with $A \in \mathbb{R}^{p\times p}$ skew-symmetric and $B \in \mathbb{R}^{(n-p)\times p}$ unconstrained, so every $U$ in an open dense subset of $\mathrm{St}(n,p)$ has an unconstrained vector representation (Kume et al., 2023, Kume et al., 2023, Jauch et al., 2018).
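A short numpy sketch of the block-skew construction, assuming the form $U = S\,\mathrm{cay}(V)\,I_{n\times p}$ with center $S = I$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 2
I = np.eye(n)

# Unconstrained block-skew parameters: A (p x p skew), B ((n-p) x p free).
A = rng.standard_normal((p, p))
A = A - A.T
B = rng.standard_normal((n - p, p))
V = np.block([[A, -B.T], [B, np.zeros((n - p, n - p))]])

S = np.eye(n)  # the "center"; any orthogonal matrix works here
U = S @ np.linalg.solve(I + V, I - V) @ I[:, :p]  # U = S cay(V) I_{n x p}

print(np.allclose(U.T @ U, np.eye(p)))  # U has orthonormal columns: U is on St(n, p)
```

The free parameter count is $p(p-1)/2 + (n-p)p$, i.e. the dimension of the Stiefel manifold, so no degrees of freedom are wasted.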
The Cayley parametrization is further extended to the group $SE(3)$ of rigid-body motions in robotics as
$$T = \left(\mathbf{1} + \tfrac{1}{2}\xi^\wedge\right)\left(\mathbf{1} - \tfrac{1}{2}\xi^\wedge\right)^{-1},$$
where $\xi \in \mathbb{R}^6$ is a 6-vector encoding translation and rotation, and $\xi^\wedge$ is its $\mathfrak{se}(3)$ matrix representation (Barfoot et al., 2021).
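A numpy sketch of the SE(3) Cayley map (the `(rho, phi)` ordering of the 6-vector and the $\pm\tfrac{1}{2}$ scaling are conventions assumed here; they vary across the literature):

```python
import numpy as np

def hat(xi):
    """se(3) matrix of a 6-vector xi = (rho, phi): translation rho, rotation phi."""
    rho, phi = xi[:3], xi[3:]
    Phi = np.array([[0.0, -phi[2], phi[1]],
                    [phi[2], 0.0, -phi[0]],
                    [-phi[1], phi[0], 0.0]])
    X = np.zeros((4, 4))
    X[:3, :3] = Phi
    X[:3, 3] = rho
    return X

xi = np.array([0.3, -0.1, 0.5, 0.2, 0.4, -0.3])
X = hat(xi)
I4 = np.eye(4)

# Cayley map on SE(3): T = (I + X/2)(I - X/2)^{-1} -- purely rational, no
# trigonometric functions needed.
T = (I4 + X / 2) @ np.linalg.inv(I4 - X / 2)

R = T[:3, :3]
print(np.allclose(R.T @ R, np.eye(3)))  # rotation block is orthogonal
print(np.allclose(T[3], [0, 0, 0, 1]))  # homogeneous structure preserved
```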
3. Computational Techniques and Analytical Properties
A key computational advantage is that the Cayley transform yields polynomial or rational forms in unconstrained variables, which allows efficient evaluation, gradient computation, and inversion. In the spin-$j$ case, the Cayley transform of a spin operator leads to a degree-$2j$ matrix polynomial whose coefficients are explicit rational functions of the Cayley parameter and can be computed via resolvent expansions and truncations of determinants (Kortryk, 2015).
In high-dimensional settings, the Cayley parameters can be constructed algorithmically: for any $Q \in O(n)$, a signature matrix $D$ can be found by Gaussian-elimination-style pivoting in $O(n^3)$ arithmetic steps, such that $QD$ lies in the Cayley domain, with $A = (I + QD)^{-1}(I - QD)$ (Biborski, 22 Jan 2026).
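The recovery step can be illustrated with a deliberately naive search over sign diagonals; this brute force stands in for Biborski's pivoting construction, which finds $D$ far more efficiently:

```python
import numpy as np
from itertools import product

def find_signature(Q, tol=1e-8):
    """Brute-force search over the 2^n sign diagonals for a D such that
    I + Q D is invertible, i.e. Q D lies in the Cayley domain.
    (Illustration only: the constructive method finds D by
    Gaussian-elimination-style pivoting instead of enumeration.)"""
    n = Q.shape[0]
    I = np.eye(n)
    for signs in product([1.0, -1.0], repeat=n):
        D = np.diag(signs)
        if abs(np.linalg.det(I + Q @ D)) > tol:
            return D
    raise ValueError("no signature diagonal found")

# An orthogonal matrix with eigenvalue -1, outside the classical Cayley domain.
Q = np.array([[-1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
D = find_signature(Q)
I = np.eye(3)
A = np.linalg.solve(I + Q @ D, I - Q @ D)    # recovered Cayley parameters
print(np.allclose(A, -A.T))                               # A is skew-symmetric
print(np.allclose(np.linalg.solve(I + A, I - A) @ D, Q))  # Q = cay(A) D
```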
For over-parameterized cases like the Stiefel manifold ($p < n$), the parameter space is a Euclidean ball of dimension $np - \tfrac{p(p+1)}{2}$, and direct optimization in this space is possible. Adaptive recentering strategies, which change the center $S$ whenever optimization steps approach a domain singularity, mitigate ill-conditioning and ensure bounded parameter norms (Kume et al., 2023, Kume et al., 2023).
In optimization, Cayley-based retractions serve as efficient alternatives to matrix exponentials, QR, or polar decompositions. The iterative Cayley retraction on the Stiefel manifold avoids explicit matrix inversion and reduces the per-step cost to a few matrix multiplications, markedly accelerating training in deep learning and Riemannian settings (2002.01113).
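A minimal sketch of one such retraction step, in the Wen-Yin style (the skew construction $W = GX^\top - XG^\top$ and the step form below are one common convention; signs and curvature corrections vary by author):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 8, 3

# Current point X on St(n, p) and a Euclidean gradient G of some objective.
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
G = rng.standard_normal((n, p))

# Build a skew-symmetric direction from (G, X); then the Cayley step
# Y(tau) = (I + tau/2 W)^{-1} (I - tau/2 W) X stays on the manifold exactly.
W = G @ X.T - X @ G.T  # skew-symmetric by construction
tau = 0.1
I = np.eye(n)
Y = np.linalg.solve(I + (tau / 2) * W, (I - (tau / 2) * W) @ X)

print(np.allclose(Y.T @ Y, np.eye(p)))  # feasibility preserved at every step
```

Because the update multiplies `X` by a Cayley-transformed orthogonal factor, no reprojection (QR, polar, SVD) is ever needed.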
4. Practical Applications in Physics, Machine Learning, and Geometry
Cayley parametrization is utilized in a spectrum of modern applications:
- Quantum spin and $SU(2)$ theory: Offers a closed, computationally efficient polynomial form for arbitrary spin-$j$ rotations, advantageous over the exponential/CFZ expansion for iterative algorithms and rational approximations (Kortryk, 2015).
- K-theory and operator algebras: The Cayley transform provides an explicit isomorphism at the level of cycles between van Daele $K$-theory and $KK$-theory for graded $C^*$-algebras. It preserves Morita equivalence, compatibility with Kasparov products, and index pairings; applies uniformly in complex, real, and graded $K$-theory; and enables explicit representative computations for topological phases of matter (Bourne et al., 2019).
- Robotics and Vision: The SE(3) Cayley map enables fast, fully polynomial pose updates without transcendental functions, with improved numerical conditioning for iterative pose alignment (the "CayPer" algorithm) and geometric optimization (Barfoot et al., 2021).
- Deep Learning:
- Orthogonal and unitary RNNs enforce norm-preserving transformations via Cayley-parametrized weight matrices, yielding better gradient flow and robustness. The "scaled Cayley" overcomes blind spots at eigenvalue $-1$, allowing fully expressive, differentiable unitary or orthogonal parameterizations (Helfrich et al., 2017, Maduranga et al., 2018).
- Orthogonal convolution layers via Cayley transforms parameterize filters as skew-symmetric convolutions, enabling direct, fast, and exact orthogonality constraints in both spatial and Fourier domains. This confers stability, tighter Lipschitz bounds, and certified adversarial robustness (Trockman et al., 2021).
- Layerwise and end-to-end Lipschitz control in 1D CNNs is enabled by combining Cayley-orthogonal parameterizations with the controllability Gramian, resulting in architectures with provable robustness and unconstrained training (Pauli et al., 2023).
- Optimization on Manifolds:
- On the Stiefel (and Grassmann) manifolds, the Cayley parametrization converts orthogonality constraints into unconstrained vector optimization, facilitating Euclidean algorithm transfer, fast iteration, and global convergence guarantees (Kume et al., 2023, Kume et al., 2023, 2002.01113, Jauch et al., 2018).
- Data Fitting and Geometric Statistics:
- Ellipsoid fitting via Cayley parametrization replaces constrained search over orthogonal transformations with unconstrained skew-symmetric blocks, allowing globally elliptic solutions and efficient nonlinear least-squares with explicit gradients (Melikechi et al., 2023).
- Stochastic simulation and random matrix theory exploit the Cayley change-of-variables to sample from Stiefel or Grassmann distributions, with Jacobians computed via Kronecker-structured derivatives and asymptotic normal approximations linking random Cayley parameters to Gaussians under Haar measure (Jauch et al., 2018).
- SU(3) Gauge Theory Simulation:
- Modified Cayley maps for $SU(3)$ include a nonlinear phase to enforce unit determinant, providing a local diffeomorphism from $\mathfrak{su}(3)$ to $SU(3)$ suitable for Hamiltonian splitting and hybrid Monte Carlo integration in lattice gauge theory. This construction maintains reversibility, preserves volume, and can enhance numerical stability compared to exponential mappings (Schäfers et al., 2024).
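Several of these applications reduce to the same few lines of linear algebra. As one example, the random-matrix use of the Cayley map can be sketched by pushing Gaussian skew-symmetric parameters through the transform; matching Haar measure exactly requires the Jacobian-corrected density of Jauch et al. (2018), which this naive sampler omits:

```python
import numpy as np

def random_cayley_orthogonal(n, rng):
    """Sample an SO(n) matrix via the Cayley map of a Gaussian skew matrix.
    Illustrative only: the induced law is NOT Haar measure unless the
    skew parameters are drawn from the Jacobian-corrected density."""
    M = rng.standard_normal((n, n))
    A = (M - M.T) / 2
    I = np.eye(n)
    return np.linalg.solve(I + A, I - A)

rng = np.random.default_rng(4)
Q = random_cayley_orthogonal(5, rng)
print(np.allclose(Q.T @ Q, np.eye(5)))  # every sample is exactly orthogonal
```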
5. Limitations and Handling of Singularities
The surjectivity of the classical Cayley map fails on group elements with eigenvalue $-1$ (or, for $SO(3)$, on $180^\circ$ rotations). Scaled variants (introducing a diagonal $D$ of signs, or phase factors in complex/unitary cases) guarantee global parameterizations by moving potential singularities away from the current chart (Helfrich et al., 2017, Biborski, 22 Jan 2026, Maduranga et al., 2018). In Stiefel/Grassmann optimization the domain of the Cayley map is only an open dense subset, but adaptive strategies (center point shifting) or pivoting ensure that optimization iterates remain within the domain (Kume et al., 2023, Kume et al., 2023).
Jacobians and differentials of the transform can be singular at chart boundaries, complicating change-of-variable formulas or gradient propagation. In practice, careful numerical safeguards (e.g., monitoring parameter norms, recentering) are employed to avoid these ill-conditioned regions (Jauch et al., 2018, Kume et al., 2023).
6. Comparative Analysis With Exponential and Other Parametrizations
The Cayley transform is a rational function of the algebra parameter, which avoids the trigonometric and transcendental evaluations required by the exponential map. For $SU(2)$, the Cayley parametrization maps directly and exactly onto the exponential parametrization via $t = \tan(\theta/2)$; the Cayley coefficients are purely rational and more amenable to analytic and numerical computation than the truncated Taylor/arcsin series of the exponential (CFZ) expansion (Kortryk, 2015).
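The $t = \tan(\theta/2)$ correspondence is easy to verify in the planar case (with the $(I+A)^{-1}(I-A)$ convention, whose orientation fixes the sign of the angle):

```python
import numpy as np

# For a planar rotation by angle theta, the Cayley parameter is t = tan(theta/2):
# cay(A_t) with A_t = [[0, -t], [t, 0]] reproduces the rotation matrix exactly,
# with the orientation determined by the chosen Cayley convention.
theta = 0.7
t = np.tan(theta / 2)
A = np.array([[0.0, -t], [t, 0.0]])
I = np.eye(2)
Q = np.linalg.solve(I + A, I - A)

R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
print(np.allclose(Q, R))  # rational Cayley form matches the trigonometric one
```

Expanding the algebra gives the half-angle identities explicitly: $\cos\theta = \frac{1-t^2}{1+t^2}$ and $\sin\theta = \frac{2t}{1+t^2}$, which is why no transcendental function appears on the Cayley side.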
In optimization and machine learning, the Cayley parametrization yields orthogonality-enforcing updates and layers whose cost is a few matrix multiplications per step, matching or improving on polar or QR retraction methods, especially in high dimensions, and conferring exactness by construction (2002.01113). Compared to penalty or SVD-based approaches for orthogonal convolutions, the Cayley approach is both more expressive and computationally efficient (Trockman et al., 2021).
For Lie group-valued integration in molecular dynamics ($SU(3)$ lattice gauge theory), modified Cayley transforms offer improved step-size stability and acceptance rates compared to exponential maps, notably in moderate-to-coarse step-size regimes; standard exponential maps retain advantages at very small step sizes (Schäfers et al., 2024).
7. Summary Table of Canonical Forms and Their Domains
| Group / Manifold | Classical Cayley Map | Global Extension | Domain / Notes |
|---|---|---|---|
| $SO(n)$, $O(n)$ | $Q = (I+A)^{-1}(I-A)$ | $Q = (I+A)^{-1}(I-A)\,D$, sign diagonal $D$ | $I+A$ invertible; extension covers all $Q$ |
| $U(n)$, $SU(n)$ | $W = (I+A)^{-1}(I-A)$, $A$ skew-Hermitian | $W = (I+A)^{-1}(I-A)\,D$, unitary phase diagonal $D$ | $I+A$ invertible |
| $\mathrm{St}(n,p)$ | $U = S\,\mathrm{cay}(V)\,I_{n\times p}$ | Adaptive center, block-skew $V$ | Open dense subset |
| $SE(3)$ | $T = (\mathbf{1}+\tfrac{1}{2}\xi^\wedge)(\mathbf{1}-\tfrac{1}{2}\xi^\wedge)^{-1}$ | As above | $180^\circ$ rotations excluded; $\mathbf{1}-\tfrac{1}{2}\xi^\wedge$ nonsingular |
| $SU(3)$ | — | Modified map with determinant phase | Local diffeomorphism |
References
- "Cayley transforms of su(2) representations" (Kortryk, 2015)
- "A Constructive Cayley Representation of Orthogonal Matrices and Applications to Optimization" (Biborski, 22 Jan 2026)
- "Orthogonal Recurrent Neural Networks with Scaled Cayley Transform" (Helfrich et al., 2017)
- "Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform" (Maduranga et al., 2018)
- "Adaptive Localized Cayley Parametrization for Optimization over Stiefel Manifold" (Kume et al., 2023)
- "Generalized Left-Localized Cayley Parametrization for Optimization with Orthogonality Constraints" (Kume et al., 2023)
- "Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform" (2002.01113)
- "Orthogonalizing Convolutional Layers with the Cayley Transform" (Trockman et al., 2021)
- "Random orthogonal matrices and the Cayley transform" (Jauch et al., 2018)
- "Ellipsoid fitting with the Cayley transform" (Melikechi et al., 2023)
- "A modified Cayley transform for SU(3) molecular dynamics simulations" (Schäfers et al., 2024)
- "Vectorial Parameterizations of Pose" (Barfoot et al., 2021)
- "The Cayley transform in complex, real and graded K-theory" (Bourne et al., 2019)
- "Lipschitz-bounded 1D convolutional neural networks using the Cayley transform and the controllability Gramian" (Pauli et al., 2023)
- "Cayley parametrization and the rotation group over a non-archimedean pythagorean field" (Mahmoudi, 2016)
The Cayley transform parametrization is thus a structurally transparent, computationally efficient, and algebraically robust tool for representing, analyzing, and optimizing orthogonality and unitarity constraints in both finite and infinite-dimensional settings, across mathematics, physics, engineering, and machine learning.