
Orthogonal Low-Rank Adaptation in Lie Groups

Updated 14 September 2025
  • Orthogonal Low-Rank Adaptation in Lie Groups is a parameter-efficient strategy that leverages Lie group theory and low-rank updates to modify neural network parameters while preserving geometric invariants.
  • It employs multiplicative updates via matrix exponentials and Householder reflections to enforce explicit orthogonality constraints, thereby stabilizing continual learning and fine-tuning.
  • The approach enhances model performance in large language models by reducing catastrophic forgetting and computational overhead, enabling effective continual adaptation.

Orthogonal Low-rank Adaptation in Lie Groups (OLieRA) is a parameter-efficient adaptation strategy that leverages the algebraic and geometric structure of Lie groups—primarily the orthogonal group $\mathbb{O}(d)$—to construct multiplicative, low-rank updates with explicit orthogonality constraints. By aligning adaptation directions with the symmetry and manifold structure of model parameter spaces, OLieRA addresses the instability and capacity bottlenecks inherent in conventional additive low-rank tuning, particularly under continual or multi-task learning regimes in LLMs. This approach synthesizes advances from representation theory, matrix analysis, geometric optimization, and modern deep learning practice.

1. Mathematical Foundations

OLieRA formalizes the adaptation problem using the theory of Lie groups and Lie algebras. Model parameter spaces, especially when considering weight matrices of LLMs or deep networks, are treated as (subsets of) matrix Lie groups such as $\mathbb{O}(d)$, the group of $d \times d$ real orthogonal matrices, or more general reductive groups. The crucial insight is that adaptation operations should remain on these manifolds to ensure the preservation of geometric invariants and stable learning trajectories (Cao et al., 7 Sep 2025).

A typical update has the form

$$W_{\text{new}} = W \cdot \exp(\Delta)$$

where $W$ is the base weight, $\Delta$ is a learnable low-rank matrix from the associated Lie algebra (e.g., the skew-symmetric matrices for $\mathbb{O}(d)$), and $\exp(\Delta)$ is the matrix exponential mapping algebra elements to the group. This ensures that $W_{\text{new}}$ remains in the same geometric class as the original $W$.
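
For concreteness, here is a minimal PyTorch sketch of such an update, assuming the common skew-symmetric low-rank parameterization $\Delta = UV^T - VU^T$; the helper name `olie_update` and the shapes are illustrative, not taken from the cited paper.

```python
import torch

def olie_update(W: torch.Tensor, U: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Multiplicative low-rank update W_new = W @ exp(Delta).

    Delta = U V^T - V U^T is skew-symmetric by construction, so its matrix
    exponential is orthogonal and the updated weight stays on the group.
    """
    Delta = U @ V.T - V @ U.T            # skew-symmetric, rank <= 2r
    return W @ torch.matrix_exp(Delta)   # Lie algebra element -> group element

# Illustrative usage: rank-4 adaptation of a 512 x 512 orthogonal base weight.
d, r = 512, 4
W = torch.linalg.qr(torch.randn(d, d)).Q
U = torch.nn.Parameter(0.01 * torch.randn(d, r))
V = torch.nn.Parameter(0.01 * torch.randn(d, r))
W_new = olie_update(W, U, V)
print(torch.dist(W_new.T @ W_new, torch.eye(d)))   # ~0: orthogonality preserved
```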

Orthogonality is enforced either directly through constraints on the low-rank factors (e.g., $U^T U = I$ for $\Delta = U V^T$) or via multiplicative chains of Householder reflections or block-diagonal constructions, aligning the adaptation subspace with the Stiefel manifold or the product of lower-dimensional orthogonal groups (Yuan et al., 24 May 2024, Feng et al., 17 Jan 2025).

2. Connection to Clifford Algebras and Classical Isomorphisms

The relationship between orthogonal groups and Clifford algebras supplies the structural underpinning for OLieRA (Shirokov, 2014). Groups defined within Clifford algebras—such as

$${}^2C(p,q) = \{U \in C^{\text{even}}(p,q) \mid U^{\sim} U = e\}$$

—are isomorphic to classical orthogonal or pseudo-orthogonal groups $\mathbf{O}(n)$ or $\mathbf{O}(p,q)$, depending on the signature. Explicitly, for $p-q \equiv 0,1,2 \mod 8$,

$${}^{2_3}C(p,q) \cong \begin{cases} \mathbf{O}\left(2^{n/2},\mathbb{R}\right) & \text{if } (p,q) = (n,0),\ n \text{ even} \\ \mathbf{O}\left(2^{(n-1)/2},\mathbb{R}\right) \times \mathbf{O}\left(2^{(n-1)/2},\mathbb{R}\right) & \text{if } (p,q) = (n,0),\ n \text{ odd} \end{cases}$$

These isomorphisms enable parameter reductions via block-diagonal structures and inform the design of low-rank and orthogonally-constrained parameterizations in practical algorithms.
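As a rough illustration of how such block-diagonal structure can be exploited in practice (a sketch under assumptions, not a construction from the cited works), small orthogonal blocks can be assembled into a larger block-diagonal orthogonal matrix with far fewer parameters than a dense one:

```python
import torch

def block_diag_orthogonal(blocks):
    """Direct sum of small orthogonal factors exp(A_i - A_i^T).

    Mirrors a product-of-groups structure such as O(k) x O(k) embedded in
    O(2k), using only the parameters of the small blocks.
    """
    factors = [torch.matrix_exp(A - A.T) for A in blocks]  # A - A^T is skew-symmetric
    return torch.block_diag(*factors)

# Illustrative usage: two 4x4 parameter blocks instead of one dense 8x8 matrix.
Q = block_diag_orthogonal([torch.randn(4, 4), torch.randn(4, 4)])
print(torch.dist(Q.T @ Q, torch.eye(8)))  # ~0
```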

The graded structure (e.g., quaternion-type decomposition) of Clifford algebras provides natural subalgebras corresponding to infinitesimal group actions, supporting low-rank updates operating within specific symmetry directions.

3. Algorithms and Implementation Strategies

Leading OLieRA strategies operate by parameterizing adaptation as either:

  • Multiplicative Low-Rank Exponentials: Adaptation parameters are low-rank matrices $\Delta$ within the Lie algebra, with $W_{\text{new}} = W\exp(\Delta)$, ensuring group-structural fidelity (Cao et al., 7 Sep 2025).
  • Householder Chains: Adaptation is realized with products of Householder reflections,

$$H^{(r)} = \prod_{i=1}^r (I - 2u_iu_i^T)$$

to produce an orthogonal adapter. The adapted matrix becomes $W H^{(r)}$, which can be re-expressed as a low-rank additive update with structure determined by the $u_i$ (Yuan et al., 24 May 2024); a minimal sketch appears after this list.

  • Orthogonalized Mixture-of-Experts: A set of expert adapters is orthogonalized via Gram–Schmidt or related transforms, ensuring parameter diversity and non-redundancy (representations live on the Stiefel manifold), especially in mixture-of-experts architectures (Feng et al., 17 Jan 2025).
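
The following is a minimal PyTorch sketch of the Householder-chain adapter referenced above; the function name `householder_chain` and the chosen rank are illustrative assumptions rather than the exact implementation of the cited method.

```python
import torch

def householder_chain(W: torch.Tensor, U: torch.Tensor) -> torch.Tensor:
    """Apply H^{(r)} = prod_i (I - 2 u_i u_i^T) on the right of W.

    U stores the r reflection vectors as columns; each is normalized so that
    I - 2 u u^T is an exact reflection, hence W @ H stays orthogonal whenever
    W is orthogonal.
    """
    d, r = U.shape
    H = torch.eye(d, dtype=W.dtype)
    for i in range(r):
        u = U[:, i] / U[:, i].norm()   # unit reflection vector
        H = H @ (torch.eye(d, dtype=W.dtype) - 2.0 * torch.outer(u, u))
    return W @ H

# Illustrative usage: r = 4 reflections adapting a 512 x 512 weight.
d, r = 512, 4
W = torch.linalg.qr(torch.randn(d, d)).Q
U = torch.nn.Parameter(torch.randn(d, r))
W_adapted = householder_chain(W, U)
```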

Enforcing orthogonality in the low-rank subspaces can be achieved with explicit constraints (e.g., Gram–Schmidt), variational regularization (penalizing deviations from mutual orthogonality), or hard optimization on the Stiefel manifold.
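
A compact sketch of both styles in PyTorch; the penalty form and its weight `lam` are assumptions for illustration, not the exact regularizers of the cited papers.

```python
import torch

def hard_orthogonalize(A: torch.Tensor) -> torch.Tensor:
    """Hard constraint: project a low-rank factor onto the Stiefel manifold via QR."""
    Q, _ = torch.linalg.qr(A, mode="reduced")
    return Q

def soft_orthogonality_penalty(factors, lam: float = 1.0) -> torch.Tensor:
    """Variational regularizer penalizing deviation from column-orthonormality
    within each adapter and overlap between different adapters."""
    penalty = torch.tensor(0.0)
    for i, A in enumerate(factors):
        penalty = penalty + torch.norm(A.T @ A - torch.eye(A.shape[1])) ** 2  # A^T A ~ I
        for B in factors[i + 1:]:
            penalty = penalty + torch.norm(A.T @ B) ** 2  # cross-adapter orthogonality
    return lam * penalty
```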

Efficient Riemannian and stochastic optimization algorithms are employed to produce steps on the manifold, typically leveraging matrix exponentials, Cayley transforms, or geometric flows on $\mathbb{O}(d)$ (Choromanski et al., 2020). For large-scale settings, unbiased estimators (graph-theoretic sparsification, Givens rotations) reduce computational burden from cubic to sub-cubic complexity per iteration.
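
For illustration, a minimal Cayley-transform retraction step in PyTorch; this follows the standard skew-symmetric projection of the Euclidean gradient and is a sketch under assumptions, not the exact update of the cited work.

```python
import torch

def cayley_step(W: torch.Tensor, G: torch.Tensor, lr: float) -> torch.Tensor:
    """One retraction step on the orthogonal group via the Cayley transform.

    G is the Euclidean gradient of the loss at W; A = lr * (G W^T - W G^T) is
    its skew-symmetric projection, and (I + A/2)^{-1}(I - A/2) is orthogonal,
    so the iterate never leaves the manifold and no matrix exponential is needed.
    """
    A = lr * (G @ W.T - W @ G.T)                   # skew-symmetric direction
    I = torch.eye(W.shape[0], dtype=W.dtype)
    Q = torch.linalg.solve(I + A / 2, I - A / 2)   # Cayley transform of A
    return Q @ W
```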

4. Applications in Neural Model Adaptation

OLieRA methods are primarily designed for parameter-efficient adaptation and continual learning in large-scale neural models, especially LLMs.

  • Continual Learning: OLieRA preserves learned knowledge across sequential tasks by keeping adaptation steps on the parameter manifold and explicitly maintaining orthogonality between different tasks' adaptation subspaces, thereby reducing catastrophic forgetting without sacrificing model plasticity (Cao et al., 7 Sep 2025).
  • Equivariant Deep Networks and Representations: For tasks with inherent geometric symmetries (3D recognition, molecular modelling), OLieRA-like low-rank adaptations can exploit symmetry-preserving architectures implemented using tools such as the lie-nn library for equivariant neural networks (Batatia et al., 2023).
  • Parameter-Efficient Fine-Tuning: In LLM fine-tuning, OLieRA multiplicative updates or orthogonalized adapters achieve adaptation with fewer parameters, high training stability, and regularization inherited from the underlying Lie group structure.
  • Mixture-of-Experts and Adapter Diversity: Orthogonalization of multiple low-rank experts increases model capacity, discriminativeness, and efficiency, outperforming standard (non-orthogonalized) MoE approaches with much less redundancy (Feng et al., 17 Jan 2025).

Notably, these methods maintain or only minimally impact predictive accuracy on challenging benchmarks, even with as little as $0.12\%$ of parameters fine-tuned (with HRA) or a reduced number of experts (with OMoE).

5. Theoretical and Empirical Benefits

The principal advantages of OLieRA are:

  • Geometry Preservation: Updates remain faithful to the original manifold geometry, maintaining invariances critical for generalization and minimizing side effects on prior knowledge (Cao et al., 7 Sep 2025).
  • Orthogonality and Regularization: By explicitly enforcing orthogonality in adaptation subspaces, interference between tasks is suppressed and overfitting is mitigated (Feng et al., 17 Jan 2025, Yuan et al., 24 May 2024).
  • Efficient Computation: Use of low-rank factors and sampling-based geometric flows supports scaling to large models and datasets, reducing computation and memory overhead (Choromanski et al., 2020).
  • Adaptable Framework: The formulation accommodates a variety of parameter symmetry groups (orthogonal, unitary, pseudo-orthogonal) via the selection of appropriate Lie group and algebra isomorphisms (Shirokov, 2014, Batatia et al., 2023).

6. Limitations and Future Directions

Challenges of OLieRA include the computational cost of matrix exponentials, enforcing hard orthogonality in high dimensions, and integrating manifold optimization frameworks with conventional deep learning tools.

Optimization of hyperparameters (rank, regularization strength), efficient offline/on-the-fly orthogonalization, and generalization to broader classes of matrix groups remain areas of active development. Further, while theoretical alignment with Lie group geometry is principled, empirical gains can depend on model dimensionality, adaptation regime, and tuning.

A plausible implication is that more advanced numerical methods for manifold operations and further exploration of Clifford algebra-based structures could yield even more parameter-efficient and robust adaptation methods.

7. Summary Table: OLieRA Algorithmic Variants

Approach | Main Mechanism | Orthogonality Constraint
Lie Group Multiplicative | $W \exp(\Delta)$ | $\Delta$ low-rank, $U^T U = I$
Householder Reflections | $H^{(r)} = \prod_i (I - 2u_iu_i^T)$ | $u_i$ unit vectors, Gram–Schmidt regularization
Orthogonal MoE (OMoE) | LoRA experts, Gram–Schmidt | Experts on the Stiefel manifold

This organization reflects the dominant taxonomies found in the literature (Yuan et al., 24 May 2024, Feng et al., 17 Jan 2025, Cao et al., 7 Sep 2025) and their empirical motivations.


In sum, Orthogonal Low-rank Adaptation in Lie Groups operationalizes symmetry and geometric structure in parameter-efficient neural network adaptation, achieving strong continual learning performance, robust parameter sharing, and scalable fine-tuning by integrating low-rank modeling, orthogonality, and Lie group theory.
