
Group Orthogonal Low-Rank Adaptation

Updated 12 December 2025
  • GOLA is a parameter-efficient fine-tuning framework that reduces redundancy through structured rank decomposition, selective freezing, and clustering.
  • It employs an inter-group orthogonality constraint to enforce diverse and complementary feature representation for enhanced RGB-T tracking.
  • Empirical results show that GOLA variants achieve superior tracking accuracy and efficiency with fewer trainable parameters compared to baseline methods.

Group Orthogonal Low-Rank Adaptation (GOLA) is a parameter-efficient fine-tuning framework designed to enhance feature expressiveness and minimize information redundancy in low-rank adaptation modules, particularly for RGB-T (Red-Green-Blue and Thermal) tracking tasks. GOLA builds upon the low-rank adaptation (LoRA) paradigm by introducing principled rank selection, parameter freezing, clustering, and a novel inter-group orthogonality constraint, resulting in improved adaptability and efficiency for downstream tracking applications (Shao et al., 5 Dec 2025).

1. Low-Rank Adaptation Preliminaries

GOLA operates within the standard low-rank adaptation framework: given a pretrained backbone with a weight matrix $W \in \mathbb{R}^{d_\text{out} \times d_\text{in}}$ (e.g., a linear or attention-projection layer), fine-tuning is constrained to a learnable low-rank “adapter” $\Delta W$. In LoRA, the adapted layer computes

$$h' = W h + (BA) h$$

with $A \in \mathbb{R}^{r \times d_\text{in}}$, $B \in \mathbb{R}^{d_\text{out} \times r}$, and $r \ll \min(d_\text{in}, d_\text{out})$. At inference, the adapter merges into the backbone as $W' = W + BA$. The update $\Delta W \triangleq BA$ can be equivalently expressed through a singular value decomposition (SVD):

$$\Delta W = U \Sigma V^\top$$

where $U \in \mathbb{R}^{d_\text{out} \times r}$, $V \in \mathbb{R}^{d_\text{in} \times r}$, and $\Sigma \in \mathbb{R}^{r \times r}$ hold the left singular vectors, right singular vectors, and singular values, respectively.
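For concreteness, here is a minimal PyTorch sketch of such an adapter layer; the class name `LoRALinear`, the default rank, and the initialization scheme are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter around a frozen linear layer: h' = W h + (B A) h."""

    def __init__(self, d_in: int, d_out: int, r: int = 8):
        super().__init__()
        self.W = nn.Linear(d_in, d_out, bias=False)
        self.W.weight.requires_grad_(False)                 # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # A in R^{r x d_in}
        self.B = nn.Parameter(torch.zeros(d_out, r))        # B in R^{d_out x r}, zero-initialized

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.W(h) + h @ self.A.t() @ self.B.t()      # W h + (B A) h

    @torch.no_grad()
    def merge(self) -> None:
        """Fold the adapter into the backbone: W' = W + B A."""
        self.W.weight += self.B @ self.A
```

Zero-initializing $B$ is the common LoRA convention, ensuring the adapter contributes nothing at the start of fine-tuning.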

2. Quantifying Rank Importance through Decomposition

GOLA's central innovation is identifying redundancy within the rank space produced by LoRA-style adapters. This is accomplished by performing an SVD on the mean-centered $B$ matrix:

$$\bar{B} = B - \frac{1}{r} \mathbf{1} \mathbf{1}^\top B$$

followed by

$$[U, \Sigma, V] = \mathrm{SVD}(\bar{B})$$

where $\Sigma = (\sigma_1, \ldots, \sigma_r)$ with $\sigma_1 \geq \ldots \geq \sigma_r \geq 0$. The top-$k$ singular vectors $V_k \in \mathbb{R}^{d_\text{out} \times k}$ and singular values $\Sigma_k$ serve as reference directions.

An $L_2$-norm importance score $S_j$ is then computed for each original column $b_j \in \mathbb{R}^{d_\text{out}}$ of $B$:

$$S_j = \left\| (V_k^\top b_j) \odot \Sigma_k \right\|_2$$

with $\odot$ denoting elementwise multiplication ($\Sigma_k$ treated as the vector of top-$k$ singular values). Stacking all $S_j$ into $S \in \mathbb{R}^r$ and sorting in descending order yields an ordering $\sigma$ such that $S_{\sigma_1} \geq S_{\sigma_2} \geq \ldots \geq S_{\sigma_r}$.
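The scoring step can be sketched as follows; reading the paper's reference directions $V_k$ as the top-$k$ left singular vectors of $\bar{B}$ (which matches the stated dimensions) is my interpretation, and the function name is an assumption:

```python
import torch

def rank_importance(B: torch.Tensor, k: int) -> torch.Tensor:
    """Score the r columns of B (d_out x r) by their projections onto the
    top-k singular directions of the mean-centered B."""
    r = B.shape[1]
    ones = torch.ones(B.shape[0], 1, dtype=B.dtype, device=B.device)
    B_bar = B - (ones @ (ones.t() @ B)) / r              # B - (1/r) 1 1^T B
    U, S, Vh = torch.linalg.svd(B_bar, full_matrices=False)
    V_k, Sigma_k = U[:, :k], S[:k]                       # reference directions and values
    proj = V_k.t() @ B                                   # column j holds V_k^T b_j
    return (proj * Sigma_k.unsqueeze(1)).norm(dim=0)     # S_j = ||(V_k^T b_j) ⊙ Σ_k||_2

# Descending sort yields the ordering σ with S_{σ_1} >= ... >= S_{σ_r}:
# order = torch.argsort(rank_importance(B, k), descending=True)
```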

3. Structured Freezing and Clustering of Ranks

GOLA categorizes ranks into “crucial” and “redundant” components. The top-$k$ indices $\{\sigma_1, \ldots, \sigma_k\}$, corresponding to the highest scores $S_j$, are deemed crucial, and their associated adapter columns/rows are frozen to preserve pretrained priors:

  • $A_c = \{a_{\sigma_i}\}_{i=1}^{k}$ and $B_c = \{b_{\sigma_i}\}_{i=1}^{k}$ (frozen)
  • The remaining $r-k$ ranks form $A_u$, $B_u$ (unfrozen, “redundant”)

Redundant ranks are partitioned into $n$ groups by constrained $k$-means clustering on the columns of $B_u$:

$$\{G_1, \ldots, G_n\} = \Gamma\left( \{\, b_j \mid j \in \{\sigma_{k+1}, \ldots, \sigma_r\} \,\},\; n \right)$$

which minimizes the within-group sum of squares, subject to approximately balanced group sizes.
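A sketch of the freeze-and-cluster step, substituting scikit-learn's plain $k$-means for the paper's balanced/constrained variant $\Gamma$ (a simplifying assumption):

```python
import torch
from sklearn.cluster import KMeans

def split_and_cluster(B: torch.Tensor, scores: torch.Tensor, k: int, n_groups: int):
    """Split rank indices into crucial (to be frozen) and redundant (clustered)."""
    order = torch.argsort(scores, descending=True)
    crucial, redundant = order[:k], order[k:]

    # Cluster the redundant columns of B; the paper's Γ additionally enforces
    # approximately balanced group sizes, which plain KMeans does not.
    cols = B[:, redundant].t().detach().cpu().numpy()    # one row per redundant rank
    labels = torch.as_tensor(KMeans(n_clusters=n_groups, n_init=10).fit_predict(cols))
    groups = [redundant[labels == g] for g in range(n_groups)]
    return crucial, groups

# The crucial ranks A_c, B_c stay frozen during fine-tuning, e.g. by zeroing
# their gradients after backward():
#   A.grad[crucial] = 0      # rows of A
#   B.grad[:, crucial] = 0   # columns of B
```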

4. Inter-Group Orthogonality Constraint

To force the redundant groups to learn diverse and complementary features, GOLA applies an inter-group orthogonality loss across the $n$ groups. Measuring the overlap between two parameter groups $X$ and $Y$ by the entrywise $L_1$ norm of the cross-correlation $X^\top Y$, the regularizer is

$$L_\mathrm{orth} = \sum_{1 \leq i < j \leq n} \left( \|A_{u_i}^\top A_{u_j}\|_1 + \|B_{u_i}^\top B_{u_j}\|_1 \right)$$

where $A_{u_i}$, $B_{u_i}$ are the adapter parameters in group $G_i$. In practice, a single random pair $(i, j)$ is sampled per iteration to compute this penalty, which keeps the regularizer's cost low (Shao et al., 5 Dec 2025).
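A sketch of the sampled penalty; storing each group as a tensor of rank indices and reading $\|X^\top Y\|_1$ as the sum of absolute cross-inner-products between rank vectors is my interpretation of the formula:

```python
import random
import torch

def inter_group_orth_loss(A: torch.Tensor, B: torch.Tensor, groups) -> torch.Tensor:
    """L1 overlap between one randomly sampled pair of redundant groups,
    i.e. L_orth restricted to a single (i, j) term per iteration."""
    i, j = random.sample(range(len(groups)), 2)
    Ai, Aj = A[groups[i]], A[groups[j]]          # each row is one rank's A-vector
    Bi, Bj = B[:, groups[i]], B[:, groups[j]]    # each column is one rank's B-vector
    return (Ai @ Aj.t()).abs().sum() + (Bi.t() @ Bj).abs().sum()
```

Sampling a single pair per iteration makes the cost independent of $n$, at the price of a stochastic rather than exact penalty.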

5. Training Objective and Optimization

The overall tracking model objective combines:

  • Classification loss $L_\mathrm{cls}$ (binary cross-entropy on predicted heatmaps)
  • Regression loss $L_\mathrm{reg}$ (Generalized IoU)
  • Orthogonality regularizer $L_\mathrm{orth}$

The total loss is

$$L_\mathrm{total} = L_\mathrm{cls} + L_\mathrm{reg} + \lambda L_\mathrm{orth}$$

with $\lambda$ a small constant (set to $1.4 \times 10^{-3}$ in all experiments).
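A hedged sketch of one optimization step combining the three terms; the model interface and the `cls_loss`/`giou_loss` helpers are placeholders, not the paper's code:

```python
import torch

LAMBDA_ORTH = 1.4e-3   # λ reported in the paper

def training_step(model, batch, optimizer, groups, crucial):
    heatmap, boxes = model(batch["rgb"], batch["thermal"])   # placeholder interface
    loss_cls = cls_loss(heatmap, batch["heatmap"])           # binary cross-entropy (placeholder)
    loss_reg = giou_loss(boxes, batch["boxes"])              # Generalized IoU (placeholder)
    loss_orth = inter_group_orth_loss(model.A, model.B, groups)
    loss = loss_cls + loss_reg + LAMBDA_ORTH * loss_orth     # L_total

    optimizer.zero_grad()
    loss.backward()
    model.A.grad[crucial] = 0        # keep crucial ranks frozen
    model.B.grad[:, crucial] = 0
    optimizer.step()
    return loss.item()
```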

6. Empirical Performance

GOLA exhibits improved parameter efficiency and performance over baseline LoRA and other parameter-efficient fine-tuning techniques (Adapter, VPT, (IA)$^3$, AdaLoRA, DoRA). Two GOLA variants were implemented:

  • GOLA-B (DINOv2-B224 backbone): 99M parameters, 10% trainable, 85 GFLOPs, 125 fps on an RTX 3090.
  • GOLA-L (DINOv2-L224 backbone): 336M parameters, 8% trainable, 284 GFLOPs, 64 fps.

Empirical results on four benchmarks validate GOLA’s superiority:

| Dataset | Metric | Best Prior | GOLA-B | GOLA-L |
|---|---|---|---|---|
| GTOT (50 seq) | MPR | 93.2% | 92.8% | 95.3% |
| GTOT (50 seq) | MSR | 77.2% | 78.5% | 80.9% |
| RGBT210 (210k frames) | PR | 89.9% | 90.9% | 92.0% |
| RGBT210 (210k frames) | SR | 65.9% | 67.0% | 68.7% |
| RGBT234 (234k frames) | MPR | 92.1% | 92.2% | 92.8% |
| RGBT234 (234k frames) | MSR | 69.2% | 69.5% | 71.3% |
| LasHeR (735k frames) | PR | 76.9% | 77.5% | 78.1% |
| LasHeR (735k frames) | NPR | 74.5% | 73.9% | 74.5% |
| LasHeR (735k frames) | SR | 60.9% | 61.6% | 61.9% |

Compared to LoRA (13% of parameters trainable), GOLA-B reduces the trainable parameter count by 23% while improving LasHeR PR/SR from 76.3%/60.7% to 77.5%/61.6%. Across benchmarks, the clustered-orthogonality design consistently outperforms full fine-tuning and existing parameter-efficient fine-tuning methods (Shao et al., 5 Dec 2025).

7. Context and Implications

GOLA exemplifies a new direction in parameter-efficient model adaptation by leveraging explicit rank-space decomposition, targeted parameter freezing, and structured orthogonality to reduce redundancy and improve representation diversity in adapters. While the primary evaluation has centered on RGB-T tracking, a plausible implication is that the methodology of structured rank decomposition and orthogonal grouping could generalize to other settings where low-rank adaptation and parameter efficiency are critical, such as other vision modalities or large text models. GOLA’s design choices and demonstrated empirical advantages motivate further investigation into the principled structuring of low-rank adaptation spaces.
