TSV-Compress: Efficient Model Compression

Updated 28 April 2026

TSV-Compress is a model compression technique that leverages truncated SVD on layer-task matrices to reduce storage by up to 90% while preserving at least 99% of original task accuracy.
The method approximates the weight difference between fine-tuned and pretrained models with a rank-truncated SVD, keeping the fewest singular values needed to retain 99% of the spectral energy.
Empirical results on ViT-B-32 demonstrate that TSV-C achieves near-original accuracy across multiple tasks with significant reductions in per-task storage needs, facilitating efficient model merging.

TSV-Compress (TSV-C) is a model compression technique designed to reduce the storage and computational requirements of per-task fine-tuned neural network weights, while preserving accuracy. TSV-C leverages the observed low-rank structure of "layer-task matrices," compressing them to approximately 10% of their original size with minimal accuracy degradation, typically retaining at least 99% of original task performance. TSV-C is integral to pipelines such as model merging, where per-task parameter changes need to be efficiently represented and combined (Gargiulo et al., 2024).

1. Definition and Mathematical Formulation

For a pretrained model backbone and a given downstream task $\tau$, let $\theta_{pre}$ be the pretrained weights and $\theta_{ft}(\tau)$ the fine-tuned weights. At a specific layer $t$ (where the weights are naturally a matrix, such as those in fully-connected or convolutional layers), the layer-task matrix is defined as

$M_\tau^t \equiv \Delta_\tau^t = \theta_{ft}(\tau)^t - \theta_{pre}^t,$

where $M_\tau^t \in \mathbb{R}^{m \times n}$. For layers not natively represented as matrices, TSV-C defaults to ordinary Task Arithmetic without compression (Gargiulo et al., 2024).
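As a minimal sketch of this definition, the snippet below forms per-layer deltas and separates the 2-D ones (eligible for compression) from everything else. The dict-of-arrays checkpoint layout, the layer names, and the function name `layer_task_matrices` are assumptions made for illustration. Convolutional kernels, which the text also treats as matrices, would first need reshaping to 2-D; that step is omitted here.

```python
import numpy as np

def layer_task_matrices(theta_pre, theta_ft):
    """Return (matrices, others): per-layer deltas split by eligibility.

    theta_pre / theta_ft are assumed to be dicts of layer name -> np.ndarray.
    """
    matrices, others = {}, {}
    for name, w_pre in theta_pre.items():
        delta = theta_ft[name] - w_pre      # M = theta_ft - theta_pre for this layer
        if delta.ndim == 2:
            matrices[name] = delta          # 2-D: eligible for SVD compression
        else:
            others[name] = delta            # e.g. biases: plain task-arithmetic delta
    return matrices, others

# Toy checkpoints (hypothetical layer names and shapes)
rng = np.random.default_rng(0)
theta_pre = {"fc.weight": rng.normal(size=(6, 4)), "fc.bias": rng.normal(size=6)}
theta_ft = {k: v + 0.01 * rng.normal(size=v.shape) for k, v in theta_pre.items()}
matrices, others = layer_task_matrices(theta_pre, theta_ft)
print(sorted(matrices), sorted(others))
```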

2. Algorithmic Procedure and SVD-Based Compression

The core of TSV-Compress is the use of truncated singular value decomposition (SVD) to approximate each $M_\tau^t$:

$M_\tau^t = U_\tau^t \Sigma_\tau^t (V_\tau^t)^T,$

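with $U_\tau^t \in \mathbb{R}^{m \times k}$, $\Sigma_\tau^t \in \mathbb{R}^{k \times k}$, $V_\tau^t \in \mathbb{R}^{n \times k}$, and $k = \min(m, n)$. To produce a compressed approximation, TSV-C forms a rank-$r$ truncated version:

$\hat{M}_\tau^t = U_{\tau,r}^t \, \Sigma_{\tau,r}^t \, (V_{\tau,r}^t)^T,$

where the subscript $r$ denotes restriction to the leading $r$ singular components and $r$ is the minimal value such that

$\frac{\sum_{i=1}^{r} \sigma_i^2}{\sum_{i=1}^{k} \sigma_i^2} \ge 0.99,$

with $\sigma_1 \ge \dots \ge \sigma_k$ the diagonal entries of $\Sigma_\tau^t$, ensuring that at least 99% of the squared Frobenius norm (energy) is retained. Simultaneously, TSV-C enforces $r \le 0.1 \cdot \min(m, n)$, so at most 10% of singular components are preserved, which is what keeps per-layer storage on the order of 10% of the original parameters (Gargiulo et al., 2024).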
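The rank rule can be illustrated on a bare list of singular values. This is a sketch under one stated assumption: when the 99% energy rank exceeds the 10% cap, the cap takes precedence, which the text does not spell out. The function name `select_rank` and the toy spectrum are hypothetical.

```python
import numpy as np

def select_rank(singular_values, energy=0.99, max_fraction=0.10):
    """Return (rank, energy_rank, cap) for a nonzero spectrum sorted descending."""
    s = np.asarray(singular_values, dtype=float)
    cumulative = np.cumsum(s**2) / np.sum(s**2)         # retained energy at each rank
    # First rank whose cumulative energy reaches the threshold
    energy_rank = min(int(np.searchsorted(cumulative, energy)) + 1, s.size)
    cap = max(1, int(max_fraction * s.size))            # at most 10% of the components
    return min(energy_rank, cap), energy_rank, cap      # assumption: the cap wins

# Toy spectrum: 20 singular values, two of them dominant
spectrum = [10.0, 5.0, 1.0, 0.5] + [0.1] * 16
rank, energy_rank, cap = select_rank(spectrum)
print(rank, energy_rank, cap)
```

In this toy spectrum three components are needed to reach 99% energy, while the cap allows only two, so the two constraints can pull in different directions.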

3. Implementation Details

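The TSV-Compress procedure, following the formulation above, is:

- For each layer $t$ whose weights form a matrix, compute the layer-task matrix $M_\tau^t = \theta_{ft}(\tau)^t - \theta_{pre}^t$.
- Compute the SVD of $M_\tau^t$.
- Choose the smallest rank that retains 99% of the energy, subject to the 10% cap on singular components.
- Store the truncated singular factors in place of the dense delta.
- For all other layers, keep the delta uncompressed (ordinary Task Arithmetic).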
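For a single layer, the procedure might look like the following sketch. It uses NumPy's dense SVD rather than any implementation from the paper, the name `compress_delta` is hypothetical, and it assumes the 10% cap takes precedence over the 99% energy target when they conflict.

```python
import numpy as np

def compress_delta(delta, energy=0.99, max_fraction=0.10):
    """Truncated-SVD factors (U_r, s_r, Vt_r) of one 2-D layer-task matrix."""
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)    # delta = U diag(s) Vt
    cumulative = np.cumsum(s**2) / np.sum(s**2)             # retained energy per rank
    energy_rank = min(int(np.searchsorted(cumulative, energy)) + 1, s.size)
    cap = max(1, int(max_fraction * s.size))                # 10% of the components
    r = min(energy_rank, cap)                               # assumption: cap wins
    return U[:, :r], s[:r], Vt[:r, :]

# Synthetic rank-2 delta standing in for a fine-tuning update
rng = np.random.default_rng(0)
delta = rng.normal(size=(64, 2)) @ rng.normal(size=(2, 48))
U_r, s_r, Vt_r = compress_delta(delta)
approx = (U_r * s_r) @ Vt_r
rel_err = np.linalg.norm(delta - approx) / np.linalg.norm(delta)
print(U_r.shape, s_r.shape, Vt_r.shape, rel_err)
```

Because retained energy is measured on squared singular values, meeting the 99% target bounds the relative Frobenius error by 0.1.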

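At inference, the compressed $\hat{M}_\tau^t$ is reconstructed for each layer as

$\hat{M}_\tau^t = U_{\tau,r}^t \, \Sigma_{\tau,r}^t \, (V_{\tau,r}^t)^T,$

where the subscript $r$ denotes the leading $r$ stored singular components, and the modified layer weights are

$\hat{\theta}_\tau^t = \theta_{pre}^t + \alpha \, \hat{M}_\tau^t,$

with $\alpha = 1$ by default (Gargiulo et al., 2024).

Storage per layer is reduced from $mn$ parameters to $r(m + n + 1)$: $rm$ for the left factor, $rn$ for the right factor, and $r$ singular values. Given $r \le 0.1 \cdot \min(m, n)$, the storage cost relative to the original is at most $0.1 \, (m + n + 1) / \max(m, n)$ for each eligible layer, which is close to 10% when one dimension dominates and approaches 20% for square matrices.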
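The reconstruction step can be sketched as below. The function names are hypothetical, and the default `alpha=1.0` is an assumption chosen so that the result approximates the fine-tuned weights.

```python
import numpy as np

def reconstruct_delta(U_r, s_r, Vt_r):
    """Rebuild the low-rank delta U_r diag(s_r) Vt_r without forming diag(s_r)."""
    return (U_r * s_r) @ Vt_r

def apply_delta(w_pre, factors, alpha=1.0):
    """Layer weights = pretrained weights + alpha * reconstructed delta."""
    return w_pre + alpha * reconstruct_delta(*factors)

# Toy layer with an exactly rank-1 delta, so one component reproduces it
rng = np.random.default_rng(0)
w_pre = rng.normal(size=(8, 5))
delta = np.outer(rng.normal(size=8), rng.normal(size=5))
U, s, Vt = np.linalg.svd(delta, full_matrices=False)
factors = (U[:, :1], s[:1], Vt[:1, :])                  # keep one component
w_task = apply_delta(w_pre, factors)
print(np.allclose(w_task, w_pre + delta))
```

Scaling the columns of `U_r` by `s_r` avoids materializing the diagonal matrix, which matters little at this size but saves memory for wide layers.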

4. Empirical Compression Performance

Empirical results on the ViT-B-32 architecture are summarized as follows:

Method	8 tasks	14 tasks	20 tasks
Finetuned (100%)	92.83 (100)	90.88 (100)	91.37 (100)
TALL-Mask + TIES	93.13 (100.4)	90.92 (100)	91.11 (99.7)
TSV-C	92.62 (99.7)	90.29 (99.3)	90.64 (99.1)

Values in parentheses are normalized accuracy (percentage of the individually fine-tuned accuracy). In all three settings, TSV-C uses approximately 10% of per-task parameter storage while retaining at least 99% of the fine-tuned accuracy (Gargiulo et al., 2024).

5. Computational Complexity and Practical Considerations

SVD Complexity: The standard per-layer SVD operation has complexity $O(mn \min(m, n))$, but can be accelerated to roughly $O(mnr)$ with randomized SVD methods, which are suitable because the retained rank $r$ is small.
Storage: For a weight matrix of size $m \times n$, storing decomposed forms at rank $r$ requires

$r(m + n + 1)$

parameters per layer, maximizing at $0.1 \cdot \min(m, n) \, (m + n + 1)$ for $r = 0.1 \cdot \min(m, n)$.
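To make the storage count concrete, the sketch below evaluates it for two layer shapes. It assumes the factors are stored as an m-by-r matrix, an r-by-n matrix, and r singular values. The shapes 768x768 and 3072x768 are illustrative assumptions, not figures taken from the source.

```python
def factor_storage(m, n, r):
    """Parameters held by rank-r factors: m*r + r*n + r singular values."""
    return r * (m + n + 1)

def storage_ratio(m, n, r):
    """Compressed size as a fraction of the dense m*n matrix."""
    return factor_storage(m, n, r) / (m * n)

for m, n in [(768, 768), (3072, 768)]:          # illustrative shapes (assumed)
    r = min(m, n) // 10                         # 10% cap on singular components
    print(f"{m}x{n}: r={r}, ratio={storage_ratio(m, n, r):.3f}")
```

With these shapes the ratios come out near 0.198 and 0.124, so a 10% cap on singular components corresponds to 10% of the parameters only when one dimension is much larger than the other.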

Implementation Tips:
- Batch SVD computations across layers or tasks using GPU libraries, such as torch.linalg.svd.
- Employ randomized or truncated SVD to manage compute cost for large matrices.
- Store singular factors contiguously as three tensors per layer for efficient reconstruction.
- Preallocate and reuse workspace buffers to minimize GPU memory fragmentation.
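- For multi-task scenarios with a shared backbone, load the pretrained weights once and reuse them when forming every task's deltas; the SVD is taken of each delta rather than of the backbone, so it is computed per task (Gargiulo et al., 2024).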
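The randomized SVD mentioned in the tips can be sketched with a standard range-finder construction (random projection, a few power iterations, then a small exact SVD). This is a generic NumPy sketch, not the paper's implementation; the function name and the `oversample`, `n_iter`, and `seed` parameters are assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=8, n_iter=2, seed=0):
    """Approximate the top-`rank` SVD of A via a random range finder."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, min(m, n))               # sketch width
    Q = np.linalg.qr(A @ rng.normal(size=(n, k)))[0]    # basis for the range of A
    for _ in range(n_iter):                             # power iterations sharpen the basis
        Q = np.linalg.qr(A.T @ Q)[0]
        Q = np.linalg.qr(A @ Q)[0]
    B = Q.T @ A                                         # small k x n projected problem
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                                          # lift back to m dimensions
    return U[:, :rank], s[:rank], Vt[:rank, :]

# Synthetic rank-5 matrix: the sketch should match the exact SVD
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 120))
U, s, Vt = randomized_svd(A, rank=5)
exact = np.linalg.svd(A, compute_uv=False)[:5]
print(np.allclose(s, exact), np.allclose((U * s) @ Vt, A))
```

The dominant cost is the handful of products between `A` and a thin matrix of width `k`, which is where the saving over a full decomposition comes from when the retained rank is small.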

6. Integration and Use in Model Merging

TSV-Compress is designed for seamless integration into model merging pipelines, where low-rank, compressed task-specific deltas can be reconstructed on demand and combined with the pretrained backbone. This is particularly valuable in settings where many tasks share a backbone, as only the compressed singular factors (Task Singular Vectors) for each delta need to be stored and transmitted. The existence of low-rank layer-task matrices also supports improved model merging methods, such as TSV-Merge, which leverage singular vector interactions to reduce task interference (Gargiulo et al., 2024).
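A minimal sketch of this storage pattern for one layer follows. `CompressedTaskStore` is a hypothetical class, the rank is passed in by hand, and `merged_weights` uses a plain task-arithmetic sum as a stand-in. It does not implement TSV-Merge.

```python
import numpy as np

class CompressedTaskStore:
    """Keeps one shared pretrained matrix plus low-rank factors per task."""

    def __init__(self, w_pre):
        self.w_pre = w_pre
        self.factors = {}

    def add_task(self, name, w_ft, rank):
        # Store only the leading singular factors of the task delta
        U, s, Vt = np.linalg.svd(w_ft - self.w_pre, full_matrices=False)
        self.factors[name] = (U[:, :rank], s[:rank], Vt[:rank, :])

    def delta(self, name):
        U, s, Vt = self.factors[name]
        return (U * s) @ Vt                              # reconstructed on demand

    def task_weights(self, name, alpha=1.0):
        return self.w_pre + alpha * self.delta(name)

    def merged_weights(self, names, alpha=1.0):
        # Plain task-arithmetic sum of deltas (assumption; not TSV-Merge)
        return self.w_pre + alpha * sum(self.delta(n) for n in names)

# Two toy tasks with rank-1 deltas on a shared backbone layer
rng = np.random.default_rng(0)
w_pre = rng.normal(size=(6, 4))
d_a = np.outer(rng.normal(size=6), rng.normal(size=4))
d_b = np.outer(rng.normal(size=6), rng.normal(size=4))
store = CompressedTaskStore(w_pre)
store.add_task("a", w_pre + d_a, rank=1)
store.add_task("b", w_pre + d_b, rank=1)
print(np.allclose(store.task_weights("a"), w_pre + d_a))
print(np.allclose(store.merged_weights(["a", "b"]), w_pre + d_a + d_b))
```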

7. Comparative Context and Applications

TSV-Compress offers a compact alternative to existing compression and masking schemes such as TALL-Mask + TIES. In the reported ViT-B-32 results its accuracy stays within about 0.7 points of TALL-Mask + TIES, and the method comes with explicit targets for energy retention and storage footprint. It applies only to layers naturally represented as matrices (e.g., fully-connected and convolutional layers), with an ordinary Task Arithmetic fallback otherwise. The reported results support its use for scalable multi-task adaptation, federated settings, and efficient model deployment (Gargiulo et al., 2024).

References

Gargiulo et al. (2024). Task Singular Vectors: Reducing Task Interference in Model Merging.
