Learned Task Vectors (LTVs)
- Learned Task Vectors (LTVs) are low-dimensional representations computed as the parameter difference between fine-tuned and baseline models, enabling task-specific adaptation.
- LTVs support arithmetic operations like addition and scaling, which facilitate modular model editing and the composition of multi-task skills.
- LTV methods enable efficient few-shot, federated, and in-context learning, though they raise challenges such as norm disparities and backdoor vulnerabilities.
Learned Task Vectors (LTVs) are low-dimensional representations that encode the specific adaptation or behavior imparted to a machine learning model—typically a deep neural network—through fine-tuning or exposure to task-specific examples. As implemented across diverse domains and methodologies, LTVs have become a central abstraction for efficient model adaptation, modular composition, and interpretable control of task-specific computation in both supervised and unsupervised settings.
1. Definitions and Fundamental Constructions
LTVs are typically constructed as the difference in parameter space between the weights of a fine-tuned model for a specific task and the original pre-trained model. Mathematically, if $\theta_{\text{pre}}$ denotes the pre-trained weights and $\theta_{\text{ft}}^{t}$ the weights after fine-tuning on task $t$, then the canonical task vector is:

$$\tau_t = \theta_{\text{ft}}^{t} - \theta_{\text{pre}}.$$
This “delta” vector encodes the parameter-space direction that confers the task-specific capability to the base model (Ilharco et al., 2022, Zhang et al., 3 Jul 2024, Kawamoto et al., 17 May 2025). In the context of in-context learning (ICL), LTVs may alternatively denote a vector extracted from hidden states, or an optimized embedding (e.g., a function of attention head activations or a directly trained parameter vector) that summarizes information from demonstration prompts (Hendel et al., 2023, Saglam et al., 8 Feb 2025, Yang et al., 29 Sep 2025).
A key property is compositionality: LTVs support arithmetic operations (addition, subtraction, and scaling), enabling efficient model editing and modular adaptation. For example, adding a task vector to the base weights configures the model for the target task, while combining multiple task vectors yields a multi-task model:

$$\theta_{\text{multi}} = \theta_{\text{pre}} + \sum_{t} \lambda_t \, \tau_t,$$

where $\lambda_t$ scales the influence of each task (Ilharco et al., 2022, Li et al., 15 Apr 2025).
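As a concrete illustration, the following minimal PyTorch-style sketch implements this arithmetic over model state dictionaries. It assumes `pretrained` and the fine-tuned models share an architecture; the helper names (`task_vector`, `apply_task_vectors`) are illustrative rather than drawn from any cited codebase.

```python
import copy

def task_vector(pretrained, finetuned):
    """Compute tau_t = theta_ft - theta_pre, one tensor per parameter."""
    pre, ft = pretrained.state_dict(), finetuned.state_dict()
    return {name: ft[name] - pre[name] for name in pre}

def apply_task_vectors(pretrained, vectors, coeffs):
    """Build theta = theta_pre + sum_t lambda_t * tau_t as a new model."""
    state = {name: p.clone() for name, p in pretrained.state_dict().items()}
    for tau, lam in zip(vectors, coeffs):
        for name in state:
            state[name] += lam * tau[name]
    edited = copy.deepcopy(pretrained)
    edited.load_state_dict(state)
    return edited
```

A single positive coefficient recovers task addition; a negative coefficient implements the negation used for unlearning, discussed in the next section.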
2. Application Modalities and Practical Use Cases
LTVs underpin a suite of efficient adaptation and editing strategies:
- Model editing via arithmetic: Addition of an LTV enables the base model to perform the target task; negation or subtraction enacts unlearning or mitigation (e.g., bias removal or concept erasure) (Ilharco et al., 2022, Pham et al., 4 Apr 2024, Naganuma et al., 30 May 2025). A sketch of negation and analogy arithmetic appears at the end of this section.
- Multi-task composition: Merging multiple LTVs forms a model competent across several tasks. The blockwise scaling in aTLAS enables layer- or block-specific weights for each task, reducing interference (Zhang et al., 3 Jul 2024).
- Knowledge transfer and analogical reasoning: LTVs enable analogy-based transfer, where relationships among tasks (e.g., “A is to B as C is to D”) are encoded as vector differences and summed for zero-shot adaptation (Ilharco et al., 2022).
- Few-shot and federated learning: In federated settings, each client computes an LTV and a central server aggregates (potentially with modulators to maintain task specificity and communication efficiency), enabling scalable multi-task learning (Tsouvalas et al., 10 Feb 2025).
- Robust concept erasure: LTVs facilitate unconditional suppression of unsafe or unwanted concepts in generative models by subtracting a scaled LTV; tuning edit strength via adversarial prompt diversity (Diverse Inversion) maintains robust erasure without excessive utility loss (Pham et al., 4 Apr 2024).
LTVs are also employed for parameter-efficient fine-tuning, serving as “adapters” which can be merged or swapped without full retraining (Zhang et al., 3 Jul 2024).
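Continuing the sketch from Section 1, negation and analogy arithmetic reduce to coefficient and sum choices over the same helpers. Here `tau_unsafe`, `tau_a`, `tau_b`, and `tau_c` are assumed to be precomputed task vectors, and the coefficient -0.8 is an arbitrary illustrative edit strength (in practice it is tuned, e.g., via Diverse Inversion for concept erasure).

```python
# Unlearning / concept erasure: subtract a scaled task vector.
erased = apply_task_vectors(pretrained, [tau_unsafe], coeffs=[-0.8])

# Analogy-based transfer ("A is to B as C is to D"):
# tau_d ~ tau_c + (tau_b - tau_a), applied to the base model zero-shot.
tau_d = {name: tau_c[name] + tau_b[name] - tau_a[name] for name in tau_c}
adapted = apply_task_vectors(pretrained, [tau_d], coeffs=[1.0])
```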
3. Theoretical Underpinnings and Geometric Properties
Rigorous theoretical analyses demonstrate that LTVs approximate the negative gradient of the task loss when computed after a single epoch of gradient descent (in the single-step limit, $\tau_t \approx -\eta\,\nabla_{\theta}\mathcal{L}_t(\theta_{\text{pre}})$ for learning rate $\eta$), and that model merging via LTV addition closely mimics multitask joint training, with explicit second-order error bounds in multi-epoch regimes (Zhou et al., 22 Aug 2025). This establishes a gradient-based rationale for the effectiveness and efficiency of task arithmetic.
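To see why, consider the simplest case of a single full-batch gradient step with learning rate $\eta$ (a simplification of the multi-epoch analysis):

$$\theta_{\text{ft}}^{t} = \theta_{\text{pre}} - \eta\,\nabla_{\theta}\mathcal{L}_t(\theta_{\text{pre}}) \;\Longrightarrow\; \tau_t = \theta_{\text{ft}}^{t} - \theta_{\text{pre}} = -\eta\,\nabla_{\theta}\mathcal{L}_t(\theta_{\text{pre}}).$$

Adding $\sum_t \lambda_t \tau_t$ to $\theta_{\text{pre}}$ is then, to first order, a single descent step along the combined multi-task gradient, which is why merging approximates joint training.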
Further, task vectors have been shown to enable provable multi-task learning and unlearning guarantees in nonlinear Transformer settings, with explicit conditions for minimax error and domain generalization given task correlation structures (Li et al., 15 Apr 2025). The compositionality and low intrinsic dimensionality of LTVs are leveraged by blockwise scaling (e.g., aTLAS) to mitigate interference and enhance disentanglement during multi-task transfer (Zhang et al., 3 Jul 2024).
In the latent geometric space of ICL, LTVs are found to emerge as hidden state vectors that encode the compressive summary (often as a linear combination) of all in-context demonstrations, propagating through attention OV circuits and subject to rotation and scaling transformations across Transformer layers (Yang et al., 29 Sep 2025, Hendel et al., 2023, Dong et al., 10 Jun 2025, Yang et al., 24 May 2025). Their performance in in-context model steering is governed by their ability to induce high separability and alignment in the query hidden state geometry (Yang et al., 24 May 2025).
4. Mechanisms in In-Context Learning and Task Vector Prompting
In LLMs, ICL is well described by a two-stage mechanism: demonstrations are “compressed” into an LTV, which is then injected or propagated to modulate predictions on the query input (Hendel et al., 2023, Yang et al., 29 Sep 2025). This LTV may be extracted from (or learned via optimization within) intermediate hidden states, attention heads, or directly as a learnable vector. Prevalent methodologies include the following (a minimal extraction-and-injection sketch follows the list):
- Extraction from pre-computed hidden states: Using the activation at a specific token and layer as the LTV representing the task (Hendel et al., 2023, Yang et al., 16 Jan 2025).
- Optimized, directly-trained LTVs: Fitting an explicit LTV by minimizing downstream loss upon injection; these “learned” LTVs outperform extraction-based approaches and remain robust to injection site and prompt variations (Yang et al., 29 Sep 2025).
- Attention head combination: Computing the LTV as a weighted sum over head outputs, with head-level weights optimized to maximize alignment with in-context-learned representations (Saglam et al., 8 Feb 2025).
- Task vector prompting loss (TVP-loss): Explicit auxiliary objectives force the model to encode all task information at a controlled hidden state location, enhancing zero-shot robustness (Yang et al., 16 Jan 2025).
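The extraction and injection steps can be made concrete with forward hooks. The sketch below assumes a HuggingFace-style causal LM whose decoder blocks are reachable as `model.transformer.h[layer]`; the module path, layer index, and token position are architecture-dependent placeholders, and the injection shown is plain activation patching rather than any one paper's exact procedure.

```python
import torch

def extract_task_vector(model, demo_ids, layer, pos=-1):
    """Run the demonstration prompt once and capture the hidden state
    at (layer, pos) as an extraction-based LTV."""
    captured = {}

    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        captured["v"] = hidden[:, pos, :].detach().clone()

    handle = model.transformer.h[layer].register_forward_hook(hook)
    with torch.no_grad():
        model(demo_ids)
    handle.remove()
    return captured["v"]

def inject_task_vector(model, query_ids, v, layer, pos=-1):
    """Run a zero-shot query while overwriting the hidden state at
    (layer, pos) with the task vector (activation patching)."""

    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden[:, pos, :] = v
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    handle = model.transformer.h[layer].register_forward_hook(hook)
    with torch.no_grad():
        logits = model(query_ids).logits
    handle.remove()
    return logits
```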
Several studies establish that the representational capacity of a single LTV is limited: it constitutes a rank-one approximation of the underlying task mapping, and fails for high-rank relations or strongly compositional tasks. Injecting a collection of subtask-specific vectors (multi-vector strategies) yields notable improvements on complex tasks (Tikhonov et al., 29 May 2025, Dong et al., 10 Jun 2025).
5. Limitations, Vulnerabilities, and Fairness Considerations
While LTV-based methods offer efficiency and modularity, they are susceptible to several limitations:
- Norm disparities and low-confidence source models: When LTVs have widely differing norms (from disparate fine-tuning schedules or objectives), model merging can fail: the merge is dominated by the highest-norm LTV or degraded by low-confidence source models. Pre-conditioning with norm alignment and knowledge distillation (DisTaC) addresses this sensitivity (Yoshida et al., 2 Aug 2025); a norm-alignment sketch follows this list.
- Backdoor vulnerabilities: Malicious LTVs can be crafted to inject backdoor behaviors that propagate via model merging, evading common detection tools and persisting across addition, negation, and analogical operations (Hsu et al., 4 Jan 2025).
- Representational bottlenecks: Single-task-vector injection is insufficient to represent many composite or functionally high-rank tasks, motivating distributed or multi-vector approaches (Tikhonov et al., 29 May 2025, Dong et al., 10 Jun 2025).
- Fairness trade-offs: Direct arithmetic manipulation of task vectors affects group fairness metrics (Demographic Parity, Equalized Odds), and tuning merge coefficients is necessary to balance subgroup equity against accuracy (Naganuma et al., 30 May 2025).
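As a simplified illustration of the norm-alignment idea (only the rescaling step; the distillation component of DisTaC is omitted), the sketch below rescales each task vector from the earlier `task_vector` helper to the set's mean global L2 norm before merging.

```python
import torch

def align_norms(vectors):
    """Rescale every task vector to the mean global L2 norm of the set,
    so no single high-norm vector dominates the merged model."""
    def global_norm(tau):
        return torch.sqrt(sum((p.float() ** 2).sum() for p in tau.values()))

    norms = [global_norm(tau) for tau in vectors]
    target = sum(norms) / len(norms)
    return [{name: p * (target / n) for name, p in tau.items()}
            for tau, n in zip(vectors, norms)]
```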
These vulnerabilities and trade-offs highlight the need for rigorous auditing and tailored intervention when using LTV-based adaptation in safety- or fairness-critical contexts.
6. Extensions, Transferability, and Scalability
LTVs are extendable across modalities and training regimes:
- Cross-model transfer: When task vectors are transferred between differently pre-trained models, alignment via orthogonal similarity transformations (learned from few labeled samples) preserves norm and rank, enabling modular editing and reuse even with heterogeneous initializations (Kawamoto et al., 17 May 2025); a Procrustes-style alignment sketch follows this list.
- Federated and many-task learning: Unified aggregation of client- or task-specific LTVs, coupled with lightweight modulators, enables communication-efficient, scalable deployment to clients—particularly in settings with broad task heterogeneity (Tsouvalas et al., 10 Feb 2025).
- Parameter-efficient fine-tuning: LTVs serve as compact adapters (in PEFT), and blockwise scaling (e.g., aTLAS) allows adaptation with only a handful of coefficients, conferring rapid adaptation and drastically reduced memory usage (Zhang et al., 3 Jul 2024).
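To make the cross-model transfer step concrete, the sketch below fits an orthogonal map from a few paired feature matrices using the classical orthogonal Procrustes solution, then applies it as a similarity transform to a square weight delta. This is a stand-in for the learned transformation in Kawamoto et al. (17 May 2025), not a reproduction of it; note that a similarity transform $R\,\Delta W\,R^{\top}$ with orthogonal $R$ preserves both the norm and the rank of the delta, matching the property highlighted above.

```python
from scipy.linalg import orthogonal_procrustes

def fit_rotation(feats_src, feats_tgt):
    """Fit an orthogonal R minimising ||feats_src @ R - feats_tgt||_F
    from paired (n_samples x dim) feature matrices of the two backbones."""
    R, _scale = orthogonal_procrustes(feats_src, feats_tgt)
    return R

def transfer_delta(delta_w, R):
    """Map a square weight delta into the target model's basis; the
    orthogonal similarity transform preserves its norm and rank."""
    return R @ delta_w @ R.T
```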
Empirical work demonstrates LTV advantages in vision, text, and multimodal settings, and their integration with LoRA-based or low-rank adaptation approaches for further efficiency (Zhang et al., 3 Jul 2024, Kawamoto et al., 17 May 2025).
7. Outlook, Open Questions, and Impact
LTVs have redefined model adaptation by providing a principled, arithmetic-based abstraction for modular update, model editing, and transfer. They exhibit high performance in multi-task, few-shot, and federated learning settings with minimal retraining.
Ongoing research investigates:
- Methods for robustly disentangling and scaling task components during merging (to mitigate negative transfer and norm imbalance) (Zhang et al., 3 Jul 2024, Yoshida et al., 2 Aug 2025).
- Secure and fair use, given the susceptibility to backdoor attacks and fairness disparities (Hsu et al., 4 Jan 2025, Naganuma et al., 30 May 2025).
- Mechanistic understanding of LTV propagation, OV circuit mediation, and the geometry of their effect within deep networks (Yang et al., 29 Sep 2025, Yang et al., 24 May 2025).
- Enhanced strategies for distributed representations, e.g., multi-vector injection and combining head-specific vectors, as required for high-rank or compositional tasks (Tikhonov et al., 29 May 2025, Dong et al., 10 Jun 2025).
- Theoretical extension to nonlinear and dynamic settings, with proofs for generalization guarantees and convergence for complex multi-task and continual learning regimes (Li et al., 15 Apr 2025, Bu et al., 13 Aug 2025).
LTVs stand as a unifying concept across model editing, transfer, and in-context computation, offering both practical efficiency and interpretability, while motivating deeper investigation into the mechanisms, limits, and safe deployment of modular task representations in modern machine learning.