Reasoning Vectors in AI

Updated 5 September 2025
  • Reasoning vectors are explicit vector representations that encapsulate reasoning skills within parameter, activation, or semantic spaces.
  • They are extracted by contrasting model states—using paired models or activation differences—to isolate the contribution of reasoning improvements.
  • These vectors enable modular, controllable enhancement of reasoning capabilities, supporting transfer across domains, languages, and differing neural architectures.

Reasoning vectors constitute a family of approaches for representing, transferring, and modulating reasoning capabilities in artificial intelligence systems—especially in LLMs and structured neural architectures—by encoding reasoning “skills” or features as explicit elements (or directions) in parameter, activation, or semantic vector spaces. These approaches enable reasoning to be extracted, empirically analyzed, and transferred between models, supporting efficient capability enhancement, modular composition, interpretability, and fine control over reasoning behavior.

1. Definitions and Theoretical Foundations

The overarching concept of a reasoning vector is to isolate and represent the effect of reasoning-specific improvements or behaviors using well-defined vectorial objects in the space where models operate: this may include model weight space, activation (residual stream) space, semantic embedding spaces, or symbolic high-dimensional spaces.

Distinct Notions of Reasoning Vectors

| Formulation context | Definition / construction | Primary use |
| --- | --- | --- |
| Task arithmetic in parameter space | $v_{\text{reason}} = \theta_{\text{GRPO}} - \theta_{\text{SFT}}$ | Post-training knowledge transfer |
| Steering in activation space | $h' = h + \lambda \cdot r$ (directional/steering vector $r$) | Modulating reasoning behavior |
| Semantic and symbolic vector spaces | $-e_1 + e_2$ (deductive relation), tensor/algebraic binding in VSA | Representing logical relations |
| Control vectors in LLM layers | $c_\ell$ (average or contrast-derived residual vectors) | Enhancing reasoning at inference |

In parameter-space arithmetic, as in (Zbeeb et al., 1 Sep 2025), reasoning vectors are computed by subtracting the parameters of a supervised model from those of a reinforcement-learned model on the same base, thereby isolating the delta corresponding to reasoning improvements. In activation or residual space, reasoning is encoded as a direction or vector (steering vector) whose magnitude and application can modulate the length or structure of model-generated reasoning (Sheng et al., 10 Jun 2025, Liu et al., 18 Jun 2025, Venhoff et al., 22 Jun 2025). Other interpretations include concept vectors or control vectors in internal states, as used for fine-grained behavioral interventions (Opiełka et al., 5 Mar 2025, Højer et al., 28 Apr 2025).

2. Construction and Extraction of Reasoning Vectors

Model Parameter-Based Extraction

The extraction process for reasoning vectors in parameter space typically requires two models, initialized identically and trained on the same dataset, with the only distinction being the application of a reasoning-focused optimization phase (such as RL or advanced chain-of-thought fine-tuning) to one:

  • Given $\theta_{\text{pre}}$ (pretrained or instruction-tuned weights) and $\theta_{\text{post}}$ (post-trained with reasoning optimization), the reasoning vector is $v_{\text{reason}} = \theta_{\text{post}} - \theta_{\text{pre}}$ (Oguchi et al., 4 Aug 2025, Zbeeb et al., 1 Sep 2025).
  • This vector is then added to the parameters of a compatible target model: $\theta_{t,\text{new}} = \theta_t + \alpha\, v_{\text{reason}}$.

When the vector is masked or selectively applied (e.g., to particular layers or modules), a binary or continuous mask $m$ is used: $\theta_{t,\text{new}} = \theta_t + \alpha\,(m \odot v_{\text{reason}})$.
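Concretely, the parameter-space recipe reduces to two tensor operations. The following is a minimal sketch assuming PyTorch state dicts from architecturally compatible checkpoints; the file names, scaling constant, and mask are illustrative rather than the exact setup of the cited papers.

```python
# Sketch: extract a reasoning vector by parameter subtraction and apply it to a
# compatible target model. File names, alpha, and the mask are illustrative.
import torch

def extract_reasoning_vector(theta_post: dict, theta_pre: dict) -> dict:
    """v_reason = theta_post - theta_pre, computed per parameter tensor."""
    return {name: theta_post[name] - theta_pre[name] for name in theta_pre}

def apply_reasoning_vector(theta_target: dict, v_reason: dict,
                           alpha: float = 1.0, mask: dict | None = None) -> dict:
    """theta_new = theta_target + alpha * (m * v_reason); without a mask, m = 1."""
    theta_new = {}
    for name, weight in theta_target.items():
        delta = v_reason.get(name, torch.zeros_like(weight))
        if mask is not None and name in mask:
            delta = mask[name] * delta      # selective, layer- or module-wise transfer
        theta_new[name] = weight + alpha * delta
    return theta_new

# Hypothetical usage with three checkpoints sharing one base architecture:
# theta_pre  = torch.load("base_sft.pt")      # supervised / instruction-tuned weights
# theta_post = torch.load("base_grpo.pt")     # same base after reasoning-focused RL
# theta_tgt  = torch.load("target.pt")        # compatible target model
# v = extract_reasoning_vector(theta_post, theta_pre)
# theta_new = apply_reasoning_vector(theta_tgt, v, alpha=0.8)
```

Passing a negative $\alpha$ to apply_reasoning_vector corresponds to the subtraction check discussed in Section 3.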

Activation Space and Steering

In the residual stream or latent space, reasoning vectors are captured as steering directions (Sheng et al., 10 Jun 2025, Liu et al., 18 Jun 2025, Venhoff et al., 22 Jun 2025):

  • The vector is often extracted by averaging the difference in activations between generations produced with and without explicit reasoning (e.g., chain-of-thought prompts versus direct-response prompts), yielding $h_{\text{steer}}$.
  • For targeted behavioral control, difference-of-means estimates, leading PCA components of contrastive differences, or other contrastive activation-based extraction methods are used (Højer et al., 28 Apr 2025, Venhoff et al., 22 Jun 2025).
  • The steering vector is then applied at inference by addition and scaling: $h' = h + \lambda\, h_{\text{steer}}$, with the scale $\lambda$ modulating behavior intensity ("fractional reasoning", Liu et al., 18 Jun 2025); a minimal sketch follows this list.
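A minimal sketch of difference-of-means extraction and hook-based injection, assuming a decoder-only transformer in PyTorch; the hook mechanics, layer index, and $\lambda$ values are illustrative assumptions rather than a prescribed recipe.

```python
# Sketch: build a steering direction by difference of means and inject it into a
# transformer block's output at inference. Layer choice and lambda are illustrative.
import torch

def difference_of_means(acts_reasoning: torch.Tensor,
                        acts_direct: torch.Tensor) -> torch.Tensor:
    """Each input: (n_examples, d_model) activations collected at one layer."""
    return acts_reasoning.mean(dim=0) - acts_direct.mean(dim=0)

def add_steering_hook(block: torch.nn.Module, h_steer: torch.Tensor, lam: float):
    """Register a forward hook implementing h' = h + lambda * h_steer."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + lam * h_steer.to(hidden.dtype).to(hidden.device)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return block.register_forward_hook(hook)

# Hypothetical usage: collect layer-L activations for chain-of-thought vs. direct
# prompts, build h_steer, then
#   handle = add_steering_hook(model.model.layers[L], h_steer, lam=0.5)
# and vary lam (including per instance) to realize fractional reasoning;
# handle.remove() restores the unsteered model.
```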

Semantic and Symbolic Reasoning Vectors

Semantic reasoning is performed by embedding logical implications or relations within vector arithmetic, as in $-e_1 + e_2$ for a fact "$e_1$ implies $e_2$" (Summers-Stay, 2017), or using tensor product and outer product representations (Lee et al., 2015). In vector symbolic architectures (VSA) (Mejri et al., 13 Nov 2024, Sun et al., 21 Jan 2025), high-dimensional vectors and specialized algebraic operations encode compositional relational rules and multidimensional abstraction.
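The implication-as-difference convention can be illustrated with a toy example; the embeddings and concept names below are random stand-ins used only to show how implication vectors compose.

```python
# Toy sketch: deduction as vector arithmetic, with "-e1 + e2" standing for
# "e1 implies e2". Random embeddings and concept names are stand-ins.
import numpy as np

rng = np.random.default_rng(0)
d = 256
e = {name: rng.standard_normal(d) for name in ("raining", "wet_ground", "slippery")}

rain_implies_wet = -e["raining"] + e["wet_ground"]          # fact 1
wet_implies_slippery = -e["wet_ground"] + e["slippery"]     # fact 2

# Chaining deductions is just addition: (-e1 + e2) + (-e2 + e3) = -e1 + e3
chained = rain_implies_wet + wet_implies_slippery
direct = -e["raining"] + e["slippery"]
print(np.allclose(chained, direct))  # True: the intermediate concept cancels out
```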

3. Functional Role and Empirical Observations

Transfer and Enhancement of Reasoning Capabilities

  • Adding a reasoning vector to a base model has been shown to consistently enhance reasoning performance on diverse benchmarks (e.g., GSM8K, HumanEval, SciQ, BigBenchHard) (Zbeeb et al., 1 Sep 2025), with typical gains in the 1.5B parameter regime ranging from +1.7% to +12.3% depending on task.
  • Performance degradation observed when the vector is subtracted ($\theta_{\text{base}} - v_{\text{reason}}$) confirms that the vector encodes critical reasoning capability.

Modulation and Fractional Control

  • Reasoning vectors as activation steering directions enable continuous control of reasoning “depth” (number and richness of intermediate tokens) (Sheng et al., 10 Jun 2025, Liu et al., 18 Jun 2025). Linear probes reliably predict reasoning length from initial activations, and steering modifies it causally.
  • Practical applications include overthinking detection, dynamic adjustment for input complexity, and efficient reasoning by adapting reasoning trace length per instance.

Robustness, Generalization, and Modularity

  • Vectors extracted via arithmetic are robust to adversarial input perturbations (digit noise, sentence shuffle, adversarial tasks), with maintained or even expanded performance advantages (Zbeeb et al., 1 Sep 2025).
  • Reasoning vectors can be transferred cross-lingually (from English to Japanese LLMs) by direct weight transfer without costly retraining (Oguchi et al., 4 Aug 2025).
  • The modularity principle is supported by the vector addition/subtraction paradigm: in principle, multiple skill vectors can be composed for composite capability enhancement (Zbeeb et al., 1 Sep 2025).

4. Mathematical and Algorithmic Formulation

Core Equations

  • Parameter arithmetic: $v = \theta_{\text{post}} - \theta_{\text{pre}}$, $\qquad \theta_{\text{target,new}} = \theta_{\text{target}} + \alpha v$.
  • Activation steering: $h' = h + \lambda h_{\text{steer}}$.
  • Masked transfer: $\theta_{\text{target,new}} = \theta_{\text{target}} + \alpha (m \odot v)$.
  • Linear probe for reasoning length: $\hat{y} = H^{(l)} W^{(l)} + b^{(l)}$ (see the probe sketch after this list).
  • Reasoning strength planning intervention: $h'^{(l)} = h^{(l)} + \lambda r^{(l)}$.
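The linear-probe equation can be illustrated with a short sketch; the data files, layer choice, and use of ridge regression below are assumptions for illustration, not the cited papers' exact protocol.

```python
# Sketch: fit a linear probe predicting reasoning length from layer-l hidden
# states (y_hat = H W + b). Data files, layer choice, and the ridge penalty
# are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

H = np.load("layer_l_activations.npy")   # hypothetical (n_prompts, d_model) activations
y = np.load("reasoning_lengths.npy")     # hypothetical reasoning-token counts per prompt

probe = Ridge(alpha=1.0).fit(H, y)
print("probe R^2:", probe.score(H, y))

# The normalized probe weights give one candidate direction r^(l) for the
# intervention h'^(l) = h^(l) + lambda * r^(l) described above.
r = probe.coef_ / np.linalg.norm(probe.coef_)
```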

Extraction Algorithms

  • Difference-of-means on paired activations, PCA on contrastive differences, and algebraic difference at the parameter level are the main extraction routines (Højer et al., 28 Apr 2025, Venhoff et al., 22 Jun 2025, Zbeeb et al., 1 Sep 2025).
  • In symbolic settings, vectorized logical operations (such as $-e_1 + e_2$, tensor products, or VSA binding/bundling) encode reasoning steps as compositional vectors. Deductive and analogical chains are composed by vector addition, e.g., $(-e_1 + e_2) + (-e_2 + e_3) = -e_1 + e_3$ (Summers-Stay, 2017); a toy binding/bundling sketch follows this list.
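For the symbolic side, the following is a toy binding/bundling example in a MAP-style vector symbolic architecture; this particular algebra is chosen purely for illustration, since the cited works use several different VSA variants.

```python
# Toy sketch of binding and bundling in a MAP-style vector symbolic architecture
# with bipolar hypervectors. The specific VSA variant is an illustrative choice.
import numpy as np

rng = np.random.default_rng(1)
d = 10_000

def hv() -> np.ndarray:
    """Random bipolar hypervector."""
    return rng.choice([-1, 1], size=d)

color, shape, red, square = hv(), hv(), hv(), hv()

# Binding (element-wise product) attaches a filler to a role; bundling
# (sign of the sum) superposes several bound pairs into one record.
record = np.sign(color * red + shape * square)

# Unbinding with the role vector approximately recovers the filler.
decoded = record * color
print("similarity to 'red':", decoded @ red / d)   # ~0.5 here, vs. ~0 for unrelated vectors
```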

5. Practical Applications and Impact

Reasoning Enhancement without Expensive Training

  • Reasoning vectors allow performance gains without further supervised or RL optimization, as demonstrated across multiple models and benchmarks. The procedure requires only two tensor operations for extraction and application, making it computationally inexpensive and practical for model enhancement/recycling (Zbeeb et al., 1 Sep 2025, Oguchi et al., 4 Aug 2025).

Interpretability and Behavioral Control

  • Explicit steering enables fine-grained modulation of reasoning processes, targeting behaviors such as uncertainty estimation, example generation, and backtracking (Venhoff et al., 22 Jun 2025). This supports controlled generation and interpretability—a model’s tendency to express a desired aspect of reasoning can be increased or suppressed by manipulating a linear direction in activation space.
  • Analytical techniques such as the logit lens and attribution patching reveal interpretable structure in how these vectors "boost" groups of tokens (e.g., logical connectors, correctness terms) and modulate the reasoning trace (Sinii et al., 24 May 2025, Venhoff et al., 22 Jun 2025); a minimal logit-lens sketch follows this list.
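As an illustration of the logit-lens view, one can project a steering vector through a model's final norm and unembedding to list the tokens it pushes upward; the model name, module paths, and saved vector below are hypothetical assumptions, not a reproduction of the cited analyses.

```python
# Sketch: a logit-lens-style reading of a steering vector, projecting it through
# the final norm and unembedding to see which tokens it boosts. The model name
# and the saved vector are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-1.5B-Instruct"             # hypothetical choice
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

h_steer = torch.load("reasoning_steer_vector.pt")      # hypothetical (d_model,) vector

with torch.no_grad():
    logits = model.lm_head(model.model.norm(h_steer))  # logit-lens projection
top_ids = torch.topk(logits, k=20).indices
print(tok.convert_ids_to_tokens(top_ids.tolist()))     # inspect the boosted tokens
```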

Language and Domain Adaptation

  • The vector approach enables language-agnostic transfer of complex reasoning skills, as in the direct enhancement of Japanese LLMs with reasoning vectors extracted from English models, bypassing annotation bottlenecks and data scarcity (Oguchi et al., 4 Aug 2025).
  • Similar strategies can, in principle, be extended to broader settings (multi-domain or cross-family transfer), although more systematic study is needed (Zbeeb et al., 1 Sep 2025).

6. Limitations and Future Directions

  • The effectiveness of reasoning vector transfer assumes close compatibility in architecture and initial conditions between source and target models, as well as careful calibration of the scaling constant $\alpha$ and potential masking (Zbeeb et al., 1 Sep 2025).
  • Overapplication (overscaling) may be detrimental, and the method requires reliable extraction of “reasoning states” or suitable contrasts (e.g., correct reasoning traces for activation-based vectors (Højer et al., 28 Apr 2025, Sheng et al., 10 Jun 2025)).
  • Open questions remain regarding the composability of multiple skill vectors, the transferability across highly dissimilar architectures, and the alignment of vector-induced behaviors with intended reasoning styles or robustness constraints.

Research directions include:

  • Dynamic, per-instance adaptive reasoning vector scaling (Liu et al., 18 Jun 2025);
  • Integrated selection or masking per layer or module to improve transfer precision (Zbeeb et al., 1 Sep 2025);
  • Modular arithmetic with multiple vectors for composite skills;
  • Application in broader settings, including domain adaptation, safety alignment, and interpretability-driven design.

7. Relation to Logic, Symbolic Methods, and Neuro-Symbolic Reasoning

Reasoning vectors also provide a bridge between distributed vector representation and classical reasoning:

  • Tensor product and vector symbolic architectures (VSA) encode fine-grained logical and relational knowledge, supporting systematic, interpretable, and reversible manipulations (Lee et al., 2015, Sun et al., 21 Jan 2025, Mejri et al., 13 Nov 2024).
  • In symbolic mathematical domains, vectors serve to document proof dependencies (proof vectors), supporting clustering, visualization, and metric-based analysis of reasoning structure (Yoo, 31 Mar 2025).

By converting symbolic, logical, or procedural reasoning into vectorial entities, reasoning vectors underpin a broad suite of approaches for endowing models with explicit, transferable, and controllable reasoning abilities across architectures, domains, and languages.