
Decoding by Contrasting Layers (DoLa)

Updated 19 December 2025
  • Decoding by Contrasting Layers (DoLa) is an inference-time framework that leverages different neural network layer activations to suppress hallucinations and boost factual accuracy.
  • It computes contrastive token scores by comparing final and premature layer outputs using logit differences, dynamic selection, and entropy-guided weighting.
  • Empirical evaluations show robust accuracy improvements across various tasks and modalities, with extensions for vision-language, multilingual, and sequence models.

Decoding by Contrasting Layers (DoLa) is an inference-time decoding framework designed to improve factuality and reduce hallucinations in large neural networks—primarily language and multimodal models—by leveraging the hierarchical structure and specialization of model layers. Rather than relying solely on final-layer output probabilities, DoLa explicitly contrasts activations or next-token distributions from different layers (or feature depths), using this contrast to suppress spurious, premature, or distorted predictions. The technique has spawned a range of algorithmic variants, broadening its applicability from autoregressive LLMs to encoder-decoder models, multimodal systems, and sequence agents.

1. Core Algorithmic Principles

At its foundation, DoLa utilizes the following formalism:

  • At each decoding timestep $t$, compute two distributions: $q_N(\cdot)$ from the deepest (output) layer and $q_e(\cdot)$ from a selected earlier ("premature") layer.
  • Either statically select $e$ (mid-depth, empirically or by bucket) or dynamically choose it by maximizing the Jensen–Shannon divergence $\mathrm{JSD}(q_N,\ q_e)$ over a set of candidates.
  • Form a contrastive logit or score for each token $v$:

$s(v) = \log q_N(v) - \log q_e(v)$

or, with thresholding on low-probability tokens, restrict to tokens $v$ such that $q_N(v) \ge \alpha \max_w q_N(w)$; a typical value is $\alpha = 0.1$.

  • Renormalize and decode by softmax over these scores; optionally incorporate repetition penalties or fuse scores from multiple layers.
  • Variants extend this basic logic to:
    • Contrast fused distributions over multiple uncertain layers, e.g., VaLiD for LVLMs, with entropy weighting $\omega_{i,t} \propto \exp H_{i,t}$ (Wang et al., 24 Nov 2024).
    • Self-contained contrast within encoder-decoder stacks (T5/FLAN-T5), with layer selection and logit evolution analysis (Sun et al., 3 Dec 2025).
    • Adaptive or reinforcement-learning-driven "when to contrast" policies, as in ActLCD (Zhang et al., 29 May 2025).
    • Token-aware or layer-localized perturbation in LayerCake (Zhu et al., 6 Jul 2025).
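The core scoring step above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper names, candidate-layer set, and numerical-stability constants are assumptions, and a real deployment would operate on model hidden states rather than precomputed per-layer logits.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two probability vectors."""
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dola_next_token_dist(layer_logits, candidate_layers, alpha=0.1):
    """One DoLa decoding step (sketch).

    layer_logits: list of next-token logit vectors, one per layer,
                  with the final (output) layer last.
    candidate_layers: indices of premature layers to consider.
    """
    q_N = softmax(layer_logits[-1])
    # Dynamic premature-layer selection: maximize JSD against the output layer.
    q_e = max((softmax(layer_logits[i]) for i in candidate_layers),
              key=lambda q: jsd(q_N, q))
    # Plausibility constraint: keep tokens with q_N(v) >= alpha * max_w q_N(w).
    keep = q_N >= alpha * q_N.max()
    scores = np.where(keep, np.log(q_N + 1e-12) - np.log(q_e + 1e-12), -np.inf)
    return softmax(scores)   # renormalized contrastive distribution
```

Tokens failing the plausibility threshold receive probability exactly zero after renormalization, which prevents the contrastive ratio from promoting implausible low-probability tokens.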

2. Layer Selection, Fusion, and Reference Construction

Effective deployment of DoLa requires principled selection of "premature" layers and candidate reference signals:

  • Static selection: For mid-sized autoregressive transformers, contrasting the final layer with a mid-depth layer yields strong factual gains (e.g., layers 12 and 24 in GPT2-Medium) (Gera et al., 2023).
  • Dynamic selection: At each timestep, select the earlier layer maximizing the divergence from the output distribution (usually Jensen–Shannon or entropy-based metrics) (Chuang et al., 2023, Banerjee et al., 12 Dec 2025).
  • Bucketed approaches: Group model layers into contiguous buckets to limit computational cost and layer search (Chuang et al., 2023).
  • Fusion mechanisms: In vision models such as LLaVA or InstructBLIP, fuse distributions from multiple high-entropy layers via entropy-normalized weights, then contrast the result against the deepest layer (Wang et al., 24 Nov 2024).
  • Extrapolative decoding: Extrapolate token probabilities beyond the last physical layer via linear models when output layer entropy is abnormally high, addressing the limitation of overconfident final layers (Das et al., 14 Apr 2024).
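The entropy-guided fusion used in VaLiD-style reference construction can be sketched as follows. The exponential-entropy weighting matches $\omega_{i,t} \propto \exp H_{i,t}$ from the papers; the choice of candidate layers and the normalization details here are illustrative assumptions.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (nats)."""
    return float(-np.sum(p * np.log(p + eps)))

def fuse_reference(layer_dists):
    """Entropy-weighted fusion of candidate-layer distributions (sketch).

    Weights follow omega_i ~ exp(H_i): higher-entropy (more uncertain)
    layers contribute more to the fused reference distribution, which
    is then contrasted against the deepest layer's output.
    """
    H = np.array([entropy(p) for p in layer_dists])
    w = np.exp(H - H.max())   # stabilized exp-entropy weights
    w /= w.sum()
    return np.sum([wi * pi for wi, pi in zip(w, layer_dists)], axis=0)
```

Because the fused reference leans toward the most uncertain layers, the subsequent contrast step penalizes exactly the tokens that unstable intermediate depths over-promote.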

3. Applications and Empirical Impact

DoLa has demonstrated robust improvements in factuality, faithfulness, and multi-hop reasoning across a wide variety of tasks:

| Setting | Task/Benchmark | Baseline Score | DoLa Score | Absolute Gain |
|---|---|---|---|---|
| LLaMA-7B | TruthfulQA MC3 | 19.2% | 32.1% | +12.9 points |
| Multi-hop QA | HotpotQA (F1) | 19.5 | 32.6 | +13.1 |
| LVLMs | POPE (object presence) | varies | varies | up to +6.3 points |
| Vision-LM | MME (hallucination subset) | — | better on 11/14 tasks | n/a |
| LayerCake | TruthfulQA MC1 | 34.18 | 37.72 | +3.54 |
| LayerCD | POPE-COCO (random) | 83.21 | 85.67 | +2.46 |

Empirical results repeatedly show DoLa yielding double-digit accuracy gains on truthfulness-centric metrics and outperforming vanilla decoding and prior contrastive methods (Contrastive Decoding, Inference-Time Intervention) (Chuang et al., 2023, Murphy et al., 30 Mar 2025, Tong et al., 29 Sep 2025, Wang et al., 24 Nov 2024).

Some applications, such as layer-contrastive decoding in finance (Kang et al., 2023), yield only marginal or negative gains, especially when the performance ceiling is set by model knowledge or when external retrieval would be essential.

4. Algorithmic Variants and Extensions

a) Vision-LLMs

VaLiD and LayerCD generalize DoLa to vision-language systems by contrasting output distributions conditioned on visual features at different depths of the encoder, with entropy-guided layer fusion or adaptive masking (Wang et al., 24 Nov 2024, Tong et al., 29 Sep 2025). Plausibility constraints are used to mask implausible tokens, and entropy weighting or fusion outperforms single-layer reference approaches.

b) Multi-Layer Fusion

Fusion of contrastive signals across two (or more) depths, rather than solely using final–mid-layer contrast, achieves higher truthfulness. LOL (Chen et al., 16 Aug 2024) fuses scores from final and lower layers, further integrates a "truthfulness-refocused" module, and consistently outperforms prior baselines and simple DoLa.
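A minimal sketch of fusing contrastive scores from more than one lower layer follows. The weighting scheme here is an illustrative assumption, and LOL's additional "truthfulness-refocused" module is not shown; only the score-fusion idea is demonstrated.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fused_layer_contrast(q_final, amateur_dists, weights, eps=1e-12):
    """Weighted sum of final-vs-amateur contrastive log-ratios (sketch).

    q_final: final-layer next-token distribution.
    amateur_dists: distributions from two or more lower layers.
    weights: per-layer fusion weights (illustrative, assumed given).
    """
    scores = np.zeros_like(q_final)
    for w, q_a in zip(weights, amateur_dists):
        scores += w * (np.log(q_final + eps) - np.log(q_a + eps))
    return softmax(scores)
```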

c) Multilingual and Modality-Adaptive

DoLa initially failed on non-English tasks: early layers produce language-mismatched outputs, making amateur logits unreliable. Layer-skipping strategies (SL-H/SL-D) skip context-understanding or purely language-conversion layers to adapt the amateur pass, yielding strong accuracy gains on multilingual reasoning (Zhu et al., 15 Jul 2024).

d) Token-Aware and Hybrid Approaches

In LayerCake (Zhu et al., 6 Jul 2025), model attention analysis identifies specific layers where factual signals for punctuation and conceptual tokens are strongest. Suppressing attention to these tokens at selected depths yields token-localized contrast signals that greatly improve truthfulness and factuality.

e) Sequential Decision and RL-Driven Decoding

Active Layer-Contrastive Decoding (ActLCD) (Zhang et al., 29 May 2025) reframes when to contrast as a reinforcement learning policy, optimizing sequence-level factuality rather than static token decisions. Annotated rewards enforce contrast activation only when factual errors would arise. This achieves higher composite truth scores and lower hallucination rates than static DoLa.
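ActLCD learns its when-to-contrast policy with reinforcement learning; as a crude training-free stand-in for illustration only, one can gate the contrast on layer disagreement, falling back to vanilla decoding when the layers already agree. The threshold value below is a made-up assumption, not from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def jsd(p, q, eps=1e-12):
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def gated_contrast(q_final, q_amateur, threshold=0.05, eps=1e-12):
    """Contrast only when final and amateur layers disagree enough;
    otherwise return the vanilla final-layer distribution unchanged.
    A learned policy (as in ActLCD) would replace this fixed gate."""
    if jsd(q_final, q_amateur) < threshold:
        return q_final
    return softmax(np.log(q_final + eps) - np.log(q_amateur + eps))
```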

5. Limitations and Contextual Trade-Offs

Several studies identify key limitations and trade-offs:

  • DoLa leverages only the knowledge already present in the pretrained model; if internal representations are hallucinated, DoLa cannot correct them (Chuang et al., 2023, Kang et al., 2023).
  • Factuality gains may come at the expense of creativity. Layerwise probe analyses show DoLa suppresses signals associated with divergent (creative) thinking, making it less suitable for tasks requiring novel hypothesis generation (Banerjee et al., 12 Dec 2025).
  • In seq2seq models, DoLa helps on semantic constraint tasks but harms rigid formatting or strict fluency tasks, as the contrastive signal may override sequence-level constraints (Sun et al., 3 Dec 2025).
  • Computational overhead is modest (1–8% latency or memory in most settings), but methods involving multiple forward passes or fusion (vision models, LayerCake) can double autoregressive decode cost (Tong et al., 29 Sep 2025, Wang et al., 24 Nov 2024).
  • Multilingual and modality-specific deployments require adaptation (layer skipping, dynamic reference selection), as generic amateur–expert signals may be uninformative outside English or text modalities (Zhu et al., 15 Jul 2024, Wang et al., 24 Nov 2024).

6. Implementation and Practical Recommendations
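Pulling together the recommendations appearing elsewhere in this article (static mid-depth layer selection, the $\alpha = 0.1$ plausibility threshold, and an optional repetition penalty), a greedy decode loop might look like the sketch below. Everything here is illustrative: `model_fn` is a hypothetical interface assumed to return per-layer next-token logits, and the repetition-penalty form is one common convention, not prescribed by the papers.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def dola_greedy_decode(model_fn, prompt_ids, premature_layer, steps=20,
                       alpha=0.1, rep_penalty=1.2, eps=1e-12):
    """Greedy DoLa decode loop (sketch): static premature layer, alpha
    plausibility threshold, simple log-penalty on already-generated tokens.
    `model_fn(ids)` is assumed to return a list of per-layer logit vectors
    for the next token, final layer last."""
    ids = list(prompt_ids)
    for _ in range(steps):
        layer_logits = model_fn(ids)
        q_N = softmax(layer_logits[-1])
        q_e = softmax(layer_logits[premature_layer])
        keep = q_N >= alpha * q_N.max()
        scores = np.where(keep, np.log(q_N + eps) - np.log(q_e + eps), -np.inf)
        for t in set(ids):               # penalize already-generated tokens
            scores[t] -= np.log(rep_penalty)
        ids.append(int(np.argmax(scores)))
    return ids
```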

7. Future Directions and Open Problems

Ongoing research targets several avenues for further development, including learned or adaptive contrast policies, broader modality and language coverage, and combining layer contrast with external retrieval for knowledge beyond what the pretrained model encodes.

In summary, Decoding by Contrasting Layers is a versatile, inference-only decoding principle that exploits the emergent semantic hierarchy within neural networks to suppress hallucination and improve factuality. Its generality across modalities (text, vision, sequence), architectures (autoregressive, encoder–decoder), and domains (QA, reasoning, instruction following, multimodal perception) has established it as a foundational element within the next generation of training-free model reliability enhancements.
