Spectral Attention Steering for Prompt Highlighting

Published 1 Mar 2026 in cs.CL and cs.AI | (2603.01281v1)

Abstract: Attention steering is an important technique for controlling model focus, enabling capabilities such as prompt highlighting, where the model prioritises user-specified text. However, existing attention steering methods require explicit storage of the full attention matrix, making them incompatible with memory-efficient implementations like FlashAttention. We introduce Spectral Editing Key Amplification (SEKA), a training-free steering method that tackles this by directly editing key embeddings before attention computation. SEKA uses spectral decomposition to steer key embeddings towards latent directions that amplify attention scores for certain tokens. We extend this to Adaptive SEKA (AdaSEKA), a query-adaptive variant that uses a training-free routing mechanism to dynamically combine multiple expert subspaces based on the prompt's semantic intent. Our experiments show both methods significantly outperform strong baselines on standard steering benchmarks while adding much lower latency and memory overhead, in compatibility with optimised attention.

Authors (6)

Summary

The paper introduces SEKA, a training-free framework that modulates key embeddings with spectral projections to enhance focus on user-designated tokens.
SEKA and AdaSEKA outperform traditional post-attention methods like PASTA by reducing computational overhead and improving recall across LLM benchmarks.
The methods maintain semantic integrity while enabling query-adaptive, interpretable attention steering, setting a new standard for efficient language model control.

Spectral Attention Steering for Prompt Highlighting: A Technical Analysis

Motivation and Problem Formulation

Attention steering, and prompt highlighting in particular, addresses the challenge of directing LLMs to emphasize user-specified tokens during inference. Existing approaches such as PASTA edit the attention matrix after it has been computed, which requires storing the full matrix. This is incompatible with memory-optimized implementations like FlashAttention, which avoid materializing that matrix, and it adds computational and storage overhead that hinders deployment in efficiency-sensitive pipelines. Furthermore, methods like PASTA require a costly search across attention heads, which adds latency and manual configuration burden.

These limitations motivate a new class of techniques that effect control pre-attention, without requiring explicit attention score manipulation. Spectral Editing Key Amplification (SEKA) and its adaptive extension AdaSEKA address this efficiency bottleneck by operating directly on the key embeddings prior to attention computation, using spectrally-learned relevance projections to steer attention in a computationally tractable and interpretable manner.

Methodology

Spectral Editing Key Amplification (SEKA)

SEKA is a training-free attention steering framework that adds projections, precomputed by spectral decomposition, to the key vectors of highlighted tokens. The method proceeds in two phases:

Offline Phase: Using synthetic prompt triples (neutral/positive/negative), key embeddings for highlighted tokens are aggregated under varying context-query relevance. Cross-covariance analysis between positive and negative pairs per (layer, head) is performed, followed by SVD. The top singular vectors (positive relevance) and bottom singular vectors (irrelevance) span the relevance and non-relevance subspaces respectively. The constructed projection matrices are stored for each selected (layer, head) pair.
Inference Phase: During forward propagation, for designated highlighted tokens, SEKA applies a hook that edits the key vector via projection in the relevant subspace, amplifying alignment with user-marked content. This yields a structured, algebraically interpretable low-rank perturbation of the attention score logits, with negligible impact on computational efficiency and no requirement to materialize full attention matrices.
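
The two phases above can be illustrated with a minimal NumPy sketch for a single (layer, head) pair. It is not the authors' implementation: the function names, the choice of left singular vectors, and the update rule `k + gain * P k` are assumptions made for illustration, and only the positive (relevance) projection is shown.

```python
import numpy as np

def relevance_projection(neutral_keys, positive_keys, top_k):
    """Offline phase (sketch): build a rank-top_k projector for one head.

    neutral_keys, positive_keys: arrays of shape (n_samples, d_k) holding the
    keys of the same tokens under two conditions (hypothetical inputs).
    """
    n_samples = neutral_keys.shape[0]
    cross_cov = neutral_keys.T @ positive_keys / n_samples  # (d_k, d_k)
    left_vectors, _, _ = np.linalg.svd(cross_cov)
    basis = left_vectors[:, :top_k]      # top singular directions
    return basis @ basis.T               # orthogonal projector onto their span

def steer_keys(keys, highlight_idx, projection, gain):
    """Inference phase (sketch): add gain * P k to highlighted keys only."""
    edited = keys.copy()
    edited[highlight_idx] += gain * (keys[highlight_idx] @ projection.T)
    return edited

# Toy usage with random stand-in data (no real model activations).
rng = np.random.default_rng(0)
neutral = rng.normal(size=(50, 8))
positive = neutral + 0.1 * rng.normal(size=(50, 8))
projection = relevance_projection(neutral, positive, top_k=3)
keys = rng.normal(size=(6, 8))           # keys for a 6-token prompt
steered = steer_keys(keys, [1, 4], projection, gain=0.5)
```

Because the edit touches only the key tensor, the attention kernel that consumes `steered` is unchanged, which is why this style of intervention does not need the attention matrix to be materialized.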

Adaptive SEKA (AdaSEKA)

AdaSEKA extends SEKA with a bank of expert projections, each derived from a different domain-specific dataset (e.g., factual recall, instruction following, multi-hop reasoning). At inference, the query vector of the prompt’s final token is used to compute alignment scores with each expert's principal projection directions. These scores drive a weighted, on-the-fly composition of the expert projections for token-level steering. This mechanism introduces query-aware adaptivity, reducing manual hyperparameter tuning and improving modularity.
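
A rough sketch of this routing idea for a single head is given below. The scoring rule is an assumption chosen for illustration (the norm of the query's projection onto each expert's basis, normalised to sum to one); the paper's actual routing formula may differ, and `route_experts` is a hypothetical name.

```python
import numpy as np

def route_experts(query, expert_bases):
    """Blend expert projections according to query alignment (sketch).

    query: (d_k,) query vector of the prompt's final token.
    expert_bases: list of (d_k, k) matrices with orthonormal columns,
        one per expert (hypothetical representation of the expert bank).
    Returns the mixing weights and the blended (d_k, d_k) projection.
    """
    # Assumed alignment score: length of the query's projection onto each basis.
    scores = np.array([np.linalg.norm(basis.T @ query) for basis in expert_bases])
    total = scores.sum()
    if total > 0:
        weights = scores / total
    else:
        # No expert aligns with the query: fall back to a uniform blend.
        weights = np.full(len(expert_bases), 1.0 / len(expert_bases))
    blended = sum(w * (basis @ basis.T) for w, basis in zip(weights, expert_bases))
    return weights, blended

# Toy usage: two rank-1 experts in a 4-dimensional key space.
identity = np.eye(4)
bases = [identity[:, :1], identity[:, 1:2]]
weights, blended = route_experts(identity[:, 0], bases)
```

The blended matrix can then be used in place of a single fixed projection when editing the highlighted keys.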

Efficient Selection of Relevance-Sensitive Heads

Empirical analysis of how key embeddings shift with prompt relevance shows that only a small subset of heads (primarily in mid-to-late layers) exhibit strong, directionally consistent shifts, in line with prior mechanistic analyses of retrieval heads. SEKA restricts projection to heads whose average pairwise $\ell_2$ distance exceeds a tuned threshold (δ_min), so that steering is targeted and minimally invasive to unrelated computation.
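
As a hedged illustration of this selection step, the sketch below assumes that "pairwise" means paired samples (the same token under a relevant and an irrelevant query) and that the keys have been collected into arrays of shape (layers, heads, samples, d_k). The function name and array layout are hypothetical.

```python
import numpy as np

def select_heads(pos_keys, neg_keys, delta_min):
    """Pick (layer, head) pairs whose keys shift strongly with relevance.

    pos_keys, neg_keys: arrays of shape (n_layers, n_heads, n_samples, d_k)
        with keys of the same tokens under relevant / irrelevant queries.
    delta_min: threshold on the mean paired l2 distance.
    """
    # Mean over samples of the per-sample l2 distance -> (n_layers, n_heads).
    mean_dist = np.linalg.norm(pos_keys - neg_keys, axis=-1).mean(axis=-1)
    layers, heads = np.nonzero(mean_dist > delta_min)
    return list(zip(layers.tolist(), heads.tolist())), mean_dist

# Toy usage: only layer 1, head 0 shows a shift (of l2 norm 2.0).
pos = np.zeros((2, 3, 5, 4))
neg = np.zeros((2, 3, 5, 4))
pos[1, 0, :, 0] = 2.0
selected, mean_dist = select_heads(pos, neg, delta_min=1.0)
```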

Experimental Results

Standard Benchmarks

Evaluated on CounterFact, Bias in Bios, and a pronoun-rewriting instruction-following task, SEKA and AdaSEKA perform at or near the top across multiple LLM families and scales (Qwen3-4B/8B/14B, Gemma3-4B/12B). Notably, with Qwen3-4B on CounterFact, SEKA reaches an efficacy score (ES) of 99.02 and a paraphrase score (PS) of 98.61, outperforming PASTA, which achieves strong results but at much higher resource cost. AdaSEKA outperforms all baselines in adversarial settings and tasks requiring dynamic adaptation of relevance semantics.

Ablation studies reveal that random projections or indiscriminate head selection drastically reduce performance, confirming the necessity of both spectral learning of projections and selective targeting.

Lost-in-the-Middle Mitigation

SEKA significantly mitigates, and can even invert, the U-shaped recall failure observed in lost-in-the-middle tasks by applying attention steering only to the central passages. Granular analysis shows that proper thresholding of steered heads is critical for flattening the positional recall curve without degrading accuracy at context boundaries.
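
Restricting steering to the central passages comes down to choosing which token positions to highlight. The helper below is a small sketch of that bookkeeping, assuming the token length of each passage is known; `n_edge` (how many passages to leave unsteered at each end) and `offset` (tokens preceding the first passage) are hypothetical parameters, not settings from the paper.

```python
def middle_passage_token_indices(passage_lengths, n_edge, offset=0):
    """Return token indices of all passages except the first/last n_edge.

    passage_lengths: token count of each passage, in prompt order.
    n_edge: number of passages to skip at the start and at the end.
    offset: number of prompt tokens that precede the first passage.
    """
    indices = []
    start = offset
    last_steered = len(passage_lengths) - n_edge
    for i, length in enumerate(passage_lengths):
        if n_edge <= i < last_steered:
            indices.extend(range(start, start + length))
        start += length
    return indices

# Toy usage: four passages of 2, 3, 2 and 1 tokens; steer the middle two.
middle = middle_passage_token_indices([2, 3, 2, 1], n_edge=1)
```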

Overhead and Compatibility

SEKA imposes only ~0.03s latency overhead per sample (batch size 10, Qwen3-8B), versus 1.03s for PASTA. Memory usage is likewise negligible compared to baselines, and the method is fully compatible with FlashAttention and related optimizations. AdaSEKA incurs a moderate increase to enable query-adaptive routing but remains an order of magnitude faster than post-hoc matrix editing strategies.

Theoretical Implications

The key innovation of SEKA lies in the shift from post-attention matrix editing to pre-attention modulation, which not only closes the gap in computational efficiency but also yields a more structured and interpretable geometric intervention. By targeting low-dimensional relevance subspaces in key representations, SEKA directly interfaces with the routing mechanism of the transformer, as distinct from semantic manipulation in the value/MLP spaces. This design enforces invariance of semantic content while granting fine-grained control of focus and recall. AdaSEKA’s routing by query-alignment advances methodological flexibility towards practical, real-time model steering in variable task settings.
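
The equivalence between a key-side edit and a low-rank change to the logits can be checked numerically. The sketch below uses random stand-in data and assumes an additive edit of the form `k + gain * P k` on highlighted tokens, with unscaled dot-product logits; it illustrates the algebra rather than reproducing the paper's exact update.

```python
import numpy as np

rng = np.random.default_rng(0)
d_k, n_tokens, rank, gain = 16, 10, 2, 0.8

queries = rng.normal(size=(n_tokens, d_k))
keys = rng.normal(size=(n_tokens, d_k))

# Stand-in orthonormal basis for a "relevance subspace" (random, not learned).
basis, _ = np.linalg.qr(rng.normal(size=(d_k, rank)))
projection = basis @ basis.T

highlight = np.zeros(n_tokens, dtype=bool)
highlight[[3, 4]] = True

# Pre-attention edit: only the keys of highlighted tokens change.
edited_keys = keys + gain * highlight[:, None] * (keys @ projection)
delta_logits = queries @ edited_keys.T - queries @ keys.T

# The same change written as an explicit term added to the logit matrix.
posthoc_term = gain * (queries @ projection @ keys.T) * highlight[None, :]
```

Here `delta_logits` and `posthoc_term` agree, the change is confined to the columns of highlighted tokens, and its rank is bounded by the rank of the projection, all without forming the logit matrix inside the attention kernel.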

Future Directions

This paradigm opens several theoretical and practical avenues:

Generalization: Extension to other classes of intervention (e.g., broader control over model behavior without fine-tuning) and application across diverse LLM architectures.
Automated Head Selection: End-to-end learning or meta-learning approaches for relevance-sensitive head selection.
Integration with Other Control Mechanisms: Harmonizing activation and attention intervention, or composing with programmatic steering (e.g., for safety or style).
Robustness and Unintended Effects: Investigating the long-term interaction between spectral steering and downstream model interpretability and safety.

Conclusion

SEKA and AdaSEKA constitute a technically rigorous, resource-efficient methodology for precise, interpretable attention steering in LLMs. By shifting focus to pre-attention, key-side intervention guided by spectral analysis of context relevance, these methods overcome core inefficiencies of prior strategies while achieving leading empirical performance across several standard benchmarks. Theoretical modularity and strong compatibility with optimized inference frameworks position SEKA/AdaSEKA as practical solutions for user-controllable, long-context LLM deployment and as an important step toward a more responsible and adaptable next generation of LLMs.

Reference: "Spectral Attention Steering for Prompt Highlighting" (2603.01281)

Explain it Like I'm 14

What this paper is about (big picture)

This paper is about helping LLMs pay extra attention to the parts of a prompt that a user cares about—like bolded or marked text—without slowing the model down or using lots of memory. The authors introduce two new ways to do this, called SEKA and AdaSEKA, that gently nudge the model’s internal focus before it decides what to look at, so the model notices the highlighted text more.

The main questions the paper asks

Can we make an LLM focus more on user-marked words or phrases (prompt highlighting) in a way that is fast and memory‑efficient?
Instead of changing the attention after it’s computed (which is slow), can we adjust the inputs to attention so the model naturally looks where we want?
Is there a general “direction” inside the model’s representations that corresponds to “this is relevant,” and can we find it without training?
Can we adapt the steering automatically to different kinds of tasks (like fact recall vs. instruction following) without manual tuning?

How the methods work (in everyday language)

First, a quick picture of attention inside an LLM:

Think of each word in the prompt as a student with a name tag (its “key”) and a question paper (its “query”). The model decides who to pay attention to by seeing how well each question matches each name tag. A better match → more attention.

Most older methods changed attention after it was already calculated (like editing a giant scoreboard), which is slow and memory-heavy. This paper edits the name tags before the matching happens. That way, the model naturally gives more attention to the highlighted words.

Here’s the idea behind SEKA and AdaSEKA:

The authors noticed that when a question is relevant to a piece of text, the internal “key” for that text shifts in a consistent way. They measured these shifts by comparing the same text under two conditions: paired with a relevant question vs. with an irrelevant question.
They used a math tool called spectral decomposition (like finding the main directions or “themes” in the data) to discover a “relevance subspace”: directions in the model’s internal space that signal “this token is important for the question.”
SEKA then gently pushes the keys of highlighted tokens toward these relevance directions. Imagine slightly tilting a spotlight so it shines more on the marked words.
AdaSEKA extends this with multiple “experts” (like small toolkits for different tasks such as factual recall or instruction following). At runtime, the model looks at the current question and automatically blends the best experts, so the steering matches the task.

To keep things efficient, they:

Only steer the attention “heads” (think sub-spotlights) that naturally react to relevance, mostly in the middle-to-late layers where retrieval tends to happen.
Work entirely before attention is computed, so they stay compatible with fast attention implementations (like FlashAttention), keeping time and memory low.

What they found and why it matters

Across several tests, the methods worked well and stayed fast:

Standard steering benchmarks:
- CounterFact (resolving knowledge conflicts): SEKA and AdaSEKA reached near‑perfect scores and outperformed prior methods.
- Bias in Bios (predicting occupations from bios): They generally ranked at or near the top across different model sizes.
- Pronoun Changing (following instructions): AdaSEKA often did best, and SEKA helped especially on models that don’t respond well to simple markdown emphasis.
Long context “lost-in-the-middle” problem:
- LLMs often remember beginnings and ends of long texts but forget the middle (a U‑shaped curve). By highlighting the middle with SEKA, they could flip the U‑shape into a peak in the middle, improving recall exactly where models usually struggle.
Efficiency:
- SEKA added about 0.03 seconds per sample on a tested 8B model—very small.
- Competing methods that edit the full attention matrix were much slower and used a lot more memory.
- AdaSEKA added a bit more time for its adaptive routing but was still far more efficient than previous approaches.
Ablations (sanity checks):
- Random, non‑learned directions helped less, showing that the learned “relevance directions” really matter.
- Steering every head without selecting relevance‑sensitive ones hurt performance, confirming the importance of targeting the right parts of the model.

What this means going forward

Practical control without retraining: SEKA and AdaSEKA are “training‑free,” so you can steer an existing model’s focus during use, which is convenient for real applications.
Better use of long contexts: They help models find the right information inside long prompts, especially in the middle sections where models usually struggle.
Fast and memory‑friendly: Because they work before attention is computed, they stay compatible with modern, efficient attention and are cheap to run.
Adaptive and modular: AdaSEKA can automatically choose the right “expert” for the task, reducing manual tuning. New experts can be added over time like plug‑ins.

In short, these methods make it easier and faster to get LLMs to focus on what users highlight, improving accuracy on tricky tasks and long documents without slowing the model down.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a concise list of unresolved issues and open directions that the paper leaves unexplored, framed to enable concrete follow-up work by future researchers.

Data dependence of spectral subspaces
- Quantify how many contrastive samples per (layer, head) are required to learn stable projections; report variance across different random seeds and datasets.
- Test robustness of learned projections to domain shift (e.g., train on CounterFact-style prompts, apply to legal/medical long-context tasks).
- Assess portability of precomputed projections across model checkpoints within the same family (e.g., Qwen3-Base vs. Qwen3-Chat) and across tokenizers.
Coverage of models and modalities
- Evaluate on additional architectures (e.g., Llama, Mistral, Mixtral, Phi), larger scales (≥70B), and instruction-tuned/chat models to determine generality.
- Test encoder–decoder models and multimodal LLMs to see if key-editing transfers beyond decoder-only text models.
Language and tokenization robustness
- Benchmark multilingual settings and scripts (CJK, Arabic, code-mixed text) to analyze tokenization effects on highlight span alignment and projection efficacy.
- Study sensitivity to different highlight markers and token-boundary misalignments (e.g., markers that split BPE tokens).
Head selection and hyperparameter sensitivity
- Replace the manually tuned l2-distance threshold δ_min with an automatic selection criterion (e.g., validation-driven, Bayesian optimization, or stability-based filters).
- Provide a systematic sensitivity analysis for y (singular value retention), K (top components), and gains g+, g− across tasks and model sizes; characterize failure regimes.
Theoretical grounding of spectral design choices
- Justify SVD on cross-covariance vs. alternatives (e.g., CCA, Fisher discriminants) and empirically compare projection quality and downstream efficacy.
- Validate the assumption that least-significant singular vectors encode “negative” relevance; test if explicit negative datasets or contrastive learning improves separation.
Scope of intervention
- Compare key-side editing to query-side and value-side editing, and to combined interventions; quantify trade-offs and headwise interactions.
- Analyze whether editing keys interferes with induction heads and other known mechanisms (e.g., RoPE/positional circuits) via mechanistic probes.
Dynamics during decoding and KV cache behavior
- Clarify how edited keys integrate with KV caching across generation steps; measure effects on cache reuse, streaming, and prefix-sharing scenarios.
- Determine whether (and when) to reapply/adapt projections as the generation evolves, especially in multi-turn dialogues.
Safety, bias, and fairness implications
- Evaluate whether attention amplification exacerbates demographic or topical biases (beyond accuracy on Bias in Bios) and propose mitigation strategies.
- Assess misuse potential: can highlight markers be abused to bypass refusals or safety filters; add red-team stress tests.
Quality and collateral effects
- Measure side effects on fluency, calibration, verbosity, hallucination rate, and uncertainty estimates (e.g., log-prob calibration), not just task-specific metrics.
- Report latency/throughput trade-offs under quantization, CPU inference, and different batch sizes and sequence lengths.
Generalization beyond evaluated tasks
- Test on diverse long-context tasks (summarization with constraints, multi-hop QA, retrieval-augmented generation, code comprehension) and real-world documents.
- Explore multiple simultaneous highlights, overlapping spans, and conflicting priorities; study compositionality and priority resolution.
AdaSEKA routing and expert bank design
- Provide ablations on the number and granularity of experts, expert construction strategies, and negative transfer between experts; study scaling to large M.
- Evaluate alternative routing signals (e.g., pooled prompt representations, earlier tokens, or cross-layer summaries) and stability of the normalisation in the routing formula.
- Investigate continual addition/removal of experts and catastrophic interference; propose criteria for expert redundancy and pruning.
Region targeting in long contexts
- Develop automatic detectors to identify which parts of the context to steer (instead of manual “middle” selection) and evaluate on open-domain inputs.
- Characterize when steering all passages hurts performance; propose adaptive regional steering policies.
Robustness to adversarial and noisy inputs
- Stress test with adversarial highlights (irrelevant or misleading spans), distractor-heavy contexts, or formatting noise; add detection/guardrails for misuse.
- Examine stability under paraphrases of the highlighted content and under reordering of distractors.
Interpretability of learned subspaces
- Map singular directions to human-interpretable features (entities, coreference, temporal facts); quantify alignment with known attention patterns and retrieval heads.
- Investigate cross-head and cross-layer redundancy and whether a smaller shared projector can replace many per-head matrices.
Scalability and memory profile at extreme settings
- Test very long contexts (e.g., 128K–1M tokens) and many highlights; measure cumulative overhead with large numbers of steered heads and experts.
- Study interaction with memory-saving features (paged attention, CPU offloading) in common inference stacks.
Comparative baselines and reproducibility gaps
- Include additional baselines (e.g., Found-in-the-Middle, newer attention interventions) once available; standardize metric definitions and provide confidence intervals.
- Report run-to-run variance and per-instance analysis to understand heterogeneity of gains.
Transfer and lifespan of projections
- Quantify how projections degrade across continued pretraining, fine-tuning, or RLHF; propose inexpensive update procedures or online adaptation.
Integration with retrieval systems and tools
- Evaluate synergy/conflicts with RAG, memory modules, and tool-use; determine if key editing complements or replaces retrieval scoring and re-ranking.
Ethical and deployment considerations
- Provide guidance for safe deployment policies, provenance of projection matrices, and auditability of steering decisions in regulated settings.

Practical Applications

Overview

This paper introduces SEKA and AdaSEKA—training‑free methods that steer LLMs’ attention toward user‑highlighted tokens by editing key embeddings before attention is computed. They are efficient (compatible with FlashAttention), require no fine‑tuning, and show strong gains on factual recall, instruction following, and long‑context “lost‑in‑the‑middle” scenarios. Below are actionable applications, organized by readiness and linked to sectors, with tools/workflows and feasibility notes.

Immediate Applications

Long‑document QA and RAG pipelines (software, enterprise knowledge management)
- Use case: Invert the “lost‑in‑the‑middle” drop by programmatically highlighting retrieved passages (especially mid‑context) so models answer from the right chunk.
- Tools/workflows: Add a “highlight spans” step in the RAG formatter; insert a SEKA pre‑attention hook in inference servers (e.g., vLLM/TensorRT‑LLM) to boost specified token spans.
- Assumptions/dependencies: Requires access to model internals (key vectors) and ability to modify attention kernels or insert pre‑hooks; works best with open/hosted models (e.g., Qwen, Gemma); needs a span selection heuristic from retriever.
System‑prompt preservation and prompt‑injection resistance (security, platform safety)
- Use case: Prioritize system and safety instructions by highlighting those tokens so they remain influential despite user‑provided adversarial text.
- Tools/workflows: Tag system tokens for steering in chat orchestration layers; expose a “system emphasis” flag in model‑serving APIs.
- Assumptions/dependencies: Mitigates but does not eliminate injection; requires careful gains to avoid over‑steering; needs internal access (not feasible with closed black‑box APIs unless vendor supports).
Contract/legal/compliance review (legal, finance, GRC)
- Use case: Highlight key clauses (e.g., indemnity, SLAs, risk factors) so the model summarizes, compares, and flags deviations with higher recall.
- Tools/workflows: Clause detection + automatic span highlighting; “ClauseFocus” module that routes legal spans into SEKA; batch review with FlashAttention‑compatible serving.
- Assumptions/dependencies: Clause detection quality affects results; must log steering for audit; review by legal experts remains necessary.
Clinical documentation summarization and coding support (healthcare)
- Use case: Emphasize comorbidities, medications, allergies, or time‑critical entries in long clinical notes to improve recall in summaries or ICD/CPT suggestion.
- Tools/workflows: EHR adapters that mark key fields/spans (problem list, labs) for steering; low‑latency on local servers for PHI handling.
- Assumptions/dependencies: Requires on‑prem/self‑hosted models with SEKA integrated; clinical validation and governance needed.
Customer support and ticket triage (CX/CRM)
- Use case: Highlight problem description, error codes, and latest updates to improve resolution suggestions and reduce irrelevant responses.
- Tools/workflows: CRM plugin that extracts salient fields and applies SEKA during generation; logs and A/B testing to tune gains per queue.
- Assumptions/dependencies: Assumes reliable field extraction; monitor for over‑focusing that may suppress useful context.
Code comprehension in IDEs and code assistants (software/dev tools)
- Use case: Emphasize relevant functions/classes/usages in long files or multi‑file contexts to improve explanation, refactor, or test generation.
- Tools/workflows: IDE plugin computes symbol relevance and highlights token spans before calling the model; SEKA hook in local model runtime.
- Assumptions/dependencies: Needs robust static analysis or embeddings to pick spans; performance may vary across languages/frameworks.
Instruction‑following fidelity in multi‑step tasks (education, enterprise workflows)
- Use case: Highlight step constraints or specific rewrite instructions (e.g., pronoun normalization, formatting rules) to reduce ignored instructions.
- Tools/workflows: Prompt builders that mark instruction tokens; AdaSEKA for routing between “instruction‑following” and “factual recall” experts.
- Assumptions/dependencies: Over‑steering can reduce flexibility; gains may depend on base model’s sensitivity to emphasis.
Evidence‑grounded generation and citation fidelity (publishing, research tools)
- Use case: Highlight cited sentences/paragraphs to bias generation toward grounded content and improve citation alignment.
- Tools/workflows: Citation extractor + span selection; SEKA integrated into report/summary generation; QA checks on groundedness.
- Assumptions/dependencies: Quality of citation extraction; does not by itself verify correctness.
Low‑latency/high‑throughput LLM serving (AI infrastructure)
- Use case: Offer “attention steering” as a production feature compatible with FlashAttention, preserving throughput while adding controllability.
- Tools/workflows: Implement SEKA as a pre‑attention kernel/plugin; expose an API for token index lists + gains; reuse provided projection matrices where available.
- Assumptions/dependencies: Engineering integration with inference stack; memory for projection tensors per layer/head; model‑specific calibration.
Interpretability and evaluation probes (academia)
- Use case: Use spectral head selection and L2 shift metrics to identify “retrieval heads” and study relevance encoding across layers.
- Tools/workflows: Apply the provided head‑selection metric; run ablations with/without learned projections; publish steering logs for reproducibility.
- Assumptions/dependencies: Findings may be model‑family specific; requires access to internal activations.
Personal productivity and note assistants (daily life)
- Use case: Users highlight parts of long notes/emails/wikis to steer summaries or action items toward what they care about.
- Tools/workflows: Local LLM apps with “Focus Mode” that maps UI highlights to token spans and applies SEKA; sliders for gain tuning.
- Assumptions/dependencies: Feasible on local/hosted open models; not supported via most closed APIs without vendor cooperation.
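
Several of the workflows above depend on turning highlighted text (from a retriever, a UI selection, or a field extractor) into the token positions to steer. The following is a minimal sketch of that mapping, assuming per-token character offsets are available, for example from a tokenizer that can return an offset mapping. The function name is hypothetical, and tokens that only partially overlap a highlight are included by design.

```python
def spans_to_token_indices(token_offsets, highlight_spans):
    """Map highlighted character ranges to the token indices they touch.

    token_offsets: list of (start, end) character offsets per token,
        with end exclusive.
    highlight_spans: list of (start, end) character ranges to emphasize.
    """
    selected = []
    for idx, (tok_start, tok_end) in enumerate(token_offsets):
        if tok_start == tok_end:
            continue  # skip zero-width entries such as special tokens
        overlaps = any(tok_start < span_end and span_start < tok_end
                       for span_start, span_end in highlight_spans)
        if overlaps:
            selected.append(idx)
    return selected

# Toy usage: "The cat sat" tokenized as "The", " cat", " sat".
offsets = [(0, 3), (3, 7), (7, 11)]
cat_tokens = spans_to_token_indices(offsets, [(4, 7)])
```

Including partially overlapping tokens is one way to handle highlight markers that fall inside a token; a stricter policy would require full containment.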

Long‑Term Applications

Multimodal and robotics attention steering (vision‑language, robotics, autonomous systems)
- Concept: Extend key‑editing to cross‑attention keys to prioritize image regions, frames, or sensors relevant to a task (e.g., hazard cues or tool targets).
- Potential products: “VisualFocus” for VLMs, sensor‑prioritization modules for robots.
- Assumptions/dependencies: Requires adaptation to multimodal architectures and safety validation; more research on cross‑attention dynamics.
Organization‑specific expert libraries for AdaSEKA (enterprise)
- Concept: Curate internal datasets (policy, SOPs, product docs) to train expert projections and enable automatic query‑aware routing.
- Potential products: “ExpertBank” that updates without model retraining; governance to add/remove experts.
- Assumptions/dependencies: Data engineering for expert creation; storage for per‑layer/head components; periodic evaluation to prevent drift.
Standards for emphasis markup and “attention policy” (policy, standards bodies)
- Concept: Define interoperable document/chat markup (beyond Markdown) and governance rules specifying which sections must be prioritized (e.g., safety, privacy).
- Potential products: Policy‑aware model gateways that enforce attention policies and log steering for audits.
- Assumptions/dependencies: Industry/vendor alignment; procedures for auditability and transparency.
Vendor‑level API support for highlight indices (cloud AI platforms)
- Concept: Expose a first‑class API parameter (token spans + gains) so customers can control attention without custom kernels.
- Potential products: “Highlight‑aware” endpoints with usage analytics and limits.
- Assumptions/dependencies: Requires platform changes; needs safeguards against misuse (e.g., bias amplification).
Training‑time integration and kernel/hardware co‑design (AI infrastructure)
- Concept: Fuse key‑editing into attention kernels; optionally co‑train small controllers to predict spans/gains automatically.
- Potential products: Optimized kernels in FlashAttention/TensorRT‑LLM; hardware primitives for low‑rank key edits.
- Assumptions/dependencies: Engineering and upstream acceptance; careful benchmarking for latency/accuracy trade‑offs.
Automated span selection and adaptive gains (software, research)
- Concept: Learn to detect salient spans (via saliency, retriever scores, or supervision) and adjust g+/g− dynamically per head and task.
- Potential products: “AutoFocus” modules that run pre‑inference scanning and produce steering plans.
- Assumptions/dependencies: Risk of instability/over‑steering; needs guardrails and validation.
Fairness‑aware and safety‑aware steering (public sector, regulated industries)
- Concept: Use steering to reduce known context biases (e.g., de‑emphasize spurious cues, emphasize verified attributes) with transparent logs.
- Potential products: Compliance dashboards showing which tokens were boosted/suppressed and why.
- Assumptions/dependencies: Strong governance; independent audits to avoid unintended bias amplification.
Cross‑lingual and domain generalization
- Concept: Build projection banks for multiple languages/domains and route based on query language/domain.
- Potential products: Multilingual helpdesk copilots; domain‑specialized reading assistants.
- Assumptions/dependencies: Requires per‑language/domain projection learning; storage and routing logic.
Steerable agents and multi‑tool workflows (agentic systems)
- Concept: Persist and propagate steering across tool calls to maintain focus on goals, evidence, or constraints.
- Potential products: “Attention plans” embedded in agent state; tool plugins that respect steering metadata.
- Assumptions/dependencies: Coordination across components; standard formats for carrying spans and gains.
Monitoring and observability for attention control (MLOps)
- Concept: Real‑time dashboards and alerts for steering intensity, affected heads, and outcome correlations.
- Potential products: “AttentionOps” suites with experiment tracking and rollback.
- Assumptions/dependencies: Requires logging at token/head granularity and privacy controls.

General Assumptions and Dependencies

Access: Most immediate uses require self‑hosted or open‑weights models that allow pre‑attention key edits; closed APIs need vendor buy‑in.
Portability: Projections are model‑family specific; while the paper releases matrices for some models, new models require a (training‑free) offline SVD step and head selection.
Span identification: Workflows depend on reliable methods (retrieval, heuristics, UI highlights) to pick token spans to steer.
Tuning: Gains (g+, g−) and head thresholds (δ_min) may need light validation per task/model; over‑steering can hurt performance if applied too broadly.
Governance: Steering can change which evidence dominates; logs and audits are recommended in regulated settings to ensure transparency and fairness.


Glossary

Adaptive SEKA (AdaSEKA): A query-adaptive variant of SEKA that blends multiple expert projections at inference time without training to tailor steering to the prompt’s intent. "Additionally, we propose Adaptive SEKA (AdaSEKA), an advanced variant that uses a training-free routing mechanism to dynamically combine multiple expert subspaces based on the prompt's semantic intent."
Attention logits: The pre-softmax scores produced by the attention mechanism for each query-key pair. "This adjustment modifies the attention logits as equation 5, where $q_i \in \mathbb{R}^{d_k}$ is the i-th query vector."
Attention score matrix: The matrix of unnormalised attention scores for all query-key pairs in a layer/head. "Current state-of-the-art methods, such as PASTA (Zhang et al., 2024), operate by editing the attention score matrix after it has been computed."
Attention steering: Techniques that intervene in the attention mechanism to direct model focus to specific tokens. "Attention steering is an important technique for controlling model focus, enabling capabilities such as prompt highlighting, where the model prioritises user-specified text."
Cross-covariance matrices: Matrices capturing the covariance between pairs of key embeddings from different conditions (e.g., positive vs. neutral). "compute cross-covariance matrices for each transformer layer l and key-value head h:"
Expert projections: Task-specific projection subspaces learned for different domains (e.g., factual recall, instruction following). "additionally, we propose Adaptive SEKA (AdaSEKA), an advanced variant that learns a bank of task-specific 'expert' projections (e.g., for factual recall versus instruction following)."
FlashAttention: An IO-aware, memory-efficient attention implementation that avoids materialising the full attention matrix. "making these methods incompatible with modern, IO-aware implementations like FlashAttention (Dao et al., 2022; Dao, 2024)"
IO-aware implementations: System designs (like FlashAttention) that optimise attention computation by minimising data movement and memory use. "making these methods incompatible with modern, IO-aware implementations like FlashAttention (Dao et al., 2022; Dao, 2024)"
Key embeddings: The vector representations of tokens used as keys in attention; here edited pre-attention to steer focus. "Both methods achieve prompt highlighting by directly editing key embeddings before the attention computation."
KV heads: Key-value attention heads within a transformer that process key/value projections; steering may target a subset of these. "we use an aggressive configuration that steers 175 out of 288 available KV heads."
l2 distance: The Euclidean distance used to quantify shifts between key embeddings under different relevance conditions. "Figure 1 shows the l2 distance between positive and negative key embeddings, averaged over all answer tokens from our synthetic dataset (as defined in Appendix A)."
Latent directions: Principal directions in representation space along which relevance features can be amplified. "SEKA uses spectral decomposition to steer key embeddings towards latent directions that amplify attention scores for certain tokens."
Logit distributions: The distribution of pre-softmax output scores; some methods steer by manipulating these directly. "We also compare with Selective Prompt Anchoring (SPA) (Tian & Zhang, 2025), a prompt highlighting method that operates on the logit distributions of the LLMs."
Lost-in-the-middle setting: A long-context phenomenon where models recall information better at the beginning and end than in the middle. "we introduce an additional experiment targeting positional recall in the challenging lost-in-the-middle setting (Liu et al., 2024)."
Low-rank relevance bias matrix: A structured additive term to attention scores derived from edited keys, expressible as a low-rank matrix. "It is algebraically equivalent to augmenting the original attention score matrix A with a low-rank relevance bias matrix B:"
Multi-head attention: The mechanism that computes attention using multiple parallel heads, each with its own query/key/value projections. "In standard multi-head attention, the unnormalised attention score between query i and key j is $\mathrm{Attn}(i, j) = q_i^\top k_j$, where $q_i, k_j \in \mathbb{R}^{d_k}$ are the query and key vectors"
Positional attention bias: A bias term in attention that depends on token positions rather than content. "positional calibration methods such as Found-in-the-Middle (Hsieh et al., 2024) subtract a baseline from the positional attention bias."
Projection matrix: A matrix used to project key embeddings onto relevance-aligned subspaces for amplification. "These learned directions are then used to construct a projection matrix that amplifies the relevant features of highlighted keys"
Prompt highlighting: Steering the model to prioritise user-specified parts of the prompt for improved focus and recall. "prompt highlighting, where the model prioritises user-specified text."
Query vector: The attention query representation that compares against keys to compute attention scores. "At inference time, we extract the query vector $q_{l,h}$ at layer l and head h of the last token in the prompt"
Query-key inner products: The dot products between queries and keys that determine attention scores. "Since attention depends on query-key inner products, equivalent control can be achieved by editing either representation"
Relevance subspace: A structured subspace capturing directions in key embeddings that correspond to relevance. "we can learn a universal 'relevance subspace' for a given task by applying spectral decomposition to key embeddings derived from contrastive prompts."
Routing mechanism: A procedure that dynamically selects or weights expert projections based on the prompt/query. "Additionally, we propose Adaptive SEKA (AdaSEKA), an advanced variant that uses a training-free routing mechanism to dynamically combine multiple expert subspaces based on the prompt's semantic intent."
Singular value decomposition (SVD): A factorisation used to derive principal projection directions and their importance weights. "Singular value decomposition (SVD) is then applied:"
Spectral decomposition: Decomposition of matrices into spectral components used to identify relevance-aligned directions. "SEKA uses spectral decomposition to steer key embeddings towards latent directions that amplify attention scores for certain tokens."
Spectral Editing of Activations (SEA): A prior algorithm for activation-level steering that inspired SEKA’s spectral approach to attention. "The core mechanism of SEKA is inspired by the Spectral Editing of Activations (SEA) algorithm (Qiu et al., 2024)"
U-shaped performance curve: The characteristic pattern where performance is higher at the start/end of context and lower in the middle. "resulting in a characteristic U-shaped performance curve."
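
Several of the entries above (key embeddings, projection matrix, query-key inner products, low-rank relevance bias matrix) fit together in a short derivation. The following is a sketch under stated assumptions rather than the paper's exact equations: it assumes an additive edit with a single gain $g$ and a single projection $P$, and it introduces $\mathcal{H}$ for the set of highlighted positions, $Q$ and $K$ for the matrices whose rows are the queries and keys, and $M$ for the diagonal 0/1 mask selecting highlighted columns. None of these symbols is taken from the source.

```latex
\begin{align*}
k_j' &= k_j + g\,P k_j \quad (j \in \mathcal{H}),
  \qquad k_j' = k_j \quad (j \notin \mathcal{H}) \\
q_i^\top k_j' &= q_i^\top k_j + g\, q_i^\top P k_j \quad (j \in \mathcal{H}) \\
A' &= A + B, \qquad
  B_{ij} = \begin{cases} g\, q_i^\top P k_j & j \in \mathcal{H} \\ 0 & \text{otherwise} \end{cases} \\
B &= g\, Q P K^\top M, \qquad \operatorname{rank}(B) \le \operatorname{rank}(P)
\end{align*}
```

Under these assumptions, editing keys before attention yields the same logits as adding $B$ to the score matrix afterwards, but $B$ never has to be formed explicitly, which is why the edit remains compatible with kernels that do not materialise the attention matrix.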

