
Distracting Attention Score (DAS)

Updated 1 June 2026
  • DAS is a quantitative metric that evaluates distraction by comparing subject or model behavior against a personalized or ideal baseline.
  • It employs methods like dynamic time warping for sensor alignment in cognitive studies and token probability analysis for LLM responses.
  • Experimental results show that DAS can localize distraction in real time and improve model robustness through careful calibration of hard distractors.

The Distracting Attention Score (DAS) quantifies the degree to which an individual or a model is diverted from an intended focus by extraneous information. Two distinct implementations of distraction scoring have been proposed: (1) in neurocognitive and human–robot interaction contexts, a continuous score assessing deviation from a personalized behavioral baseline, and (2) in Retrieval-Augmented Generation (RAG) for LLMs, a model-dependent score, the distracting effect (DE), measuring the extent to which irrelevant textual passages induce incorrect, non-abstaining responses. Both are quantitative and self-referenced rather than tied to an external standard: the behavioral score is measured against the subject's own baseline, and the LLM score against the model's own abstention probability. Both support fine-grained analysis, and the behavioral score can be computed in real time.

1. Formal Definitions

Cognitive Distraction (Human Behavioral Setting)

Given a driving session $j$ comprising $M$ synchronized sensor time series $D_j[1], \ldots, D_j[M]$, and a reference baseline session $D_0$ (no distraction), each time series is partitioned into before ($b$), during ($d$), and after ($a$) segments relative to distraction onset/offset.

The Distracting Attention Score for session $j$ during the distraction interval is:

$$\mathrm{DAS}_j = \sum_{i=1}^{M} w_i\,\frac{A_{j,i}^d}{\sigma_{0,i}^d}$$

where $A_{j,i}^d$ is the aligned (by dynamic time warping, DTW) Euclidean distance between segment $d$ of channel $i$ in session $j$ and baseline $D_0$, $w_i$ are per-sensor weights, and $\sigma_{0,i}^d$ normalizes for expected baseline variability. High DAS values correspond to greater overall deviation from baseline behavior during distraction (Garcia-Constantino et al., 2014).
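
The weighted, variance-normalized sum above can be sketched in a few lines of Python. This is a minimal illustration, not the cited authors' implementation: the function names `dtw_distance` and `das` are hypothetical, the DTW variant (unconstrained, accumulating squared differences and taking the square root at the end) is one common choice among several, and the weights and baseline deviations are assumed to be supplied by the caller.

```python
import math

def dtw_distance(x, y):
    """DTW-aligned Euclidean distance between two 1-D sequences.

    Assumption: classic unconstrained DTW over squared differences,
    with the square root taken over the optimal accumulated cost.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated squared difference aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

def das(session, baseline, weights, sigmas):
    """Sum over channels of weight * DTW distance / baseline deviation.

    session, baseline: lists of per-channel sequences for the distraction segment.
    weights, sigmas: per-channel weights and baseline standard deviations.
    """
    return sum(
        w * dtw_distance(s, b) / sigma
        for s, b, w, sigma in zip(session, baseline, weights, sigmas)
    )
```

For example, `das([[0, 0], [1, 2, 3]], [[1, 1], [1, 2, 3]], [0.5, 0.5], [2.0, 1.0])` contributes a nonzero term only for the first channel, since the second channel matches its baseline exactly.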

RAG Distracting Effect in LLMs

Given a query $q$, an irrelevant passage $p$, and an LLM, prompting the model with $(q, p)$ elicits either an answer or "NO-RESPONSE." The distracting effect (DE) of $p$ with respect to $q$ is defined as:

$$\mathrm{DE}_q(p) = 1 - P_{\mathrm{LLM}}(\text{NO-RESPONSE} \mid q, p)$$

$\mathrm{DE}_q(p)$ quantifies how likely the LLM, when presented with irrelevant context $p$, forgoes abstention and attempts to answer, with 1 representing maximal susceptibility to distraction (Amiraz et al., 11 May 2025).
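
As a worked illustration with hypothetical numbers (not taken from the cited study), suppose the model assigns probability 0.15 to "NO-RESPONSE" for one irrelevant passage and 0.90 for another. Taking the score as one minus the abstention probability, the first passage is strongly distracting and the second is not. The subscripted names below are chosen here only for readability.

```latex
\begin{align*}
\mathrm{DE}_1 &= 1 - 0.15 = 0.85 \\
\mathrm{DE}_2 &= 1 - 0.90 = 0.10
\end{align*}
```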

2. Methodologies for Computation

Behavioral (Human) DAS

  • Baseline Learning: Each subject completes a distraction-free drive, $D_0$, generating reference time series.
  • Dynamic Alignment: For each session, DTW aligns $D_j[i]$ with $D_0[i]$ to account for speed and timing variability.
  • Distance Calculation: For each segment $s \in \{b, d, a\}$ and sensor $i$, compute $A_{j,i}^s$.
  • Aggregation: DAS is the weighted, variance-normalized sum of channel-wise distances during distraction.
  • Real-time Capability: Aggregation may be computed in sliding windows for fine-grained, temporally localized assessment.
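
The sliding-window aggregation in the last step might look like the sketch below. It rests on a simplifying assumption that is not stated in the source: the session channels have already been aligned to the baseline (for example by DTW) and resampled so that sample `t` in the session corresponds to sample `t` in the baseline. The function name `sliding_das` and its parameters are hypothetical.

```python
import math

def sliding_das(session, baseline, weights, sigmas, window, step=1):
    """Return (start_index, score) pairs for a windowed, DAS-style score.

    Assumption: channels are already index-aligned with the baseline,
    so a plain Euclidean distance per window stands in for the DTW distance.
    """
    n = min(len(ch) for ch in list(session) + list(baseline))
    scores = []
    for start in range(0, n - window + 1, step):
        total = 0.0
        for s, b, w, sigma in zip(session, baseline, weights, sigmas):
            # Euclidean distance between this channel's window and the baseline window
            dist = math.sqrt(sum((s[t] - b[t]) ** 2 for t in range(start, start + window)))
            total += w * dist / sigma
        scores.append((start, total))
    return scores
```

The window with the largest score gives a rough estimate of where the deviation is concentrated; the window length trades temporal resolution against noise.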

LLM Distracting Effect

  • Prompt Construction: Query and passage are directly concatenated, instructing the LLM to reply only if the passage contains an answer; otherwise, emit "NO-RESPONSE."
  • Token Probability Extraction: The LLM is run in log-probabilities mode to compute the probability of emitting "NO-RESPONSE" as the first completion token.
  • Score Computation: $\mathrm{DE} = 1 - p_{\mathrm{NR}}$, where $p_{\mathrm{NR}}$ is the LLM's probability of "NO-RESPONSE."
  • Thresholds: Passages whose DE exceeds an upper cutoff are "hard" distractors; those whose DE falls below a lower cutoff are "weak" distractors.
  • Candidate Selection: In pipeline use, DE is computed for pools of retrieved or generated passages, enabling ranking by distractive potential.
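
The steps above can be sketched as a small ranking routine. Everything model-specific is left to the caller: `abstain_logprob` is a hypothetical callable that maps a prompt to the log-probability that the completion begins with "NO-RESPONSE", the prompt wording in `build_prompt` is illustrative rather than the cited prompt, and the cutoffs in `label` are parameters, not values from the source.

```python
import math

NO_RESPONSE = "NO-RESPONSE"

def build_prompt(query, passage):
    """Concatenate passage and query with an abstention instruction (illustrative wording)."""
    return (
        f"Passage: {passage}\n"
        f"Question: {query}\n"
        f"Answer only if the passage contains the answer; otherwise reply {NO_RESPONSE}."
    )

def rank_by_distracting_effect(query, passages, abstain_logprob):
    """Score each passage as 1 - P(NO-RESPONSE) and sort from most to least distracting.

    abstain_logprob: caller-supplied (hypothetical) function, prompt -> log-probability
    that the model's completion starts with NO-RESPONSE.
    """
    scored = []
    for passage in passages:
        p_abstain = math.exp(abstain_logprob(build_prompt(query, passage)))
        scored.append((passage, 1.0 - p_abstain))
    return sorted(scored, key=lambda item: item[1], reverse=True)

def label(score, hard_cutoff, weak_cutoff):
    """Bucket a score using caller-chosen cutoffs (no values are assumed here)."""
    if score > hard_cutoff:
        return "hard"
    if score < weak_cutoff:
        return "weak"
    return "intermediate"
```

Injecting the log-probability function keeps the sketch independent of any particular inference API, and one call per candidate makes the cost of scoring a pool explicit.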

3. Comparison of Approaches

| Approach | Baseline Required | Aggregation | Personalization |
| --- | --- | --- | --- |
| Behavioral DAS | Yes (per subject) | Multi-sensor | Baseline adaptation |
| LLM Distracting Effect | No (model only) | Per-passage | Model-dependent |

Behavioral DAS requires an individualized baseline for deviation to be meaningful, and it uses alignment to compensate for behavioral variability while supporting sensor weighting and continual calibration. The LLM distracting effect is computed per (query, passage, model) triple and reflects model-specific susceptibilities, enabling passage-wise ranking without external data.

4. Experimental Findings and Practical Implications

Behavioral DAS

  • Sensor Modalities: Speed, brake, accelerator, clutch, steering, engine RPM, lane position, heart rate.
  • Validation: Under distraction, DAS rises by a statistically significant 20–60% relative to baseline for heart rate, speed, and brake.
  • Sensitivity/Specificity: Some sensors (heart rate, speed) are highly sensitive while others (RPM) are less so, which motivates multi-sensor aggregation.
  • Temporal Localization: Sliding DAS correctly localizes major deviation intervals (e.g., phone calls).

LLM Distracting Effect

  • Datasets: Natural Questions, TriviaQA, PopQA, WebQuestions (1K–2K queries each).
  • Models: Llama-3.2-3B, Llama-3.1-8B, Llama-3.3-70B, Falcon-3/7B, Qwen-2.5-3/7B.
  • Inter-model Consistency: Spearman correlation of about 0.8 for DE across model pairs.
  • Retrieval/Generation: Answer-skewed retrievers and generator-based distractors systematically mine hard distractors.
  • Performance Impact: Augmenting the gold passage with a hard distractor reduces QA accuracy by 6–11 points; fine-tuning LLMs with hard distractors increases accuracy by up to 7.5%.
  • Concrete Cases: Hard distractors can systematically induce confident but incorrect responses in LLMs across diverse queries.
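
Inter-model consistency of this kind is typically checked by scoring the same passages with two models and computing a Spearman rank correlation over the two score lists. The sketch below is a plain-Python version under that assumption; the helper names `ranks` and `spearman` are introduced here, and ties receive average ranks.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5
```

A value near 1 means the two models agree on which passages are most distracting, even if their absolute scores differ.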

5. Integration and Applications

Behavioral DAS

  • Real-time Monitoring: Supports alert generation during driving for personalized cognitive distraction detection.
  • Sensor Fusion: Combination of sensor channels, with per-channel weights tunable by past performance, enhances detection and robustness.
  • Baseline Adaptation: Baseline series can be continually updated per subject for longer-term reliability.
  • Personalization: Fine-tuning weights and smoothing parameters by subject improves specificity.
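
One simple way to realize continual baseline adaptation is to blend each new distraction-free session into the stored baseline with an exponential moving average. The source does not specify an update rule, so the rule, the function name `update_baseline`, and the smoothing factor `alpha` are all assumptions made for illustration; the sketch also assumes sessions have been resampled to the baseline's length.

```python
def update_baseline(baseline, new_session, alpha=0.1):
    """Blend a new distraction-free session into the baseline, channel by channel.

    Assumption: exponential moving average with smoothing factor alpha;
    alpha = 0 keeps the old baseline, alpha = 1 replaces it outright.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [
        [(1 - alpha) * b + alpha * s for b, s in zip(base_ch, new_ch)]
        for base_ch, new_ch in zip(baseline, new_session)
    ]
```

A small `alpha` favors stability over responsiveness, which matters if an occasional atypical drive should not shift the subject's reference behavior.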

LLM Distracting Effect

  • RAG Fine-Tuning: Systematic inclusion of hard distractors during LLM training increases robustness against irrelevant context.
  • Negative Candidate Mining: Answer-skewed retriever modifications effectively surface distractive negatives.
  • Pipeline Use: DE computation is log-probability-based, requiring $O(N)$ additional forward passes for $N$ candidates.
  • Model-Specific Calibration: Distracting effect is model-dependent, allowing tailored evaluation and hard-negative mining per deployed LLM.
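
For fine-tuning, one plausible way to use scored candidates is to pair each gold passage with the highest-scoring distractors and shuffle their order so that position carries no signal. The sketch below assumes that recipe; the function name `build_training_context`, the top-`k` selection, and the seeded shuffle are illustrative choices rather than the procedure of the cited work.

```python
import random

def build_training_context(gold_passage, scored_distractors, k=1, seed=0):
    """Mix the gold passage with the k highest-scoring distractors in random order.

    scored_distractors: list of (passage_text, distracting_effect) pairs.
    """
    hardest = sorted(scored_distractors, key=lambda item: item[1], reverse=True)[:k]
    passages = [gold_passage] + [text for text, _ in hardest]
    # Seeded shuffle so the example is reproducible
    random.Random(seed).shuffle(passages)
    return passages
```

Because the score is model-dependent, the distractors selected this way should be scored with the same model that will be fine-tuned or deployed.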

6. Technical Considerations and Limitations

  • Cost: Behavioral calculation requires multi-channel DTW and ongoing sensor data; LLM-based DE computation incurs extra forward passes proportional to candidate set size.
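  • Normalization: Variance normalization is critical to render behavioral DAS values comparable across sensors and subjects. The LLM score is one minus a probability and therefore bounded in [0, 1], which makes it directly comparable across passages for a given model.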
  • Interpretation: Neither score depends on an external reference standard; behavioral DAS is measured against the subject's own baseline, while the LLM distracting effect is derived solely from the model's abstention probability in context.
  • No Single Gold Standard: In both settings, no individual feature or passage reliably predicts distraction—composite, multi-stream aggregation is essential for sensitivity and specificity.

7. Outlook and Relation to Broader Research

Both implementations of the Distracting Attention Score reflect a broader trend in measurement science toward data-driven, reference-efficient, and model-aware quantification of context-induced performance degradation. In behavioral settings, this enables personalized feedback and adaptive intervention. In information retrieval and language modeling, it underpins robust negative sampling and systematic benchmarking of model susceptibility to irrelevant or misleading context. These frameworks are extensible to other attention-critical domains, provided requisite baseline, alignment, or abstention-detection mechanisms are available.

A plausible implication is that similar metrics, integrating per-feature or per-passage deviation from a normative baseline, could be deployed to calibrate, monitor, or stress-test autonomous agents operating in high-noise, high-stakes environments.
