Context-AwarE Sampling (CAES)

Updated 4 July 2026

Context-AwarE Sampling (CAES) is a family of mechanisms that condition sampling decisions on contextual state rather than on fixed schedules.
It adapts key parameters, such as sampling frequency, window size, and selection index, in applications like video keyframe selection and sensor data acquisition.
Results reported across these domains suggest that CAES can improve efficiency, risk management, and decision quality by aligning sampling policies with the current context.

Context-AwarE Sampling (CAES) denotes sampling or selection policies whose decisions are conditioned on contextual state rather than on uniform schedules, fixed windows, or context-oblivious heuristics. Within the literature considered here, the term is explicit in VADER, where $K=\mathrm{CAES}(S)$ maps a dense framewise anomaly-score sequence to a compact set of pre-event, on-event, and post-event keyframes (Cheng et al., 10 Nov 2025). Closely related work applies the same principle to contextual bandits, adaptive sensing, time-series domain adaptation, speculative decoding, crowdsensing, knowledge-graph completion, and activity-recognition query selection, which suggests that CAES is best understood as a cross-domain design pattern for context-conditioned data acquisition, subsampling, or proposal construction (Bouneffouf, 2014, Huang et al., 12 Apr 2025, Lai et al., 2023, Zhang et al., 11 Oct 2025, Nguyen et al., 2017, Gul et al., 25 Feb 2025, Civitarese et al., 2019).

1. Definition and conceptual scope

Within the cited work, CAES is not a single standardized algorithm. In some papers it is an explicit module name, as in VADER’s keyframe selector; in others it is an exact or near-exact characterization of the underlying method, as with Freshness-Aware Thompson Sampling for recommendation, DQN-based adaptive sampling for multi-sensor systems, and context-aware window selection for time-series domain adaptation (Cheng et al., 10 Nov 2025, Bouneffouf, 2014, Huang et al., 12 Apr 2025, Lai et al., 2023). This suggests that CAES is best interpreted as a family of mechanisms in which contextual variables alter either the probability of sampling, the support of the proposal distribution, the size of the sampled subset, or the timing of human queries.

The defining feature is therefore not the application domain but the dependence of the sampling rule on state. That state may be a semantic situation $S^t$, a framewise anomaly score sequence $S$, an MDP state $s_t$, a dynamic contact graph $G_{t_s}$, or a decoder context $c$. Across these formulations, CAES serves one or more of four functions: enforcing relevance, preserving causal or temporal structure, managing risk or user burden, and reducing computational or energy cost.

2. Recurrent formulation across domains

Across the literature, CAES appears in several technically distinct but structurally homologous forms.

Setting	Context source	Sampled object
Video anomaly understanding (Cheng et al., 10 Nov 2025)	Framewise anomaly scores and score gradients	Keyframe indices $K$
Context-aware recommendation (Bouneffouf, 2014)	Semantic situation, risk, and freshness	Documents
Crowdsensing (Nguyen et al., 2017)	Contact graph, node observability, utility	Sensing devices
Multi-sensor adaptive sampling (Huang et al., 12 Apr 2025)	Observation values, remaining power, history	Sampling actions/frequencies
Time-series domain adaptation (Lai et al., 2023)	Source/target encoder states	Context-window sizes
Speculative decoding (Zhang et al., 11 Oct 2025)	Current decoding context	Vocabulary shortlist
Knowledge-graph completion (Gul et al., 25 Feb 2025)	Head, relation, and tail neighborhoods	High-density neighbors

Taken together, these formulations suggest a recurring CAES pipeline: observe or infer context, score candidate items or windows, enforce a budget or quota, and pass the retained subset to a downstream estimator, learner, or verifier. The sampled object varies—documents, keyframes, sensors, windows, token clusters, or graph neighbors—but the purpose is stable: improve efficiency, causal coverage, or decision quality under computational or behavioral constraints.
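The recurring pipeline can be written down as a minimal Python sketch. The function name `caes_select`, its parameters, and the choice of a simple top-`budget` cut are illustrative assumptions rather than anything prescribed by the cited papers; individual methods replace the scoring and budgeting steps with their own rules.

```python
from typing import Callable, List, Sequence, TypeVar

Item = TypeVar("Item")
Context = TypeVar("Context")


def caes_select(
    candidates: Sequence[Item],
    context: Context,
    score: Callable[[Item, Context], float],
    budget: int,
) -> List[Item]:
    """Return at most `budget` candidates, ranked by a context-conditioned score.

    Hypothetical helper: the score function stands in for whatever
    context model a concrete method uses (anomaly scorer, router, utility).
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Step 1 (observe context) is represented by the `context` argument.
    # Step 2: score every candidate against the current context.
    ranked = sorted(candidates, key=lambda item: score(item, context), reverse=True)
    # Step 3: enforce the budget; step 4 is left to the downstream consumer.
    return ranked[:budget]
```

For example, with `candidates=["a", "b", "c"]`, a context dictionary of per-item relevance `{"a": 0.1, "b": 0.9, "c": 0.5}`, a score function that looks the item up in that dictionary, and `budget=2`, the call returns `["b", "c"]`.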

3. Temporal and causal context selection

In VADER, raw frames $I_1,\dots,I_T$ are first scored by a CLIP-based anomaly scorer with

$S_i = \max_c \big(p_A(I_i)\cdot p_{c|A}(I_i)\big),$

after which CAES returns $K=\mathrm{CAES}(S)$. Anomalous intervals are detected using a per-video adaptive threshold at the 97th percentile of the framewise scores $S$; pre-event and post-event boundaries are then expanded using score slopes computed over a 5-frame window with rise and calm thresholds at the 95th and 85th percentiles, respectively. The implementation uses a maximum context window of 30 frames, samples 4 pre-event, 8 on-event, and 4 post-event frames per event, and fills the remaining slots with background frames up to 64 total (Cheng et al., 10 Nov 2025).
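The quota logic can be illustrated with a simplified Python sketch. It is not VADER's implementation: the slope-based boundary expansion is replaced here by a fixed context window of up to 30 frames on each side of an event, and the nearest-rank percentile, the even-spacing rule, and all function names are assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (0 < q <= 100) of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def evenly_spaced(indices, n):
    """Pick at most n entries spread evenly over a list of indices."""
    if n <= 0 or not indices:
        return []
    if len(indices) <= n:
        return list(indices)
    step = len(indices) / n
    return [indices[int(i * step)] for i in range(n)]


def find_events(scores, threshold):
    """Return (start, end) pairs, end inclusive, of runs with score >= threshold."""
    events, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            events.append((start, i - 1))
            start = None
    if start is not None:
        events.append((start, len(scores) - 1))
    return events


def select_keyframes(scores, n_pre=4, n_on=8, n_post=4,
                     max_context=30, total=64, q=97):
    """Select pre/on/post keyframes per event, then fill with background frames."""
    threshold = percentile(scores, q)
    chosen, covered = set(), set()
    for start, end in find_events(scores, threshold):
        # Simplification: fixed-width context instead of slope-based expansion.
        pre = list(range(max(0, start - max_context), start))
        on = list(range(start, end + 1))
        post = list(range(end + 1, min(len(scores), end + 1 + max_context)))
        for pool, quota in ((pre, n_pre), (on, n_on), (post, n_post)):
            covered.update(pool)
            chosen.update(evenly_spaced(pool, quota))
    # Background frames come only from outside every event's context window.
    background = [i for i in range(len(scores)) if i not in covered]
    chosen.update(evenly_spaced(background, total - len(chosen)))
    # Crude cap in temporal order if many events exceed the total budget.
    return sorted(chosen)[:total]
```

On a 200-frame sequence with a single high-score run, this sketch yields 4 pre-event, 8 on-event, and 4 post-event indices plus 48 background indices, matching the 64-frame budget described above.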

This temporal partitioning is explicitly causal: pre-event frames capture lead-up, on-event frames the anomalous action, and post-event frames consequences. On HAWK anomaly description, full VADER reports BLEU 0.718, ROUGE 0.283, Reasonability 0.428, Detail 0.442, and Consistency 0.357, whereas removing CAES yields BLEU 0.594, ROUGE 0.244, Reasonability 0.387, Detail 0.435, and Consistency 0.331. Against alternative keyframe selection strategies, CAES reaches BLEU 0.668 and ROUGE 0.274 with Reasonability 0.419, Detail 0.477, and Consistency 0.343, exceeding uniform sampling, top-score frame selection, and ATS on the judge-based metrics. The dynamic-window version also slightly exceeds fixed-window and exponential-interval variants.

A second temporal instantiation appears in time-series domain adaptation, where the sampled object is not frames but source and target context-window sizes. ContexTDA defines state $S^t$ 2, action $S^t$ 3, and reward

$S^t$ 4

The learned policy selects window sizes per time step and per domain rather than imposing a single global window. On SMD, ContexTDA averages 0.63 Macro-F1 and 0.75 AUC, compared with 0.58/0.67 for AE-LSTM and 0.56/0.69 for RandContexTDA; on Boiler it reaches 0.50/0.64 (Lai et al., 2023). These results suggest that temporal CAES can function either as narrative keyframe shaping or as adaptive context-length selection for transfer.

4. Risk-, uncertainty-, and burden-aware CAES

Freshness-Aware Thompson Sampling frames user-document interaction in a Context-Aware Recommender System as a situation bandit. Situations are semantic triples $S^t$ 5, risk is aggregated as

$S^t$ 6

memory retention is

$S^t$ 7

and the context-aware selection index is

$S^t$ 8

with

$S^t$ 9

$S$ 0, and $S$ 1. In a 28-day study with 3500 users receiving 10 documents per session, FA-TS attains AP 0.6542 versus 0.4950 for standard Thompson Sampling while maintaining similar ATSD (Bouneffouf, 2014). The intended policy is “safe exploration when risk is low, conservative exploitation when risk is high.”

CAVIAR applies the same context-conditioned selection logic to label acquisition. An Online Random Forest first outputs $S$ 2; semantic reasoning over semantic location, proximity to transportation routes, time of the day, and related context removes context-inconsistent activities and renormalizes the remaining probabilities to $S$ 3. If $S$ 4 with $S$ 5, the segment is self-labeled and added to the model; if $S$ 6 with $S$ 7, the user is queried. On 26 subjects and 14 activities, average F1 rises from 0.64 without context and 0.81 with context as features to 0.88 with CAVIAR, while the percentage of triggered queries drops from about 40% and 35% to about 6% (Civitarese et al., 2019).
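The filter, renormalize, and decide steps can be sketched in Python as follows. The two thresholds, their default values, the intermediate "predict" outcome, and the function names are hypothetical; the exact decision conditions used by CAVIAR are not reproduced here.

```python
def refine_with_context(probs, consistent):
    """Drop context-inconsistent activities and renormalize the remainder."""
    kept = {activity: p for activity, p in probs.items() if activity in consistent}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {activity: p / total for activity, p in kept.items()}


def decide(probs, consistent, self_label_threshold=0.8, query_threshold=0.6):
    """Return (action, activity) for one segment.

    Assumed rule: self-label when the refined top probability is high,
    query the user when it is low, otherwise just report the prediction.
    """
    refined = refine_with_context(probs, consistent)
    if not refined:
        # Context ruled out every candidate, so only the user can resolve it.
        return ("query", None)
    best, p_best = max(refined.items(), key=lambda kv: kv[1])
    if p_best >= self_label_threshold:
        return ("self_label", best)
    if p_best < query_threshold:
        return ("query", best)
    return ("predict", best)
```

With classifier output `{"walking": 0.45, "cycling": 0.05, "driving": 0.5}` and a context that rules out driving, the refined walking probability becomes 0.9 and the segment is self-labeled, even though the raw classifier favored driving.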

These results suggest that in recommendation and active learning, CAES is not chiefly about lowering sample count in the abstract. Its function is to redistribute exploration, pseudo-labeling, and human queries toward contexts where they are least disruptive or most informative.

5. Resource- and coverage-driven adaptive sensing

Resource-aware CAES is explicit in multi-sensor adaptive acquisition. The DQN formulation uses an MDP $S$ 8 whose state includes the current observation value, remaining power, and historical sampling records of each sensor, and whose reward is

$S$ 9

balancing information gain, energy consumption, and redundancy. On the Intel Lab Data dataset, DQN adaptive sampling reports average data quality 0.83, average energy consumption 93.6 mJ, redundancy rate 15.1%, and critical event detection rate 89.3%, compared with 0.72, 145.3 mJ, 38.6%, and 82.1% for fixed-frequency sampling (Huang et al., 12 Apr 2025).

In opportunistic mobile social networks, context is the previous sensing interval’s contact graph. HCONTEXT defines node observability for a non-sensing node $s_t$ 0 as

$s_t$ 1

and coverage utility for a sensing node $s_t$ 2 as

$s_t$ 3

Each round retains the current sensors with the highest coverage utility and fills the remaining slots in the sensing budget with the non-sensing nodes of highest observability. On SIGCOMM and UIM traces, HCONTEXT consistently outperforms RANDOM and GREEDY in sensing coverage, and RANDOM often outperforms GREEDY because greedy degree-based selection leads to heavy overlap and poor marginal gains (Nguyen et al., 2017).
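The retain-then-fill round can be sketched in Python, taking the two scores as precomputed inputs because their defining formulas are not reproduced here. The function name `select_sensors` and the parameters `budget` (total sensors per round) and `keep` (how many current sensors to retain) are illustrative assumptions.

```python
def select_sensors(utility, observability, budget, keep):
    """Choose the sensing set for the next round.

    utility:       dict mapping each currently sensing node to its coverage utility
    observability: dict mapping each non-sensing node to its observability
    budget:        total number of sensing nodes allowed in the round (assumed name)
    keep:          number of current sensors to retain (assumed name)
    """
    if budget < 0 or keep < 0:
        raise ValueError("budget and keep must be non-negative")
    keep = min(keep, budget, len(utility))
    # Retain the current sensors that contribute the most coverage.
    retained = sorted(utility, key=utility.get, reverse=True)[:keep]
    # Fill the remaining slots with the most observable non-sensing nodes.
    recruits = sorted(observability, key=observability.get, reverse=True)[: budget - keep]
    return retained + recruits
```

Separating the two rankings is what distinguishes this scheme from a single greedy degree-based ranking: retained sensors are judged by what they already cover, while recruits are judged by how observable they were in the previous interval's contact graph.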

A third allocation variant appears in multi-fidelity importance sampling. There the sampled object is the number of high-fidelity draws $s_t$ 8, jointly optimized with surrogate fidelity $s_t$ 9 through

$G_{t_s}$ 0

under an MSE bound

$G_{t_s}$ 1

The optimization $G_{t_s}$ 2 subject to $G_{t_s}$ 3 selects surrogates with lower fidelity than traditional model-reduction tolerances, yielding runtime speedups of up to one order of magnitude in the presented examples (Alsup et al., 2020). In this form, CAES becomes computational budget allocation rather than object selection.

6. Large output spaces, structured prediction, and cross-cutting limitations

CAES also appears when the sampled object is a support set in a very large output space. DynaSpec partitions the vocabulary into clusters $G_{t_s}$ 4, uses a lightweight router to score clusters from the current context, and forms a context-dependent shortlist

$G_{t_s}$ 5

from the union of the top-scoring clusters. The drafter computes only over this shortlist, while the target model still verifies over the full vocabulary, so exactness is preserved. On Llama-3-8B-Instruct with a 128k vocabulary, mean accepted length averages 3.90 for DynaSpec with an effective shortlist of about 27k, compared with 3.74 for FR-Spec with a fixed 32k shortlist and 4.00 for full-vocabulary EAGLE-2; on Code the corresponding numbers are 4.71, 4.11, and 4.77 (Zhang et al., 11 Oct 2025).
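The shortlist construction can be sketched in Python under simplifying assumptions: the router is abstracted as a list of per-cluster scores already computed from the current context, clusters are plain lists of token ids, and the name `build_shortlist` and the parameter `k` are hypothetical.

```python
def build_shortlist(cluster_scores, clusters, k):
    """Return the union of token ids in the k highest-scoring clusters.

    cluster_scores: one router score per cluster for the current context
    clusters:       list of token-id lists forming a partition of the vocabulary
    k:              number of clusters to keep (assumed parameter name)
    """
    if len(cluster_scores) != len(clusters):
        raise ValueError("need exactly one score per cluster")
    top = sorted(range(len(clusters)),
                 key=lambda c: cluster_scores[c], reverse=True)[:max(k, 0)]
    shortlist = set()
    for c in top:
        shortlist.update(clusters[c])
    return shortlist
```

Because clusters differ in size, the shortlist size varies with context even when `k` is fixed, which is consistent with the "effective" shortlist size reported above being an average rather than a constant.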

Structured prediction in knowledge graphs uses a different support-selection problem. MuCoS defines multi-context-aware sampling over head, relation, and tail contexts, scores entities by density

$G_{t_s}$ 8

and keeps the highest-density neighbors to form optimized contexts such as

$c$ 0

These aggregated contexts are concatenated and passed to BERT, and cross-entropy over relations or entities removes the need for negative triplet sampling. On KEGG50k, MuCoS improves over existing models by 13% on MRR, 7% on Hits@1, 4% on Hits@3, and 18% on Hits@10 for the general relationship task, and by 6% on MRR, 1% on Hits@1, 3% on Hits@3, and 12% on Hits@10 for drug-target relationship prediction (Gul et al., 25 Feb 2025).

Cross-cutting limitations are consistent. VADER notes dependence on anomaly-scorer quality, a bias toward high-motion events, object-centric downstream modeling, and fixed per-event sampling counts (Cheng et al., 10 Nov 2025). The DQN-based sensor framework notes that the detailed definitions of the information-gain, energy-consumption, and redundancy terms in its reward must be carefully designed and validated in a real implementation, and it assumes a centralized controller (Huang et al., 12 Apr 2025). DynaSpec can fail through misrouting or domain shift, though full-vocabulary verification preserves the target distribution (Zhang et al., 11 Oct 2025). FA-TS also notes that a general formulation of Context-AwarE Sampling is not spelled out (Bouneffouf, 2014). Taken together, these limitations indicate that CAES remains a task-specific family of context-conditioned allocation mechanisms rather than a settled formal framework.