Age-Dependent Multi-Threshold Policy

Updated 27 December 2025

An age-dependent multi-threshold policy defines specific AoI thresholds at which the prescribed action switches, ensuring timely updates in resource-constrained systems.
The framework leverages MDP/SMDP models with methods such as convex optimization, deep reinforcement learning, and bisection to derive precise threshold values.
It offers scalable, decentralized control in heterogeneous systems by simplifying decision rules and enhancing performance in latency, energy, and cost tradeoffs.

An age-dependent multi-threshold policy is a class of deterministic, stationary decision rules for discrete- or continuous-time stochastic control, scheduling, and resource allocation problems. These policies use the Age of Information (AoI), the time elapsed since the most recent successful update, decision, or event, as a central state variable, and assign actions according to threshold crossings in the AoI (possibly vector-valued) or related metrics. Distinct thresholds are defined per action, user, channel, file, or system configuration, yielding a "multi-threshold" structure in which the action switches at critical AoI values. The resulting policies are analytically tractable, simple to implement, and interpretable, which makes them attractive in heterogeneous systems that must trade off latency, reliability, information freshness, cost, and resource constraints.

1. Mathematical Structure and General Principles

The canonical mathematical foundation of age-dependent multi-threshold policies is the Markov Decision Process (MDP) or Semi-Markov Decision Process (SMDP) with a cost or reward functional depending on AoI and possibly additional resource variables (e.g., power, energy, update budgets). The decision at each slot or epoch is based on the state vector

$s_t = ( \Delta_t^1, \ldots, \Delta_t^N, \text{other components} ),$

where $\Delta^i_t$ is the AoI of user/stream $i$ at time $t$, and the other components are additional process or system variables (e.g., number of retransmissions, power buffer, service status).

A multi-threshold policy specifies, for each action or user $i$ and system configuration (such as ARQ round, server index, or file class), a set of thresholds $\{ \tau_{i,k} \}$ such that:

Action $a$ is taken if and only if $\Delta^i > \tau_{i,k}$, which gives a deterministic state-to-action mapping. When the AoI crosses the relevant threshold, the policy switches action, e.g., from idling to updating, from server $j$ to server $k$, or from retransmitting to sending a fresh update.

The thresholds are obtained by solving the underlying constrained optimization problem or Bellman equations, often under energy, cost, or other resource constraints, with the threshold structure exploited to reduce the search.

2. Exemplary Models and Derivation of Thresholds

Representative age-dependent multi-threshold policies have been derived for a broad range of information-update and decision systems, including:

(a) Multi-User HARQ-aided NOMA via DRL

In downlink multi-user HARQ-CC NOMA networks, the AoI minimization problem is formulated as an MDP with state comprising the AoI vector, current ARQ round vector, and Chase-combining power-buffers. Utilizing a Double-Dueling Deep Q Network (DQN), the solution is found to exhibit a deterministic multi-threshold structure: for user $i$ in ARQ round $k$, there exists a threshold $\tau_{i,k}$ such that

$\text{Retransmit if }\Delta_i > \tau_{i,k},\quad \text{otherwise start fresh}.$

These thresholds adapt as functions of user channel quality $h_i$, ARQ budget $T_\mathrm{max}$, and system operating point, and have been tabulated numerically (e.g., for $N=4$, $T_\mathrm{max}=3$, SNR $=0$ dB: $\tau_{4,1}=6$, $\tau_{4,2}=12$) (Liu et al., 2023).
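The retransmission rule above amounts to a table lookup keyed by (user, ARQ round). The following Python sketch illustrates this. The function name `harq_action` and the dictionary layout are assumptions made for illustration; only the two threshold values quoted above come from the text, and the remaining entries of the tabulated policy are not reproduced here.

```python
# Illustrative lookup for the rule "retransmit if AoI > tau[i, k], else start fresh".
# Only the two entries quoted in the text are filled in; the dict layout and the
# function name are hypothetical and not taken from the cited work.
THRESHOLDS = {
    (4, 1): 6,   # user 4, ARQ round 1
    (4, 2): 12,  # user 4, ARQ round 2
}


def harq_action(user: int, arq_round: int, aoi: int, thresholds=THRESHOLDS) -> str:
    """Return 'retransmit' or 'fresh' for the given user, ARQ round, and AoI."""
    tau = thresholds[(user, arq_round)]  # raises KeyError if the entry is not tabulated
    return "retransmit" if aoi > tau else "fresh"
```

Because the comparison is strict, an AoI exactly equal to the threshold maps to a fresh transmission, matching the inequality as written above.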

(b) Multi-Server, Age-Dependent Server Selection

For single-source, multi-server finite-buffer systems where updates are sent over servers with discrete phase-type–distributed service times and heterogeneous costs, the policy is formalized as a vector $\tau = (\tau_1,\ldots,\tau_J)$ partitioning the AoI domain. At idle epochs, if AoI is in $[\tau_{k-1},\tau_k)$, the action is to use server $k$ (if $k>0$) or idle ($k=0$). The thresholds are optimized exactly via a multi-regime absorbing Markov chain (MR-AMC) framework, yielding the stationary AoI distribution and exact cost expressions for both information age and resource consumption (Akar et al., 20 Dec 2025).
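Selecting a server from a partition of the AoI axis is a binary search over the boundary vector. The sketch below shows one way to do this in Python. The indexing is an assumption: the boundaries are passed as `[tau_0, ..., tau_{J-1}]`, ages below `tau_0` map to idling, and the last region is taken to extend to infinity. The function name `select_server` is hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def select_server(aoi: float, boundaries: Sequence[float]) -> int:
    """Return 0 to idle, or k >= 1 to use server k.

    Assumed convention: boundaries = [tau_0, tau_1, ..., tau_{J-1}] is
    nondecreasing, ages below tau_0 map to idling, ages in [tau_{k-1}, tau_k)
    map to server k, and the region of server J extends to infinity.
    """
    if any(b < a for a, b in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be nondecreasing")
    # bisect_right counts the boundaries that are <= aoi, which is the region index.
    return bisect_right(boundaries, aoi)
```

With boundaries `[3, 7]`, for example, an AoI of 2 maps to idling, 3 through 6 map to server 1, and 7 or more maps to server 2. Two equal boundaries leave the corresponding server with an empty region, which is how a server that is never used can be represented.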

(c) Information Fusion and Distortion Constraints

In status update systems with age-dependent distortion requirements and energy constraints, the optimal policy admits a single-threshold structure (or at most a mixture of two thresholds): an update is sent if the AoI exceeds a critical value determined by the distortion and energy Lagrangian parameters. For piecewise age-dependent distortion, explicit closed-form threshold maps are derived via KKT conditions, giving a complete characterization of the age-energy-distortion tradeoff (Yao et al., 2023).
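To see why a mixture of two thresholds can arise, consider a deliberately simplified model, which is an assumption and not the system of the cited work: time is slotted, every update succeeds, the AoI resets to 1 after an update, distortion is ignored, and the average update rate is capped by a budget. When the reciprocal of the budget is not an integer, randomizing between the two adjacent integer thresholds meets the budget with equality. The function name and return layout below are hypothetical.

```python
import math


def two_threshold_mixture(budget: float):
    """Return (lo, hi, q, avg_aoi) for an average update-rate budget.

    Toy model (an assumption, not the cited system): slotted time, every update
    succeeds, AoI resets to 1 after an update, and at most `budget` updates per
    slot are allowed on average, with 0 < budget <= 1. Each cycle uses
    threshold `lo` with probability q and threshold `hi` otherwise.
    """
    if not 0.0 < budget <= 1.0:
        raise ValueError("budget must lie in (0, 1]")
    target = 1.0 / budget                  # required mean cycle length
    lo, hi = math.floor(target), math.ceil(target)
    q = 1.0 if lo == hi else hi - target   # solves q*lo + (1-q)*hi = target
    mean_len = q * lo + (1.0 - q) * hi
    # Renewal-reward argument: a cycle with threshold t accumulates 1 + 2 + ... + t.
    mean_area = q * lo * (lo + 1) / 2 + (1.0 - q) * hi * (hi + 1) / 2
    return lo, hi, q, mean_area / mean_len
```

For a budget of 0.25 the reciprocal is an integer and a single threshold of 4 suffices. For a budget of 0.4 the sketch mixes thresholds 2 and 3 with equal probability.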

(d) File Cache with Age-Dependent Update Durations

For cache updating where refresh durations depend (nonlinearly) on both file identity and its AoI, the age-dependent multi-threshold policy is shown to be asymptotically optimal for minimizing popularity-weighted AoI under one-at-a-time update constraints. Each file $n$ is updated exactly when its AoI hits its own threshold $\tau^*_n$, with each threshold computed via convex optimization (see the table below) (Tang et al., 2019).

File $n$	Popularity $p_n$	Optimal Threshold $\tau^*_n$ (constant duration)
1	$p_1$	$B_1\cdot \sum_i \sqrt{p_i B_i}/\sqrt{p_1 B_1}$
$\cdots$	$\cdots$	$\cdots$
$N$	$p_N$	$B_N\cdot \sum_i \sqrt{p_i B_i}/\sqrt{p_N B_N}$

(All thresholds are obtained via Lagrangian duality and explicit inversion for separable $h_n(\lambda)$.)
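The constant-duration formula in the table can be evaluated directly. The Python sketch below does so. The function name and argument names are assumptions for illustration, with `popularity` holding the $p_n$ and `duration` holding the $B_n$.

```python
import math
from typing import List, Sequence


def constant_duration_thresholds(popularity: Sequence[float],
                                 duration: Sequence[float]) -> List[float]:
    """Evaluate tau_n = B_n * sum_i sqrt(p_i * B_i) / sqrt(p_n * B_n) for each file."""
    if len(popularity) != len(duration):
        raise ValueError("popularity and duration must have the same length")
    if any(p <= 0 for p in popularity) or any(b <= 0 for b in duration):
        raise ValueError("popularities and durations must be positive")
    roots = [math.sqrt(p * b) for p, b in zip(popularity, duration)]
    total = sum(roots)
    return [b * total / r for b, r in zip(duration, roots)]
```

Two sanity checks follow from the formula itself. With $N$ equally popular files of unit duration, every threshold equals $N$, which corresponds to a round-robin schedule. In general the ratios $B_n/\tau^*_n$ sum to one, which is consistent with an updater that is always busy if each file occupies it for $B_n$ out of every $\tau^*_n$ time units.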

3. Structural Properties and Theoretical Guarantees

Deterministic, Stationary, and Index-Free Mapping

Age-dependent multi-threshold policies are deterministic: for every admissible system state, the prescribed action is unique. They are also stationary: thresholds do not evolve with time but depend only on system parameters, user or file identities, and possibly per-epoch variables (e.g., ARQ attempt count, service regime). They avoid explicit index calculation or global sorting (unlike Whittle-index or Max-Weight policies), which simplifies decentralized implementation (Jiang, 2020).

Optimality Justification

Optimality (or asymptotic optimality) derives from the monotonicity and switching structure of the Bellman equations or value functions in the AoI variables: the difference in cost-to-go between two actions is ordered consistently on either side of a critical AoI value, so the preferred action changes as the AoI crosses it. This induces clear switching boundaries and justifies the threshold scheme for both additive linear and nonlinear cost functions (e.g., polynomial, logarithmic), under convexity and mild technical conditions. For MDPs with additive AoI cost and finite state/action space, multi-thresholds are the minimal sufficient structure for deterministic optimality (Liu et al., 2023).

Scalability and Decentralizability

In high-dimensional multi-agent or class-heterogeneous systems, analytic multi-thresholds enable implementation with minimal coordination. Each agent (or class) acts locally when its own AoI hits its threshold, which is broadcast once and remains static; this ensures scalability to large networks where index policies become intractable (Jiang, 2020).

4. Computational and Algorithmic Implementation

Threshold values are computed via:

Convex optimization and Lagrangian duality: Couple the separable per-file/user/stream costs through a common multiplier for the resource constraint, then solve the KKT or fixed-point conditions, which yields closed-form thresholds or one-dimensional search problems (Tang et al., 2019, Yao et al., 2023).
Value iteration and policy iteration: Apply DP (value or policy iteration) exploiting the monotonicity of value differences to identify threshold crossing points (Gong et al., 2021).
Bisection and Dinkelbach’s method: For fractional or parametric optimization, reduce the problem to finding the root at which a parametric cost difference crosses zero (Dinkelbach); bisection on this root typically requires only $O(\log(1/\epsilon))$ iterations to reach accuracy $\epsilon$ (Pan et al., 2020).
Machine learning (deep reinforcement learning): When system dynamics are complex, thresholds can be emergent properties extracted from trained neural Q-functions, as in DRL-assisted policies (Liu et al., 2023).
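As an illustration of the dynamic-programming route in the list above, the sketch below runs value iteration on a toy single-source model and reads off the threshold from the sign of the value difference. The model is an assumption made for this example and is not taken from any of the cited works: the AoI is truncated at `max_age`, the per-slot cost is the AoI plus `update_cost` when an update is attempted, an attempt succeeds with probability `success_prob` and resets the AoI to 1, and costs are discounted.

```python
def value_iteration_threshold(max_age=50, update_cost=5.0, success_prob=0.8,
                              discount=0.95, tol=1e-9):
    """Return (threshold, policy) for a toy discounted AoI MDP (assumed model).

    policy[a - 1] is True when updating is strictly better than idling at AoI a;
    threshold is the smallest such AoI, or None if updating is never preferred.
    """
    ages = range(1, max_age + 1)
    V = [0.0] * (max_age + 1)  # V[a] for a = 1..max_age; index 0 is unused
    while True:
        new_V = V[:]
        for a in ages:
            nxt = min(a + 1, max_age)  # AoI grows by one slot, truncated at max_age
            q_idle = a + discount * V[nxt]
            q_update = a + update_cost + discount * (
                success_prob * V[1] + (1.0 - success_prob) * V[nxt])
            new_V[a] = min(q_idle, q_update)
        change = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if change < tol:
            break
    policy = []
    for a in ages:
        nxt = min(a + 1, max_age)
        # Advantage of updating over idling; nondecreasing in a because V is.
        gain = discount * success_prob * (V[nxt] - V[1]) - update_cost
        policy.append(gain > 0)
    threshold = next((a for a, upd in zip(ages, policy) if upd), None)
    return threshold, policy
```

In this toy model the value function is nondecreasing in the AoI at every iteration, so the advantage of updating is nondecreasing as well and the resulting policy switches from idling to updating at most once.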

These numerical procedures are efficient because they exploit monotonicity, convexity, and the explicit threshold structure of the solution.
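The logarithmic iteration count of bisection can be made concrete with a short Python sketch. It assumes a nondecreasing cost-difference function `diff` that changes sign on the search interval; the function name `bisect_threshold` and its arguments are hypothetical, and the quadratic used in the usage note is only a stand-in for a real cost difference.

```python
import math
from typing import Callable


def bisect_threshold(diff: Callable[[float], float], lo: float, hi: float,
                     eps: float = 1e-6) -> float:
    """Locate the sign change of a nondecreasing function `diff` on [lo, hi].

    Runs ceil(log2((hi - lo) / eps)) halvings, so the final bracket is no
    wider than eps and the returned midpoint is within eps / 2 of the root.
    """
    if not lo < hi:
        raise ValueError("need lo < hi")
    if diff(lo) > 0 or diff(hi) < 0:
        raise ValueError("diff must change sign from <= 0 to >= 0 on [lo, hi]")
    steps = max(0, math.ceil(math.log2((hi - lo) / eps)))
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            hi = mid   # sign change lies at or below mid
        else:
            lo = mid   # sign change lies at or above mid
    return 0.5 * (lo + hi)
```

For example, `bisect_threshold(lambda x: x * x - 10.0, 0.0, 10.0, 1e-8)` needs 30 halvings to bracket the root of the stand-in function within the requested accuracy.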

5. Applications and Performance Comparisons

The age-dependent multi-threshold paradigm has been instantiated in the following domains:

AI-6G networks and NOMA HARQ: Achieves 15–30% lower average AoI compared to heuristic single-threshold policies and up to 50% lower AoI at low SNR than static allocation (Liu et al., 2023).
Multi-server status updating: Provides precise AoI/cost tradeoffs, outperforming basic server selection or fixed-threshold rules (Akar et al., 20 Dec 2025).
Cache updating: Yields reductions of up to 50% over classical $\sqrt{p}$-based policies when practical age-dependent update costs are present (Tang et al., 2019).
Multi-class queueing and value-of-information tradeoffs: Delivers class-aware thresholds, optimizing AoI–VoI convex blends (Arafa et al., 22 Aug 2024).
Hybrid and unreliable channel scheduling: Outperforms fallback or always-fast/slow baselines by up to 40% in relevant regimes (Pan et al., 2020).
Energy-constrained IoT: Two- or multi-threshold structures broaden the achievable freshness–energy tradeoff frontiers beyond single-threshold or non-adaptive policies (Gong et al., 2021).

Notably, decentralizability and the simplicity of threshold-based action determination make these schemes well-suited to large, heterogeneous, or real-time systems.

6. Interpretations, Extensions, and Limitations

The age-dependent multi-threshold framework is robust to diversity in channel conditions, service-time statistics, server heterogeneity, file popularities, and even dynamic information-value or distortion metrics. It extends naturally to broader cost functions (polynomial, logarithmic, AoII) and to multi-dimensional or stacked AoI states (Cosandal et al., 11 Jul 2024).

The principal limitation is computational: for very high-dimensional or complex systems (e.g., full CTMC sources with state- and estimate-aware policy spaces), the number of thresholds can become unwieldy, necessitating suboptimal but structurally-reduced policies (e.g., single- or class-threshold reductions) (Cosandal et al., 11 Jul 2024).

A plausible implication is that, for large-scale deployments where coordination cost, latency, or complexity is paramount, threshold-based architectures offer a scalable and interpretable solution class.

7. Summary Table: Multi-Threshold Structures Across Domains

Application Domain	AoI Variable(s)	Number/Type of Thresholds	Policy Action Regions	Reference
HARQ-aided NOMA (DRL)	$\Delta_i$, ARQ $k$	$\tau_{i,k}$ per (user, ARQ)	Retransmit $\Leftrightarrow \Delta_i>\tau_{i,k}$	(Liu et al., 2023)
Multi-server status updating (MR-AMC)	$\Delta$	$\tau_1,\ldots,\tau_J$ per server	Use server $k$ in $[\tau_{k-1},\tau_k)$	(Akar et al., 20 Dec 2025)
Cache updating (files/popularity)	$X_n$	$\tau_n$ per file	Update $n$ when $X_n\geq\tau_n$	(Tang et al., 2019)
Multi-class Age–Value systems	(age, class)	$\bar{y}_i$ per class	Admit class $i$ when age $<\bar{y}_i$	(Arafa et al., 22 Aug 2024)
Hybrid channel scheduling	$\Delta$, last-link	$\lambda_0, \lambda_1$	mmWave if $\Delta<\lambda_*$, else slow	(Pan et al., 2020)
Energy-age tradeoff IoT	$(a_T,a_R)$	$\tau_T,\tau_R$	Sleep, retransmit, or sense/send	(Gong et al., 2021)
Social Security claiming	age $t$	$x^*_t$ per age	Claim if $W_t/PIA < x_t^*$	(Diamond et al., 2021)

In all cases, the threshold values are determined by system parameters and cost weights, in closed form where the model permits and numerically otherwise, and policy regions are defined by partitioning the AoI (or extended state) space. The multi-threshold structure underlies analytically justified status updating, scheduling, and decision making in communication, monitoring, and resource management systems.