
Age of Job Completion Analysis

Updated 13 November 2025
  • Age of Job Completion (AoJC) is defined as the elapsed time from a job's arrival to its completion, capturing end-to-end latency in distributed and stochastic systems.
  • Recent research introduces both optimal and heuristic scheduling algorithms, such as OBTA, Water-Filling, and Replica-Deletion, to minimize AoJC while balancing throughput and stability.
  • Optimization strategies utilize methodologies like MILP, MDP, and Markovian models to jointly address resource allocation, data locality constraints, and sampling costs in job scheduling.

The age of job completion is a metric and analytical framework for quantifying, optimizing, and stabilizing the delay between the arrival and completion of jobs in networked and distributed systems. It is distinct from classical metrics (e.g., response time, makespan) by measuring the time elapsed from job arrival to completion and is applied as both an objective and constraint in online scheduling, data-locality-constrained task assignment, queueing systems with nontrivial machine dynamics, and throughput optimization. Recent studies in distributed execution with data locality (Zhao et al., 11 Jul 2024) and stochastic job assignment with Markovian server states (Mitrolaris et al., 6 Nov 2025) have developed rigorous definitions, problem formulations, and both optimal and heuristic policies for minimizing this age while accounting for constraints such as stability, sampling cost, and heterogeneous service capabilities.

1. Formal Definition and System Contexts

The age of job completion (AoJC) of job $c$, often denoted $\mathrm{age}_c = C_c - a_c$ or $\Phi_c$, is the interval between the arrival time $a_c$ and the estimated completion time $C_c$. In single-server queueing systems with multiple users, the instantaneous age for user $i$ at slot $t$ is $v_i^\phi(t) = t - \sup\{t' < t : b_i^\phi(t') = 1\}$, where $b_i^\phi(t') = 1$ marks a completion for user $i$ in slot $t'$ under policy $\phi$, so the supremum selects the latest one (Mitrolaris et al., 6 Nov 2025). For batch scheduling in data-locality-constrained environments, $C_c$ is the maximal finishing time across all servers assigned tasks from job $c$, accounting for the outstanding backlogs and server-specific service rates (Zhao et al., 11 Jul 2024). Extending to long-run averages defines $\Delta_i^\phi$ and $\Delta^\phi$ as the time-averaged age per user and system-wide, respectively.
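
As a minimal sketch of the definitions above, the following Python computes the batch age $C_c - a_c$, the per-slot age $v_i(t)$ from a 0/1 completion sequence, and its time average. The function names, and the convention that the age before the first completion is measured from a virtual completion at slot $-1$, are assumptions made for illustration and are not taken from the cited papers.

```python
from typing import List, Sequence


def batch_age(arrival: float, finish_times: Sequence[float]) -> float:
    """Age of a job: latest finishing time over its servers minus its arrival time."""
    return max(finish_times) - arrival


def instantaneous_ages(completions: Sequence[int]) -> List[int]:
    """v(t) = t - sup{t' < t : b(t') = 1} for t = 0, ..., T-1.

    Assumption: before the first real completion, the age is counted
    from a virtual completion at slot -1, so v(0) = 1.
    """
    ages = []
    last = -1  # hypothetical reference slot before any real completion
    for t, b in enumerate(completions):
        ages.append(t - last)
        if b == 1:
            last = t  # only affects slots after t, because t' < t is strict
    return ages


def time_average_age(completions: Sequence[int]) -> float:
    """Empirical counterpart of the long-run average age over a finite horizon."""
    ages = instantaneous_ages(completions)
    return sum(ages) / len(ages)
```

For the completion sequence `[0, 1, 0, 0, 1]`, the age grows by one per slot and drops back to 1 in the slot after each completion.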

AoJC is employed to:

  • Capture end-to-end latency for jobs or users.
  • Directly align scheduling with throughput maximization (completed jobs per time unit).
  • Guide trade-offs between quick completions and resource stability in dynamic systems.

2. Optimization Formulations

Minimization of AoJC is framed in two principal models:

a. Distributed Data-Locality-Aware Scheduling (Zhao et al., 11 Jul 2024):

  • Variables: For each job $c$, tasks are grouped by identical server-availability sets into $K_c$ groups. Each server $m \in \mathcal{M}$ has a profiled capacity $\mu_m^c$ and an instantaneous backlog $o_m^c$.
  • Objective: On every job arrival, solve

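$$\min \Phi_c$$

subject to per-server and per-task-group constraints:

  1. $\sum_k n_m^k \leq \max\{\Phi_c - b_m^c, 0\}$ for every server $m$,
  2. $\sum_{m\in\mathcal{S}_c^k} n_m^k \mu_m^c \geq |\mathcal{T}_c^k|$ for every task group $k$,

where $n_m^k$ is the number of time slots that server $m$ assigns to group $k$, $\mathcal{S}_c^k$ is the set of servers available to group $k$, $\mathcal{T}_c^k$ is the set of tasks in group $k$, and $b_m^c$ is the busy time of server $m$ implied by its existing backlog.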
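
The two constraints can be read as a feasibility test for a candidate age $\Phi_c$ and slot assignment $n_m^k$. The sketch below is only that test, not a solver; the dictionary layout, the string identifiers, and the extra check that slots are placed only on servers in $\mathcal{S}_c^k$ are illustrative assumptions.

```python
from typing import Dict, Set, Tuple


def is_feasible(
    phi: int,
    busy: Dict[str, int],                # b_m^c per server
    rate: Dict[str, int],                # mu_m^c per server (tasks per slot)
    group_servers: Dict[str, Set[str]],  # S_c^k per group
    group_size: Dict[str, int],          # |T_c^k| per group
    slots: Dict[Tuple[str, str], int],   # n_m^k keyed by (server, group)
) -> bool:
    # Assumed locality rule: slots may only be placed on servers available to the group.
    for (m, k), n in slots.items():
        if n < 0 or (n > 0 and m not in group_servers[k]):
            return False
    # Constraint 1: each server has at most max(phi - b_m, 0) free slots before phi.
    for m in busy:
        used = sum(n for (srv, _), n in slots.items() if srv == m)
        if used > max(phi - busy[m], 0):
            return False
    # Constraint 2: the slots given to each group must cover all of its tasks.
    for k, servers in group_servers.items():
        done = sum(slots.get((m, k), 0) * rate[m] for m in servers)
        if done < group_size[k]:
            return False
    return True
```

Because constraint 1 tightens as $\Phi_c$ decreases, an assignment that is feasible for one value of $\Phi_c$ can become infeasible for a smaller one, which is what makes the minimization non-trivial.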

b. Job Assignment with Markovian Machine States (Mitrolaris et al., 6 Nov 2025):

  • Variables: Multiple user queues $Q_i(t)$, Bernoulli job arrivals $a_i(t)$, a binary machine state sampled at cost $L$, and stochastic external job assignment.
  • Objective:

$$\min_\phi \left( \Delta^\phi + S^\phi \right)$$

subject to queue stability and action constraints, where $S^\phi$ is the long-term sampling cost.

Constrained optimization is typically solved via MDP/stochastic control approaches, or by parameterizing randomized or round-robin scheduling policies and tuning adaptive sampling frequencies.

3. Algorithms and Policies for AoJC Minimization

A range of algorithms have been proposed and analyzed:

A. OBTA (Optimal Balanced Task Assignment) (Zhao et al., 11 Jul 2024):

  • Decomposes the nonlinear integer program into MILP subproblems, using bounds $\Phi_c^-$ and $\Phi_c^+$ to restrict the search space.
  • Iteratively checks intervals sorted by distinct server busy times, exploiting linearity within each subrange for tractable MILP solution.
  • Optimality is guaranteed with exact service profiles; complexity is $O(K_c + M)$ MILP solves per job.
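
The interval decomposition can be illustrated with a short sketch. On any range of $\Phi_c$ whose interior contains no server busy time, each term $\max\{\Phi_c - b_m^c, 0\}$ is either identically zero or equal to $\Phi_c - b_m^c$, so the per-server constraint is linear there. The code only enumerates such ranges; the MILP solve on each range is omitted, and the function name and integer-slot convention are assumptions.

```python
from typing import List, Sequence, Tuple


def linear_subranges(
    busy_times: Sequence[int], phi_lo: int, phi_hi: int
) -> List[Tuple[int, int]]:
    """Split [phi_lo, phi_hi] at the distinct busy times strictly inside it.

    On each returned (lo, hi) range, a server with busy time b <= lo
    contributes phi - b free slots and a server with b >= hi contributes 0.
    """
    cuts = sorted({b for b in busy_times if phi_lo < b < phi_hi})
    edges = [phi_lo, *cuts, phi_hi]
    return list(zip(edges[:-1], edges[1:]))
```

With busy times `[2, 5, 5, 9]` and bounds 3 and 8, the only interior breakpoint is 5, giving the two ranges (3, 5) and (5, 8).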

B. Water-Filling (WF) Heuristic (Zhao et al., 11 Jul 2024):

  • Assigns tasks group-by-group, raising server busy times incrementally (“pour water” analogy), using binary search to find minimum slot increments for each group.
  • Computational cost is $O(K_c M \log |\mathcal{T}_c|)$.
  • The approximation factor is proven to be $K_c$, and worst-case instances attain this bound.
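
A simplified sketch of the water-filling idea follows. It is not the authors' exact procedure: it assumes integer service rates of at least one task per slot, busy times measured in slots from the job's arrival (so the final water level is the estimated age), and groups processed in the order given. All names are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple


def capacity(level: int, busy: Dict[str, int], rate: Dict[str, int],
             servers: Set[str]) -> int:
    """Tasks the given servers can finish if each works from its busy time up to `level`."""
    return sum(max(level - busy[m], 0) * rate[m] for m in servers)


def water_fill(
    busy: Dict[str, int],
    rate: Dict[str, int],
    groups: Sequence[Tuple[Set[str], int]],  # (available servers, task count) per group
) -> Tuple[int, Dict[str, int]]:
    level_after = dict(busy)  # copy so the caller's busy times are untouched
    age = 0
    for servers, tasks in groups:
        if tasks == 0:
            continue
        lo = min(level_after[m] for m in servers)  # capacity at this level is 0
        hi = lo + tasks  # the least busy server alone suffices (rate >= 1 assumed)
        # Binary search for the smallest level whose capacity covers the group.
        while lo < hi:
            mid = (lo + hi) // 2
            if capacity(mid, level_after, rate, servers) >= tasks:
                hi = mid
            else:
                lo = mid + 1
        # "Pour water": raise every available server below the level up to it.
        for m in servers:
            level_after[m] = max(level_after[m], lo)
        age = max(age, lo)
    return age, level_after
```

For example, with busy times A=2, B=0, C=1, rates A=2, B=1, C=1, and groups ({A, B}, 5 tasks) followed by ({B, C}, 3 tasks), the first group settles at level 3 and the second at level 4. Because each group is filled greedily, the result depends on the group order, which is consistent with the heuristic having an approximation factor rather than an optimality guarantee.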

C. Replica-Deletion (RD) Heuristic (Zhao et al., 11 Jul 2024):

  • Initially assigns all possible task replicas, then iteratively deletes excess assignments from the most loaded servers, prioritizing tasks with many alternatives.
  • Empirical performance yields ages close to OBTA; computational overhead is $O(M^2 n \log n)$ per job.
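
The replica-deletion idea can be sketched as below, under simplifying assumptions that are not from the paper: every task has at least one available server, server load is the plain count of replicas it holds (service rates are ignored), and ties are broken by name so the result is deterministic. The loop is written for clarity and does not attempt to match the stated complexity.

```python
from typing import Dict, List, Set


def replica_deletion(available: Dict[str, Set[str]]) -> Dict[str, str]:
    """`available` maps task -> servers holding its data; returns task -> chosen server."""
    replicas = {t: set(s) for t, s in available.items()}  # start with all replicas
    load: Dict[str, int] = {}
    for servers in replicas.values():
        for m in servers:
            load[m] = load.get(m, 0) + 1
    while True:
        # Per server, list the tasks that still have more than one replica.
        removable: Dict[str, List[str]] = {}
        for t, servers in replicas.items():
            if len(servers) > 1:
                for m in servers:
                    removable.setdefault(m, []).append(t)
        if not removable:
            break  # every task is now on exactly one server
        # Most loaded server that still holds a deletable replica.
        m = max(removable, key=lambda s: (load[s], s))
        # On that server, delete the replica of the task with the most alternatives.
        t = max(removable[m], key=lambda x: (len(replicas[x]), x))
        replicas[t].remove(m)
        load[m] -= 1
    return {t: next(iter(s)) for t, s in replicas.items()}
```

Each iteration removes exactly one replica and never the last replica of a task, so the loop terminates with every task assigned to a single server that holds its data.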

D. Job Reordering via Shortest-Estimated-Time-First (OCWF-ACC) (Zhao et al., 11 Jul 2024):

  • Maintains the outstanding job set $O$ and builds a new execution order $Q$ by repeatedly selecting the job with the smallest WF-estimated remaining age.
  • Implements early-exit pruning by lower-bounding age estimates.
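
The reordering loop with early-exit pruning can be sketched generically. Here `estimate` stands in for the WF-based age estimate given the jobs already placed, and `lower_bound` for a cheap bound that never exceeds that estimate; both are hypothetical callables supplied by the caller, and estimates are assumed finite.

```python
from typing import Callable, Hashable, Iterable, List


def reorder_jobs(
    outstanding: Iterable[Hashable],
    estimate: Callable[[Hashable, List[Hashable]], float],
    lower_bound: Callable[[Hashable], float],
) -> List[Hashable]:
    remaining = list(outstanding)
    order: List[Hashable] = []
    while remaining:
        best_job, best_age = None, float("inf")
        # Visit jobs by ascending lower bound so pruning can stop the scan early.
        for job in sorted(remaining, key=lower_bound):
            if lower_bound(job) >= best_age:
                break  # no later job can beat the current best estimate
            age = estimate(job, order)  # the expensive call being avoided
            if age < best_age:
                best_job, best_age = job, age
        order.append(best_job)
        remaining.remove(best_job)
    return order
```

The pruning is only safe when `lower_bound(job) <= estimate(job, order)` always holds; otherwise a job with a smaller true estimate could be skipped.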

E. Centralized Policies in Markov Machine Setting (Mitrolaris et al., 6 Nov 2025):

  • Adaptive randomized scheduling and sampling: For every active subset $\mathcal{S}$, precompute sampling probabilities $\mu^*(\mathcal{S})$ and scheduling distributions $\pi^*(\mathcal{S})$ via convex nonlinear programs that plug in the closed-form expressions.
  • Max-age scheduling (round-robin among active users with highest age) combined with stationary optimized sampling $\bar{\mu}(\mathcal{S})$.
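
The max-age selection rule is simple enough to sketch directly. Treating "active" as "has a non-empty queue" and breaking ties toward the lowest index are assumptions for illustration; the sampling decision is not modeled.

```python
from typing import Optional, Sequence


def max_age_pick(ages: Sequence[int], queue_lengths: Sequence[int]) -> Optional[int]:
    """Index of the active user (non-empty queue) with the largest age, or None."""
    active = [i for i, q in enumerate(queue_lengths) if q > 0]
    if not active:
        return None
    # Largest age wins; among equal ages, the lowest index wins.
    return max(active, key=lambda i: (ages[i], -i))
```

Since a user's age resets after its job completes, the most recently served user tends to move to the back of the order, which is why this rule behaves like round-robin among active users.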

4. Stability Conditions and Analytical Age Expressions

Stability of AoJC-minimizing policies requires strict control of the arrival-service rate gap:

| Sufficient condition | Policy | Formula (for all nonempty $\mathcal{S}$) |
|---|---|---|
| Adaptive Randomized (Prop. 1) | Randomized scheduling | $\sum_j p_j - \mu(\mathcal{S})[1-\chi(q,s)] \sum_{i \in \mathcal{S}} \pi_i(\mathcal{S}) q_i \le -\epsilon$ |
| Max-age (Prop. 2) | Round-robin scheduling | $\sum_j p_j - \mu(\mathcal{S})[1-\chi(q,s)] q_{\min(\mathcal{S})} \le -\epsilon$ |

Here $\chi(q,s)$ encodes the Markov machine's idle/busy transition structure, and $\epsilon > 0$ is the slack that makes the inequality strict, as required for positive recurrence.
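
For a single active subset $\mathcal{S}$, the two sufficient conditions can be checked numerically as sketched below. Several readings are assumptions: $\chi(q,s)$ is passed in as a precomputed number because its closed form is not given here, $q_{\min(\mathcal{S})}$ is taken to be the smallest $q_i$ over $i \in \mathcal{S}$, and the function and argument names are hypothetical. The propositions require the inequality for every nonempty $\mathcal{S}$, so a full check would loop over all subsets.

```python
from typing import Mapping, Sequence


def randomized_condition_holds(
    p: Sequence[float],        # p_j for all users j
    mu: float,                 # mu(S), sampling probability for this subset
    chi: float,                # chi(q, s), supplied by the caller
    pi: Mapping[int, float],   # pi_i(S) for i in S
    q: Mapping[int, float],    # q_i for i in S
    eps: float,
) -> bool:
    drift = sum(p) - mu * (1.0 - chi) * sum(pi[i] * q[i] for i in pi)
    return drift <= -eps


def max_age_condition_holds(
    p: Sequence[float],
    mu: float,
    chi: float,
    q: Mapping[int, float],    # q_i for i in S
    eps: float,
) -> bool:
    # Assumption: q_min(S) is the minimum of q_i over the active subset.
    drift = sum(p) - mu * (1.0 - chi) * min(q.values())
    return drift <= -eps
```

Both checks compare the total arrival term $\sum_j p_j$ against an effective service term, so raising the arrival probabilities or lowering the sampling probability eventually violates the condition.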

Analytical expressions for long-run average age are derived for each policy, enabling local optimization. For example, for adaptive randomized scheduling:

$$\Delta_k(\mathcal{S}) = \frac{1}{\left( \frac{s}{q} + 2\left(\frac{1}{\mu} - 1\right) + \bar{\eta} \right)} \Bigl( \frac{\psi_k^2}{\pi_k} + \left(\frac{1}{q_k} + \frac{1-s}{q} - 2\right)\psi_k + \frac{1}{q}\left[(1-s)(1-\pi_k - \eta_k) - \frac{1}{\mu}\right] - \frac{\pi_k(1-q_k)}{q_k} + \sum_{i \in \mathcal{S}} \frac{\pi_i(1-q_i)}{q_i^2} \Bigr) + 1$$

where the auxiliary quantities are as defined in (Mitrolaris et al., 6 Nov 2025).

Sampling cost under stationary randomized sampling, as per Theorem 2: $S^\phi(\mathcal{S}) \le S^\phi_{\mathrm{ub}}(\mathcal{S}) = \frac{(L+1)\mu}{p^* \left( 1 / (\mu p^* + \bar{\eta}) \right)}$ with $p^* = q / [1 - (1-2q)(1-\mu)]$.
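
The bound can be evaluated numerically with the sketch below, which transcribes the expression exactly as printed above and should be checked against the paper before being relied on. The function names are hypothetical, and $\bar{\eta}$ is taken as an input because its definition is not reproduced here.

```python
def p_star(q: float, mu: float) -> float:
    """p* = q / [1 - (1 - 2q)(1 - mu)], as printed above."""
    return q / (1.0 - (1.0 - 2.0 * q) * (1.0 - mu))


def sampling_cost_upper_bound(L: float, mu: float, q: float, eta_bar: float) -> float:
    """(L + 1) * mu / (p* * (1 / (mu * p* + eta_bar))), as printed above."""
    p = p_star(q, mu)
    return (L + 1.0) * mu / (p * (1.0 / (mu * p + eta_bar)))
```

Two quick sanity checks follow from the formula for $p^*$ alone: with $\mu = 1$ the correction factor vanishes and $p^* = q$, and with $q = 1/2$ the factor $(1-2q)$ is zero, so $p^* = 1/2$ for any $\mu$.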

A plausible implication is that system designers must calibrate both the scheduling and sampling frequency jointly to ensure stability and minimize AoJC.

5. Empirical Evaluation and Practical Implications

Trace-driven and simulation studies support the theoretical results:

  • For distributed scheduling with data locality (Zhao et al., 11 Jul 2024), OBTA achieves optimal ages, WF obtains ages within a few percent of OBTA at $\sim 100\times$ lower cost, and RD closes the gap by $1$-$2\%$ with moderate overhead.
  • Job-reordering (OCWF-ACC) further reduces mean age, maintaining performance even under highly skewed workloads.
  • In Markov machine systems (Mitrolaris et al., 6 Nov 2025), round-robin (max-age) scheduling with optimized sampling outperforms adaptive randomized policies, particularly under high traffic. Both age and cost decrease with faster transition rate $q$.
  • Sufficient stability conditions occasionally underestimate the practical regime; queues may remain stable outside the proven sufficient region, suggesting the conditions are conservative.
  • System utilization increases absolute ages but the performance hierarchy of policies is preserved.

These results underscore that AoJC-centric scheduling provides a robust, throughput-aligned, and verifiable method for job assignment in environments ranging from distributed compute clusters to dynamic central-server queueing systems.

6. Significance and Interpretative Remarks

Adoption of AoJC as a primary metric implies an operational focus on completed job rates, online tractability, and fairness among users/jobs. Its applicability to both deterministic MILP-based scheduling (Zhao et al., 11 Jul 2024) and stochastic controlled queueing (Mitrolaris et al., 6 Nov 2025) indicates methodological generality. The analytic forms for age and cost enable explicit policy tuning, unlike black-box simulation methods. A plausible implication is that future extensions may integrate AoJC within broader resource optimization (energy, reliability, SLA), or generalize job priorities and dependencies.

Furthermore, these frameworks reveal that:

  • Simple heuristics (water-filling, replica deletion, round-robin max-age) can yield near-optimal AoJC at orders-of-magnitude lower computation than offline optimal assignment.
  • Carefully designed sampling policies are essential in systems with non-work-conserving machine states; undersampling yields idle service capacity, oversampling incurs unnecessary cost.
  • Data locality and job structure must be explicitly incorporated into task assignment to avoid worst-case approximation factors.

7. Relation to Contemporary Research and Outlook

Current research (Zhao et al., 11 Jul 2024, Mitrolaris et al., 6 Nov 2025) studies AoJC across diverse system architectures, including distributed clusters with partial data replication, FIFO queues, and Markov-modulated service capabilities, and establishes theoretical complexity, optimality, and practical efficiency. Empirical validation against production traces (e.g., Alibaba Batch Trace) supports the practical relevance of the findings.

The AoJC perspective complements and enhances existing metrics such as average response time, makespan, and freshness age, suggesting further directions in multi-resource and multi-criteria scheduling. Adoption in operational systems will require integration with workload forecasting, adaptive policy deployment, and resilience against adversarial arrival and failure scenarios.
