
Timing-Anomaly-Free Scheduling

Updated 4 February 2026
  • The paper introduces the Deterministic Dynamic Execution (DDE) algorithm, which eliminates timing anomalies by enforcing fixed resource allocation and strict execution order.
  • It uses a single offline simulation to safely estimate tight worst-case response times while reducing pessimism and computational complexity.
  • Experimental results demonstrate up to a 72% reduction in worst-case response time and a 7–9% decrease in jitter, enhancing predictability in dynamic scheduling.

A timing-anomaly-free dynamic scheduling algorithm is a scheduling technique developed for heterogeneous systems that eliminates timing anomalies—situations where local reductions in task execution time paradoxically result in increased global execution time—by enforcing deterministic constraints on resource allocation and execution order. The Deterministic Dynamic Execution (DDE) algorithm is the first such approach for heterogeneous systems, supporting both safe and tight worst-case response time (WCRT) analysis through a single offline simulation, while reducing pessimism and complexity relative to traditional scheduling strategies (Zhu et al., 28 Jan 2026).

1. System and Task Model

The DDE algorithm is defined for heterogeneous platforms composed of a set of processing-unit instances:

PU = \{ CPU_0^0, CPU_0^1, CPU_1^0, CPU_1^1, GPU_0^0, GPU_1^0, \ldots \}

Each instance is specified by its architecture A (e.g., CPU, GPU), micro-type T (e.g., high-performance CPU_0 versus low-power CPU_1), and instance index I, yielding the notation A_T^I.

Tasksets are represented as multi-typed DAGs. A DAG G = (V, E, M, \Gamma) consists of:

  • V: finite set of tasks (nodes)
  • E \subseteq V \times V: precedence constraints (edges)
  • M: V \to 2^{PUT} \setminus \{\emptyset\}: maps each task to its non-empty set of eligible processing-unit types
  • \Gamma: V \times PUT \to \mathbb{N}_{>0} \times \mathbb{N}_{>0}: gives the execution-time interval [\mathrm{BCET}(v, put), \mathrm{WCET}(v, put)] of best-case and worst-case execution times for task v on processing-unit type put

The response time of a task v, RT(v), is the time from the DAG source v_{src} until v's finish time f_v, with the worst-case response time under a scheduler \pi defined as:

\mathrm{WCRT}_\pi(G) = \max_{S \in \mathbb{S}_\pi} \mathrm{RT}_S(v_{sink})

where \mathbb{S}_\pi is the set of legal schedules under \pi.
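To make the task model concrete, the structures above can be encoded as a small Python sketch. The class and field names (`TaskDAG`, `predecessors`) are illustrative and not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class TaskDAG:
    """Multi-typed DAG G = (V, E, M, Gamma) -- illustrative encoding."""
    V: set       # task identifiers (nodes)
    E: set       # precedence edges (u, v): u must complete before v
    M: dict      # task -> non-empty set of eligible processing-unit types
    Gamma: dict  # (task, put) -> (BCET, WCET) execution-time interval

    def predecessors(self, v):
        return {u for (u, w) in self.E if w == v}

# A two-task example: src may run on a CPU or GPU type, sink only on CPU_0.
G = TaskDAG(
    V={"src", "sink"},
    E={("src", "sink")},
    M={"src": {"CPU_0", "GPU_0"}, "sink": {"CPU_0"}},
    Gamma={("src", "CPU_0"): (2, 6), ("src", "GPU_0"): (1, 3),
           ("sink", "CPU_0"): (4, 5)},
)
# Sanity check: every interval satisfies BCET <= WCET.
assert all(b <= w for (b, w) in G.Gamma.values())
```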

2. Timing Anomalies in Dynamic Scheduling

Timing anomalies occur in dynamic scheduling when local improvements—such as reductions in execution time for individual tasks—lead to increased global response times, either by altering resource contention or task dispatch order.

Formal Model: System execution is represented as a finite-state machine mapping each task t \in V to a progress tuple p = (\text{stage}, \text{remaining\_time}) \in \{\text{Block}, \text{Ready}, \text{Exec}, \text{Finish}\} \times \mathbb{N}_0. A strict timing anomaly is present if, for two reachable states c_1, c_2:

  • There exists a non-empty subset \mathbb{S} \subseteq V such that for all t \in \mathbb{S}, c_1(t) is strictly less progressed than c_2(t), while c_1(t) = c_2(t) for all other tasks
  • The remaining execution times from c_1 are at least as large as those from c_2 for tasks not in execution
  • Yet, after some number of steps n, the state derived from c_2 is no longer at least as progressed as the state derived from c_1
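The "more progressed" relation on progress tuples can be sketched in Python. The stage ordering Block < Ready < Exec < Finish, and the tie-break on remaining time within a stage, are one natural reading of the definition rather than the paper's exact formalization:

```python
# Rank each stage; a later stage is more progressed. (Assumed ordering.)
STAGE_RANK = {"Block": 0, "Ready": 1, "Exec": 2, "Finish": 3}

def at_least_as_progressed(p1, p2):
    """True if progress tuple p1 = (stage, remaining_time) is at least as
    advanced as p2: later stage wins; within a stage, less remaining work
    is more progressed."""
    (s1, r1), (s2, r2) = p1, p2
    if STAGE_RANK[s1] != STAGE_RANK[s2]:
        return STAGE_RANK[s1] > STAGE_RANK[s2]
    return r1 <= r2

assert at_least_as_progressed(("Exec", 3), ("Ready", 0))
assert at_least_as_progressed(("Exec", 2), ("Exec", 5))
assert not at_least_as_progressed(("Block", 0), ("Ready", 0))
```

A strict timing anomaly then corresponds to this relation failing to be preserved by the transition function after some number of steps.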

Illustrative Example: Under the HBFS policy, a local reduction in a task’s execution time may shift resource contention, indirectly delaying the critical chain and resulting in a higher overall completion time than the case with no reduction [(Zhu et al., 28 Jan 2026), see Fig. 3].

3. Deterministic Dynamic Execution Algorithm

DDE enforces two primary constraints at runtime to prevent timing anomalies:

Constraint 1: Resource-Allocation Determinism

Each task t is fixed in advance to execute on a predetermined processor type \mathrm{resAllocPat}_D(t).

Constraint 2: Execution-Order Determinism

Tasks are ordered offline into a total order O_D = (t_1 \preceq t_2 \preceq \dots \preceq t_N) that respects DAG precedence. No task may begin execution before all of its predecessors in O_D have reached the "started" state.

DDE Execution-Progress Model

  • The system state is a mapping C: V \to P, with initial state c_0(t) = (\text{Block}, 0) for every task except the source node, which starts at (\text{Ready}, 0).
  • The transition function upd: C \to C updates each task's progress by checking dependency completion and resource availability, then advancing time.

Algorithm Sketch

At each time tick:

  1. Add newly ready tasks to ReadyQ.
  2. Select the ReadyQ task t with the minimum index in O_D.
  3. If all of t's O_D predecessors have started and a resource of type \mathrm{resAllocPat}_D(t) is available, dispatch t and mark it as started.
  4. Advance one tick, decrement execution counters, and update stages.

Computational complexity is O(|V| + |E| + T_{max} \cdot \mu), with T_{max} the simulated WCRT and \mu the number of compute units.
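The tick loop above can be sketched in Python. This is a simplified reading, not the paper's implementation: integer tick times, and dispatch strictly in O_D order (which Constraint 2 implies, since no task may start before all its O_D predecessors have started); all names are illustrative. Under the all-WCET assignment, the sink's finish time is the simulated WCRT bound:

```python
def dde_simulate(order, preds, res_type, exec_time, capacity):
    """Offline DDE simulation sketch.

    order     -- total order O_D (respects DAG precedence)
    preds     -- task -> set of DAG predecessors
    res_type  -- task -> fixed resource type (resAllocPat_D)
    exec_time -- task -> execution time on the assigned type (e.g., its WCET)
    capacity  -- resource type -> number of instances
    Returns task -> (start, finish); the sink's finish is the WCRT bound.
    """
    free = dict(capacity)
    remaining, start, finish = {}, {}, {}
    next_idx, t = 0, 0
    while len(finish) < len(order):
        # Dispatch the head of O_D while its DAG predecessors have finished
        # and an instance of its assigned type is free; later tasks wait.
        while next_idx < len(order):
            v = order[next_idx]
            if all(p in finish for p in preds[v]) and free[res_type[v]] > 0:
                free[res_type[v]] -= 1
                start[v] = t
                remaining[v] = exec_time[v]
                next_idx += 1
            else:
                break
        # Advance one tick, decrement counters, retire finished tasks.
        t += 1
        for v in list(remaining):
            remaining[v] -= 1
            if remaining[v] == 0:
                del remaining[v]
                finish[v] = t
                free[res_type[v]] += 1
    return {v: (start[v], finish[v]) for v in order}

# Fork-join example on one CPU and one GPU instance.
sched = dde_simulate(
    order=["src", "a", "b", "sink"],
    preds={"src": set(), "a": {"src"}, "b": {"src"}, "sink": {"a", "b"}},
    res_type={"src": "CPU", "a": "CPU", "b": "GPU", "sink": "CPU"},
    exec_time={"src": 1, "a": 3, "b": 2, "sink": 2},
    capacity={"CPU": 1, "GPU": 1},
)
```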

4. WCRT Estimation via Offline Simulation

Under DDE, WCRT is determined by a single offline simulation where each task is allocated its WCET on the assigned resource. Due to monotonicity of execution progress under the DDE constraints, the simulated response time bounds all real executions:

\mathrm{WCRT}_{\mathrm{DDE}}(G) = \mathrm{RT}_{\mathrm{sim}^{\mathrm{all-WCET}}}(v_{sink})

This simulation avoids the need for exhaustive state-space exploration ordinarily required when timing anomalies are possible.

5. Generation of Execution Constraints

Constraints for DDE are generated by two methods.

(a) Trace-Based Extraction: Run the baseline scheduler (e.g., HBFS) offline on the DAG with all tasks at their WCETs. Extract, for each task t, its start time, finish time, and resource type. Sort tasks by start time (ties broken by task ID) to derive O_D, and fix \mathrm{resAllocPat}_D to the corresponding instance type.
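The extraction step reduces to a sort over the recorded trace. A minimal sketch, where the trace record layout (task ID, start, finish, resource type) is an assumed format, not the paper's:

```python
def constraints_from_trace(trace):
    """Derive O_D and resAllocPat_D from a baseline all-WCET schedule trace.

    trace -- list of (task_id, start, finish, resource_type) records,
             e.g. recorded from an offline HBFS run (layout is illustrative).
    """
    # Sort by start time, breaking ties by task ID, to obtain the total order.
    ordered = sorted(trace, key=lambda r: (r[1], r[0]))
    O_D = [r[0] for r in ordered]
    res_alloc_pat = {r[0]: r[3] for r in trace}
    return O_D, res_alloc_pat

O_D, pat = constraints_from_trace([
    ("b", 0, 2, "GPU_0"), ("a", 0, 3, "CPU_0"), ("c", 3, 5, "CPU_0"),
])
assert O_D == ["a", "b", "c"]   # tie at t=0 broken by task ID
assert pat["b"] == "GPU_0"
```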

(b) Heuristic HACPA Method: Assign ranks bottom-up as

\text{rank}(t) = \text{avg\_WCET}(t) + \max_{u \in succ(t)} \text{rank}(u)

Tasks are processed in descending order of rank; each receives the processor type and resource instance minimizing its finish time under WCET and dependency constraints. The resulting trace yields O_D and \mathrm{resAllocPat}_D.
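The bottom-up rank recursion can be sketched directly, taking rank(sink) = avg_WCET(sink) for the empty-successor base case (a standard convention for such upward ranks, assumed here):

```python
from functools import lru_cache

def compute_ranks(succ, avg_wcet):
    """Bottom-up ranks: rank(t) = avg_WCET(t) + max over successors' ranks.

    succ     -- task -> set of successor tasks (empty set for the sink)
    avg_wcet -- task -> WCET averaged over the task's eligible processor types
    """
    @lru_cache(maxsize=None)
    def rank(t):
        if not succ[t]:               # sink: no successors, base case
            return avg_wcet[t]
        return avg_wcet[t] + max(rank(u) for u in succ[t])
    return {t: rank(t) for t in succ}

ranks = compute_ranks(
    succ={"src": {"a", "b"}, "a": {"sink"}, "b": {"sink"}, "sink": set()},
    avg_wcet={"src": 1, "a": 3, "b": 2, "sink": 2},
)
# Tasks are then processed in descending rank order.
order = sorted(ranks, key=ranks.get, reverse=True)
```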

6. Formal Guarantees and Anomaly-Freeness

Monotonicity Lemmas:

  1. Every state transition advances at least one task's progress.
  2. Dependency completions are preserved under the progress ordering.
  3. If one state is at least as progressed as another, the states reached after one transition preserve that ordering.

Strict-TA-Free Theorem (Theorem 3): Under the two DDE constraints, for all relevant state pairs and all n0n\geq 0, the simulation from a state strictly ahead in progress remains (at least) as advanced at every later step. This ensures absence of both strict and generic timing anomalies, so the simulated WCRT is both safe (upper-bounding real execution) and tight.

7. Experimental Evaluation and Results

Tasksets and Configuration

  • Random task DAGs of size n \in \{10, 20, 30, 40\} with edge density p \in [0.1, 0.9]
  • Each DAG given 100 configurations over 4 processor types (CPU_0, CPU_1, GPU_0, GPU_1)
  • Node execution intervals: 80% of tasks assigned heavy intervals (WCET \approx 10–30 \times BCET), 20% light (WCET \approx 1–1.2 \times BCET)
  • Resource availability: 1, 2, or 4 instances per type

Metrics

| Metric | Definition |
| --- | --- |
| \mathrm{MSWCRT}_X | Maximum observed response time over 10,000 random runs under scheduler X |
| \mathrm{WCRT}_X | Offline simulated all-WCET bound under scheduler X |
| \mathrm{AVRT}_X | Average response time under scheduler X |
| \mathrm{jitter}_X | (\mathrm{MSWCRT}_X - \mathrm{MSBCRT}_X) / \mathrm{MSWCRT}_X |
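Computing these metrics from a batch of observed response times is straightforward. In this sketch, MSBCRT is taken to be the minimum observed response time (an assumption, by analogy with MSWCRT; the paper defines it only through the jitter formula):

```python
def metrics(observed_rts):
    """MSWCRT, MSBCRT, AVRT, and jitter from observed random-run response times."""
    ms_wcrt = max(observed_rts)
    ms_bcrt = min(observed_rts)          # assumed: minimum observed RT
    avrt = sum(observed_rts) / len(observed_rts)
    jitter = (ms_wcrt - ms_bcrt) / ms_wcrt
    return ms_wcrt, ms_bcrt, avrt, jitter

ms_wcrt, ms_bcrt, avrt, jit = metrics([10, 12, 15, 11])
assert (ms_wcrt, ms_bcrt) == (15, 10)
```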

Results

  • Probability of a timing anomaly: up to 45% under HFCFS for sparse graphs; \leq 3% under HBFS.
  • DDE eliminates all observed timing anomalies: \mathrm{MSWCRT}_{DDE} = \mathrm{WCRT}_{DDE}.
  • DDE reduces WCRT by 5–25% on average compared to the baseline MSWCRT, with a best-case reduction of 72%. The HACPA variant provides an additional 2–5% reduction.
  • Jitter is reduced by 7–9% on average.
  • Tradeoff in AVRT: a 4.2% overall increase (driven by including non-TA cases), while AVRT decreases by 1.2% for TA-exhibiting systems (best-case reduction of 40%).

8. Significance and Implications

Deterministic Dynamic Execution establishes the first provably strict timing-anomaly-free algorithm for dynamic scheduling on heterogeneous systems. By imposing lightweight, offline-derived deterministic constraints, it converts a challenging real-time analysis problem—complicated by non-monotonic progress of dynamic schedules—into a tractable single simulation. The approach demonstrably eliminates timing anomalies, allows safe and tight WCRT estimation, reduces worst-case and jitter metrics, and does so with minimal negative impact on average-case performance. A plausible implication is that DDE provides a scalable foundation for predictable execution in safety-critical and hard real-time applications deployed on modern heterogeneous compute platforms (Zhu et al., 28 Jan 2026).
