Availability Region (AR) in Embedded Systems
- Availability Region (AR) is a dynamic subset of real-time tasks maintained as 'safe' by excluding those that violate integrity policies.
- It leverages hardware finite state machines to swiftly detect violations and remove misbehaving tasks without adding run-time overhead.
- AR ensures robust schedulability by allowing non-violating tasks to continue execution, preserving system availability even during faults.
An Availability Region (AR) is a dynamically maintained set of real-time tasks deemed "safe"—i.e., free of detected run-time integrity violations—at any given instant during the execution of a real-time embedded system. Introduced and formalized in the PAIR (Preserving Availability And Integrity at Run-time) framework, AR resolves the conflict between system integrity and operational availability by enabling swift exclusion of compromised tasks while permitting all non-violating tasks to continue under standard real-time scheduling. The AR construct is directly encoded in hardware, imposes no run-time overhead on user tasks, and is fully integrated with commercial RTOSes and microcontroller targets (Caulfield et al., 18 Nov 2025).
1. Formal Definition and Representation
The Availability Region is defined over a system comprising N independent real-time tasks running under an RTOS. At any time, the AR is the subset consisting of tasks that have yet to violate their assigned run-time integrity policies. Equivalently, AR is implemented as an N-bit vector ARen, with ARen[i] = 1 indicating that task i is in AR and ARen[i] = 0 indicating exclusion.
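As a minimal sketch of this representation, the bit vector can be modeled in software with a Python integer. This is an illustrative model only, not PAIR's hardware register layout; the helper names (`full_ar`, `in_ar`, `members`), the task count, and the choice of bit i for task i are assumptions made for the example.

```python
from typing import List

N = 4  # hypothetical number of tasks


def full_ar(n: int) -> int:
    """Return an n-bit vector with every bit set (all tasks in AR)."""
    return (1 << n) - 1


def in_ar(aren: int, i: int) -> bool:
    """True if bit i of ARen is 1, i.e. task i is currently in AR."""
    return (aren >> i) & 1 == 1


def members(aren: int, n: int) -> List[int]:
    """List the indices of the tasks whose ARen bit is set."""
    return [i for i in range(n) if in_ar(aren, i)]


aren = full_ar(N)
print(members(aren, N))  # every task index from 0 to N-1
```

An integer keeps membership tests to a shift and a mask, which mirrors the single-bit-per-task encoding described above.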
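Critical formal properties of AR are stated using linear temporal logic (LTL):
- Post-Violation Availability (P1): after a violation, the tasks that remain in AR continue to be scheduled and executed.
- Post-Violation Integrity (P2): a task that has violated its policy is removed from AR and is not executed again until a trusted software update completes.
The properties are stated over the region of program memory containing all tasks, two signals indicating a breach detected by the integrity monitor or by an illegal access to PAIR-critical data, respectively, and a non-maskable interrupt (NMI). They use the LTL next-state operator (X) and the "globally" operator (G), which denotes "always".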
2. Mathematical Model and Update Mechanism
Each task occupies its own interval of program memory. The vector ARen succinctly describes current AR membership. The operations on ARen are dictated by:
- Upon detection of a violation (via the integrity monitor or an illegal memory access), ARen[taskid] ← 0 is performed and the PAIR trigger fires.
- On completion of a trusted software update, AR is fully reinstated: ARen ← 1…1 (all bits 1).
Formally, the requirement to clear task i from AR upon a trigger is that the bitwise AND of ARen with a mask that sets only the i-th bit equals zero, where zero is the all-zero vector.
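The two update operations and the post-trigger condition can be sketched as bit manipulations. This is a hedged software illustration, assuming ARen is held in a Python integer with bit i standing for task i; the function names are hypothetical and do not come from PAIR.

```python
def remove_task(aren: int, i: int) -> int:
    """Clear bit i: task i leaves AR after a detected violation."""
    return aren & ~(1 << i)


def reinstate(n: int) -> int:
    """Set all n bits: AR is fully restored after a trusted update."""
    return (1 << n) - 1


def cleared(aren: int, i: int) -> bool:
    """Check that ARen AND a mask with only bit i set is the zero vector."""
    return (aren & (1 << i)) == 0


N = 3                        # hypothetical number of tasks
aren = reinstate(N)          # 0b111: every task in AR
aren = remove_task(aren, 1)  # task 1 violated: 0b101
print(bin(aren), cleared(aren, 1))
```

Removal is idempotent in this model: clearing an already-cleared bit leaves ARen unchanged, so a repeated violation report for the same task has no further effect on membership.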
3. Runtime Algorithmic Operation
PAIR instantiates AR management entirely in hardware as small finite state machines (FSMs) that track program counter, memory-access signals, and interrupts in parallel with the CPU. The following pseudocode summarizes its runtime logic:
    for i in 0…N-1:
        ARen[i] ← 1                  # all tasks start in AR

    on integrity_monitor.violation() or illegal_access():
        trigger_nmi()                # non-maskable interrupt to RTOS
        ARen[taskid] ← 0             # remove violating task from AR

    on nmi_from_PAIR():
        RTOS.kill(current_task)
        RTOS.schedule_next(ready_tasks ∩ AR)

    on PC == SWexit:
        ARen ← 1…1                   # all bits set: reinstate AR completely
All AR transitions are thus atomic, bounded by a single interrupt and RTOS context-switch. No modifications are required to the RTOS except a "kill-and-yield trampoline" at the NMI vector and marking the trusted update routine's exit point.
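The event handlers in the pseudocode can be exercised with a small software model. This is only a sketch: PAIR implements the logic in hardware, while the class below, its method names, and the "lowest ready index" scheduling rule are simplifying assumptions made for illustration. The model clears the ARen bit before running the NMI handler so that the scheduler never sees the violating task.

```python
class PairModel:
    """Toy software model of the AR logic (the real mechanism is hardware)."""

    def __init__(self, n_tasks: int):
        self.n = n_tasks
        self.aren = [1] * n_tasks  # all tasks start in AR
        self.killed = []           # tasks killed by the NMI handler
        self.current = 0           # index of the running task, or None

    def on_violation(self) -> None:
        """Violation or illegal access: drop the running task, raise the NMI."""
        self.aren[self.current] = 0
        self.on_nmi()

    def on_nmi(self) -> None:
        """RTOS side: kill the running task, then pick the next task in AR."""
        self.killed.append(self.current)
        ready = [i for i in range(self.n) if self.aren[i] == 1]
        self.current = ready[0] if ready else None  # assumed policy

    def on_sw_exit(self) -> None:
        """Trusted update finished: reinstate every task."""
        self.aren = [1] * self.n
        if self.current is None:
            self.current = 0


model = PairModel(3)
model.current = 1     # pretend task 1 is running
model.on_violation()  # task 1 misbehaves
print(model.aren, model.current)
```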
4. Key Safety, Liveness, and Schedulability Properties
Key formal properties established in PAIR's framework include:
- Safety (Post-Violation Integrity):
- Only tasks that have not violated their integrity policies remain eligible for dispatch (ARen[i] = 1).
- Violating tasks are irrevocably barred until an explicit software update (guaranteed by LTL sub-properties).
- Liveness (Post-Violation Availability):
- After removing a violating task, the system always schedules the next available task from AR; the remainder of the system proceeds uninterrupted.
- Catastrophic all-system aborts are avoided; availability is maximized subject to integrity enforcement.
- Schedulability:
- No additional run-time overhead is introduced on the system’s critical path.
- All removals incur only the single NMI context-switch, with RTOS schedulability guarantees unchanged for non-violating tasks.
5. Example Scenarios and Applications
PAIR's AR mechanism is evaluated on RIOT RTOS examples and BEEBS benchmarks:
- In a "sched_round_robin" scenario with two tasks at equal priority, AR initially contains both tasks. If one of them triggers a Control-Flow Integrity (CFI) violation, PAIR’s hardware triggers the NMI, the violating task is killed, and AR reduces to the other task alone. The scheduler continues with the remaining task exclusively; the violating task is excluded until a software update.
- In the BEEBS "lcdnum" benchmark, repeated violation by one task leaves the well-behaved task in AR, maintaining correct functionality for system-critical operations while preventing propagation of erroneous or compromised behavior.
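The round-robin scenario can be traced with a short simulation. This is a hedged sketch, not RIOT's scheduler or the PAIR evaluation code: the function name, the fixed time slices, and the representation of violations as a set of slice numbers are assumptions for illustration.

```python
from collections import deque


def round_robin(n_tasks, slices, violations):
    """Return the task index that runs in each time slice.

    `violations` is a set of slice numbers; the task running in such a
    slice commits a violation and is removed from AR afterwards.
    """
    queue = deque(range(n_tasks))  # AR initially holds every task
    timeline = []
    for t in range(slices):
        if not queue:
            break                  # AR is empty: nothing left to run
        task = queue.popleft()
        timeline.append(task)
        if t in violations:
            continue               # killed: not re-queued, so out of AR
        queue.append(task)         # well-behaved: back of the queue
    return timeline


# Two tasks alternate; task 0 violates in slice 2 and is excluded.
print(round_robin(2, 6, {2}))  # → [0, 1, 0, 1, 1, 1]
```

After the violation, the remaining task receives every slice, which matches the behavior described for the two-task example.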
6. Integration with RTOS and Hardware Stack
PAIR’s architecture is entirely hardware-assisted, using two small FSMs:
- A "Trigger FSM" to process violation assertions and manage the NMI line.
- An "AR-FSM" to update membership on violation and software update events.
Critical state, including ARen and the task bounds, is protected in a dedicated memory region (DPAIR). The design requires only minimal additions to integrate with existing RTOSes, such as RIOT OS, and reuses their scheduler and context-switch mechanisms for AR transitions.
The PAIR core introduces only 10 lookup tables (LUTs) and zero flip-flops to the openMSP430 baseline (about 1.3% overhead of a CFI-only build). Combined PAIR+IM footprint remains smaller than many CFI or enclave-based mechanisms, and memory overhead for ARen and task bounds stays below 2.3% of available data memory in tested configurations.
7. Performance and Deadline Guarantees
PAIR’s management of AR is off-path: all monitoring occurs in hardware, in parallel with program execution. User tasks incur no execution-time overhead apart from the NMI and the subsequent context switch, both of which are already accounted for in real-time scheduling policies. Empirical measurements on a Basys3 FPGA show a static power increase of only +5 mW, with the PAIR core requiring 2–3x less area than software-intensive integrity-checking alternatives.
As revocation of tasks is handled by the RTOS’s normal preemption logic, existing worst-case execution time (WCET) and schedulability analyses remain valid for non-violating tasks. Removal of a misbehaving task may in some instances result in less CPU contention and improved deadline satisfaction for the remaining admissible workload (Caulfield et al., 18 Nov 2025).
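To illustrate how removing a task can reduce contention, the following sketch computes total CPU utilization before and after one task leaves AR. The task set is invented for the example and does not come from the PAIR evaluation. The check uses the classic Liu and Layland utilization bound for rate-monotonic scheduling, chosen here only as a familiar example analysis; it is a sufficient test, so failing it does not by itself prove that deadlines are missed.

```python
def utilization(tasks):
    """Sum of C_i / T_i over (wcet, period) pairs."""
    return sum(c / t for c, t in tasks)


def rm_bound(n):
    """Liu-Layland sufficient utilization bound for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


# Hypothetical task set: (worst-case execution time, period) in ms.
tasks = [(2, 10), (3, 15), (10, 20)]

before = utilization(tasks)     # 0.2 + 0.2 + 0.5
after = utilization(tasks[:2])  # third task removed from AR

print(before <= rm_bound(3))  # bound for 3 tasks is about 0.780
print(after <= rm_bound(2))   # bound for 2 tasks is about 0.828
```

In this made-up case the full task set exceeds the bound, while the two remaining tasks fall well below it.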