Cyclic Memory Protection (CMP) Overview
- Cyclic Memory Protection (CMP) is a technique that segments system execution into discrete cycles to enforce memory protection and robust control-flow integrity.
- It employs per-cycle MPU reconfiguration, randomized stack allocation, and cryptographic validations to prevent attacks like ROP, return2libc, and return2shellcode.
- CMP demonstrates practical benefits in real-time and concurrent settings, achieving significant throughput improvements (e.g., +892% over Moodycamel) and low overhead per cycle.
Cyclic Memory Protection (CMP) is a methodology that enforces memory and resource safety by segmenting system or data-structure execution into discrete, repeating cycles, within which memory protection and reclamation are governed according to cycle-aligned invariants. CMP arises independently in both real-time unmanned systems (through cycle-task-oriented memory protection) and in the design of highly concurrent lock-free data structures (providing simple, robust memory reclamation for concurrent queues). CMP’s core premise is to bound the temporal window during which memory or control state can be accessed, thereby enabling low-overhead, coordination-free safety guarantees.
1. CMP in Real-Time Systems: Cycle-Task-Oriented Memory Protection
In the context of unmanned systems, as instantiated by CToMP, CMP partitions execution into a sequence of strictly timed cycles $C_i$ indexed by $i$, each of length $T = 1/f$, where $f$ is the global main loop frequency. Within each $C_i$, a fixed set of tasks is dispatched by the RTOS scheduler according to priority and timer metadata.
For each $C_i$, a control-flow graph (CFG) $G_i = (V_i, E_i)$ is constructed, where $V_i$ are the entry/exit nodes of all code modules and $E_i$ are the observed (or allowed) control-flow transitions. The system enforces control-flow integrity (CFI) at the cycle granularity:
- Prior to $C_i$, the MPU is reprogrammed once to establish a memory-view for all code/data/buffers accessed in the cycle.
- A randomized, per-cycle stack (process stack pointer, PSP) and dynamic buffers are allocated; all such allocations are made inaccessible after $C_i$.
- At the cycle’s end, the system validates that only legal transitions in $E_i$ occurred, verifying this using a cryptographic cycle counter hash (e.g., over privileged update logs) to check for unauthorized state changes.
Privilege switches and MPU region changes occur only twice per cycle (start/end), amortizing costs and reducing preemption windows and attack surfaces compared to per-task protection schemes.
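The amortization argument can be illustrated with a small counting model. This is a sketch, not CToMP code: the function names are hypothetical, and the per-task variant assumes one privilege drop and one privilege raise around every task, which simplifies how per-task schemes actually behave.

```python
def run_cycle_granular(tasks):
    """Model one cycle with a single protection boundary around all tasks."""
    events = ["mpu_config", "drop_privilege"]
    events += [f"run:{t}" for t in tasks]
    events += ["raise_privilege", "validate"]
    return events


def run_task_granular(tasks):
    """Model one cycle where every task has its own boundary (assumed behavior)."""
    events = []
    for t in tasks:
        events += ["mpu_config", "drop_privilege", f"run:{t}", "raise_privilege"]
    return events


def privilege_switches(events):
    """Count privilege transitions in an event trace."""
    return sum(e in ("drop_privilege", "raise_privilege") for e in events)


tasks = [f"task{j}" for j in range(49)]  # 49 tasks, as in the Ardupilot setup
print(privilege_switches(run_cycle_granular(tasks)))  # 2
print(privilege_switches(run_task_granular(tasks)))   # 98
```

In this model the cycle-granular count stays at two regardless of how many tasks are scheduled, while the per-task count grows linearly with the task set.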
2. Taxonomy and Modeling of Memory Corruption Attacks
CMP in CToMP formalizes three principal classes of memory corruption attacks (MCAs) in embedded and unmanned systems:
- return2libc: Stack return addresses are overwritten to direct execution to privileged existing system functions (e.g., `kill()`, `up_pwm_servo_set()`), with attackers controlling arguments but unable to inject new code.
- return2shellcode: The return address is hijacked to target attacker-injected code residing on the stack or heap, enabling arbitrary code execution.
- ROP (Return-Oriented Programming): Pre-existing short instruction sequences (“gadgets”) ending in `ret` are chained together, constructing complex illicit control flow by traversing gadgets in the CFG.
These attacks are all modeled as unauthorized traversals of the current cycle’s CFG, specifically edges targeting privileged code or unprotected memory.
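A minimal sketch of this modeling, assuming the allowed edges of a cycle are available as a set of (source, destination) pairs. The node names and the helper `illegal_transitions` are illustrative assumptions, not part of CToMP.

```python
def illegal_transitions(allowed_edges, trace):
    """Return every (src, dst) step in `trace` that is not an allowed CFG edge."""
    steps = zip(trace, trace[1:])
    return [edge for edge in steps if edge not in allowed_edges]


# Hypothetical per-cycle CFG; node names are illustrative only.
allowed = {
    ("scheduler", "read_sensors"),
    ("read_sensors", "update_attitude"),
    ("update_attitude", "scheduler"),
}

benign = ["scheduler", "read_sensors", "update_attitude", "scheduler"]
hijacked = ["scheduler", "read_sensors", "up_pwm_servo_set"]  # return2libc-style jump

print(illegal_transitions(allowed, benign))    # []
print(illegal_transitions(allowed, hijacked))  # [('read_sensors', 'up_pwm_servo_set')]
```

On real hardware the check is not performed over a recorded trace; the MPU faults on the offending access. The set-membership test only mirrors the abstract model.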
3. Cycle-Granular Control-Flow Integrity Enforcement
CToMP’s CMP approach to CFI is characterized by:
3.1 Pre-Cycle Setup
- On each SysTick (or equivalent), a privileged “cycle_start” handler establishes the MPU regions for the upcoming cycle.
- A new process stack of fixed size (`STACK_SIZE`) is allocated at a random offset within a dedicated memory pool using a hardware TRNG, and mapped with the MPU to be accessible in unprivileged mode.
- The system switches to unprivileged mode, and the process stack pointer (`PSP`) is set to this stack.
3.2 In-Cycle Execution
- All tasks execute in unprivileged mode using the fixed protected memory-view for the current cycle.
- Attempts to write to protected read-only timer state variables (`ticks`, `last_run`) generate traps.
3.3 Post-Cycle Validation
- Upon the first SVC after the last scheduled task, the “cycle_end” handler is invoked in privileged mode.
- The system switches back to the main stack and privileged mode.
- All dynamically allocated per-cycle buffers, including the process stack, are released.
- The SHA-256 hash of all privileged update operations is compared to reference values to detect illicit state mutation.
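The hash comparison in the last step can be sketched as follows. The log format (name/value pairs) and the way the reference sequence is obtained are assumptions made for illustration; the source does not specify them.

```python
import hashlib


def log_digest(updates):
    """SHA-256 over a sequence of (name, value) privileged updates."""
    h = hashlib.sha256()
    for name, value in updates:
        h.update(f"{name}={value};".encode("utf-8"))
    return h.hexdigest()


def validate_cycle(observed_updates, expected_updates):
    """Return True when the observed privileged updates match the reference."""
    return log_digest(observed_updates) == log_digest(expected_updates)


# Hypothetical update log for one cycle.
expected = [("ticks", 101), ("last_run[0]", 101)]
tampered = [("ticks", 101), ("last_run[0]", 7)]

print(validate_cycle(expected, expected))  # True
print(validate_cycle(tampered, expected))  # False
```

Hashing the whole log keeps the stored reference at a fixed 32 bytes regardless of how many updates a cycle performs.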
3.4 Runtime Boundary Pseudocode
    function cycle_start():
        configure_MPU(z regions)
        psp_addr = mem_alloc(STACK_SIZE)
        __set_PSP(psp_addr + STACK_SIZE)
        __set_CONTROL(SP_PROCESS)   // unprivileged stack
        while (tasks remain) run next τ

    function cycle_end_svc_handler():
        __set_CONTROL(SP_MAIN)      // privileged stack
        ticks += 1
        for each τ_j: last_run[j] = ticks
        mem_free(all scratch buffers + old PSP)
        return_to_thread
4. Secure Process Stack: Randomization and Memory-Pool Allocation
The secure process stack mechanism randomizes the location of the stack for every cycle to prevent return2shellcode and related attacks. The process is as follows:
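- At link time, reserve a contiguous memory pool of fixed size (5,632 B in the reported configuration) at a known base address.
- To allocate a stack for a cycle, draw a random value from the on-chip TRNG and use it to select the stack's offset within the pool. Ensure alignment and avoid overlap with other regions using an allocation table (≤72 B for dynamic regions).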
- Stack and all per-cycle allocations are marked read/write in unprivileged mode by the MPU, with all other addresses unreachable.
- Because the stack’s base is randomized per cycle, prediction of stack start addresses (and hence reliable shellcode placement) is infeasible.
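The allocation steps above can be sketched in a few lines. This is a model under stated assumptions, not the CToMP allocator: `STACK_SIZE` and `ALIGN` are illustrative values, Python's `secrets` module stands in for the hardware TRNG, and the allocation table is a plain list of (offset, size) pairs.

```python
import secrets

POOL_SIZE = 5632   # bytes; static pool size reported for CToMP
STACK_SIZE = 1024  # bytes; assumed for illustration
ALIGN = 32         # bytes; assumed alignment granule


def overlaps(start, size, regions):
    """True if [start, start + size) intersects any (start, size) in regions."""
    return any(start < r + n and r < start + size for r, n in regions)


def alloc_random_stack(pool_base, regions, max_tries=64):
    """Pick an aligned, non-overlapping stack offset inside the pool at random."""
    slots = (POOL_SIZE - STACK_SIZE) // ALIGN + 1
    for _ in range(max_tries):
        offset = secrets.randbelow(slots) * ALIGN  # stand-in for the TRNG
        if not overlaps(offset, STACK_SIZE, regions):
            regions.append((offset, STACK_SIZE))   # record in the allocation table
            return pool_base + offset
    raise MemoryError("no free slot found for the per-cycle stack")


table = []
stack_base = alloc_random_stack(0x20000000, table)
print(hex(stack_base), table)
```

The number of candidate offsets, and therefore how hard the stack base is to guess, depends on the pool size, the stack size, and the alignment granule.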
5. Security Properties and Formal Guarantees
5.1 Resistance to Return2libc
- Critical privileged functions reside in execute-only MPU regions and are accessible solely in privileged mode.
- Any attempt by unprivileged code to return to these addresses triggers an MPU fault, terminating the attack.
5.2 Resistance to ROP
- Privileged gadgets cannot be accessed, as they are protected by the MPU.
- Unprivileged gadget chains are unable to modify MPU-protected code or data, and attempts to do so cause faults.
5.3 Resistance to Return2shellcode
- The process stack pointer’s randomization per cycle makes stack addresses unpredictable.
- Even if an attacker manages to inject code and overwrite the return address, the probability of hitting a mapped and executable region is negligible.
5.4 Formal CFI Inductive Proof
Let $E_i$ denote the set of all allowed CFG edges in cycle $C_i$. The security argument proceeds as follows:
- Base case: Prior to $C_0$, the MPU enforces $E_0$ only.
- Inductive step: During $C_i$, only transitions in $E_i$ occur, as MPU faults prevent others; post-cycle validation checks the consistency of privileged updates.
- At $C_{i+1}$, the process repeats with a new MPU configuration. Consequently, no illicit edge can ever be traversed.
6. Performance Metrics and System Footprint
The optimized per-cycle granularity of CMP achieves substantial efficiency benefits compared to traditional per-task schemes (e.g., MINION):
| Metric | CMP (CToMP) | Task-Oriented (MINION) |
|---|---|---|
| Per-cycle overhead | 117 μs | 420 μs |
| Memory pool (static) | 5,632 B | N/A |
| Dynamic region tracking array | ≤72 B | N/A |
| Total extra SRAM | ≈5.7 KB | N/A |
| MPU regions used | z≈8 | N/A |
- On Ardupilot at 400 Hz, both the baseline and CMP run all 49 tasks at their required frequencies; under MINION, low-priority tasks lose up to 50% of their frequency.
- On Crazyflie, >20 tasks sustain zero measurable frequency loss under CMP.
- Even the worst-case total dynamic allocation time per cycle is less than the 2.5 ms cycle duration at 400 Hz, and allocation time can be reduced by 26.5% when allocating large buffers first.
CMP thus maintains strong CFI and prohibits MCAs while imposing overheads of only 100–120 μs per cycle and about 6 KB of RAM, well within the SRAM capacity of platforms with 192–512 KB.
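As a quick check of these figures, the per-cycle overhead can be expressed as a fraction of the 2.5 ms loop period at 400 Hz, and the extra SRAM as a fraction of the smallest platform. The helper name is hypothetical, and 192 KB is interpreted as 192 × 1024 bytes, which is an assumption.

```python
def cycle_budget_fraction(overhead_us, loop_hz):
    """Fraction of one main-loop period consumed by protection overhead."""
    period_us = 1_000_000 / loop_hz
    return overhead_us / period_us


print(f"{cycle_budget_fraction(117, 400):.2%}")  # 4.68%  (CMP / CToMP)
print(f"{cycle_budget_fraction(420, 400):.2%}")  # 16.80% (MINION)

extra_sram = 5632 + 72  # static pool plus dynamic region tracking array, in bytes
print(extra_sram)                           # 5704
print(f"{extra_sram / (192 * 1024):.2%}")   # 2.90% of a 192 KB SRAM
```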
7. CMP in High-Concurrency Queues: Coordination-Free Bounded Reclamation
Independent of embedded systems, the CMP paradigm underlies the design of high-concurrency lock-free FIFO queues in the context of large-scale AI and parallel systems (Motiwala, 12 Nov 2025). CMP for lock-free queues is characterized by:
- Dual protection: Each node is protected by a state machine (AVAILABLE → CLAIMED → reclaimed) and a cycle-based sliding window.
- Memory reclamation: No node is reclaimed until it has been out of reach for a fixed number of dequeue cycles (the protection window); this window is finite, tunable, and independent of thread count.
- Queue invariants: Strict FIFO is upheld; enqueue linearizes on successful pointer addition, dequeue on successful state transition, and reclamation on fully passing out of the protection window.
- Performance: CMP outperforms production lock-free queues under heavy contention, e.g., with 64 producers/consumers it provides +892% throughput over Moodycamel and +325% over Boost.Lockfree. Average enqueue/dequeue latency is also substantially lower, with better robustness to synthetic delay or OS jitter.
CMP’s bounded, coordination-free reclamation enables strict queue semantics with lock-free progress and practical memory safety, at the cost of retaining memory proportional to the window size. Potential extensions include adaptive windows, segmented design, and crash-recoverable variants.
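The window-bounded reclamation rule can be illustrated with a single-threaded model. This is a sketch, not the published algorithm: it uses no atomics, and the class and field names (`WindowedQueue`, `claimed_at`, `retired`) are assumptions made for illustration.

```python
from collections import deque

AVAILABLE, CLAIMED, RECLAIMED = "AVAILABLE", "CLAIMED", "RECLAIMED"


class Node:
    def __init__(self, value):
        self.value = value
        self.state = AVAILABLE
        self.claimed_at = None  # dequeue cycle in which the node was claimed


class WindowedQueue:
    """Single-threaded model of window-bounded reclamation (no atomics)."""

    def __init__(self, window):
        self.window = window
        self.live = deque()     # AVAILABLE nodes in FIFO order
        self.retired = deque()  # CLAIMED nodes still inside the window
        self.deq_cycle = 0

    def enqueue(self, value):
        self.live.append(Node(value))

    def dequeue(self):
        if not self.live:
            return None
        node = self.live.popleft()
        node.state = CLAIMED                # AVAILABLE -> CLAIMED
        self.deq_cycle += 1
        node.claimed_at = self.deq_cycle
        self.retired.append(node)
        self._reclaim()
        return node.value

    def _reclaim(self):
        # Reclaim only nodes claimed at least `window` dequeue cycles ago.
        while (self.retired
               and self.deq_cycle - self.retired[0].claimed_at >= self.window):
            self.retired.popleft().state = RECLAIMED


q = WindowedQueue(window=3)
for i in range(10):
    q.enqueue(i)
print([q.dequeue() for _ in range(10)])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(len(q.retired))                    # 3
```

In this model the number of retained, already-dequeued nodes never exceeds the window, which reflects the stated trade-off: memory retained grows with the window size and not with the number of threads.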
CMP thus unifies a family of defense and reclamation strategies that provide provable temporal and spatial safety, amortized system overhead, and robustness across both real-time embedded and highly concurrent data-structure settings (Ma et al., 2023, Motiwala, 12 Nov 2025).