RunPBA: Runtime Attestation for MCUs
- RunPBA is a hardware-based runtime attestation framework for microcontrollers that leverages ARM PACBTI to enforce control-flow integrity in resource-constrained systems.
- It integrates with industry standards such as ARM PSA and Trusted Firmware-M, imposing low runtime overhead (≈1.1% to 4.7%) and comparable energy overhead while eliminating the need for additional hardware.
- The framework targets control-flow attacks such as ROP, JOP, and FOP by providing real-time attestation and fault reporting without requiring shadow stacks or extra memory; FOP coverage remains partial (see Limitations and Future Work).
RunPBA is a hardware-based runtime attestation framework designed for microcontrollers, leveraging the Pointer Authentication and Branch Target Identification (PACBTI) extension introduced in ARMv8.1-M. Its primary goal is to provide robust, low-overhead defenses against control-flow attacks in resource-constrained embedded environments without requiring custom hardware modifications, while integrating seamlessly with industry-standard attestation mechanisms such as ARM PSA and Trusted Firmware-M (Cirne et al., 14 Dec 2025).
1. Motivation and Threat Model
The proliferation of embedded microcontrollers (MCUs) in critical infrastructure has intensified the need for protecting these systems from control-flow hijacking attacks, including Return-Oriented Programming (ROP), Jump-Oriented Programming (JOP), and Function-Oriented Programming (FOP). Constraints typical to MCUs—low energy consumption, strict timing requirements, and limited computational resources—render conventional software-based Control-Flow Integrity (CFI) and attestation schemes unsuitable due to prohibitive overhead or dependence on non-standard silicon.
RunPBA specifically aims to:
- Detect all runtime deviations from legitimate control flow, closing “time-of-check, time-of-use” (TOCTOU) races inherent to traditional attestation models.
- Operate exclusively with off-the-shelf ARM Cortex-M silicon supporting PACBTI, with no hardware additions.
- Limit runtime, energy, and code-size overhead to a few percent.
- Interoperate with attestation standards including PSA and Trusted Firmware-M.
The threat model assumes an adversary who can read or write arbitrary memory in the Non-Secure Processing Environment (NSPE) that is not covered by Memory Protection Unit (MPU) and Data Execution Prevention (DEP) controls, and who can mount ROP/JOP/FOP attacks and brute-force PAC tags. Physical attacks, side-channel attacks, attacks on the Secure Processing Environment (SPE), and compromise of long-term attestation keys are out of scope (Cirne et al., 14 Dec 2025).
2. PACBTI Extension on ARM Cortex-M
The PACBTI extension, available on ARM Cortex-M85/M52 via ARMv8.1-M, integrates Pointer Authentication (PA) and Branch Target Identification (BTI) into the instruction set:
- Pointer Authentication (PA): Every code pointer is “signed” with a 32-bit Pointer Authentication Code (PAC) generated by a keyed MAC (e.g., QARMA). The PAC is stored in a reserved register. When the pointer is used, the MAC is recomputed with context-dependent modifiers (typically the stack pointer) and checked before use. A mismatch triggers a UsageFault exception.
- Branch Target Identification (BTI): Each valid indirect branch target must be marked with a BTI “landing pad” instruction. The hardware blocks indirect branches to any instruction that lacks the BTI opcode, so indirect control-flow transfers can land only on marked function entries.
- Pipeline Integration: With PACBTI-enabled toolchains (LLVM-17, ArmCC), function calls and returns are instrumented to add pointer authentication and branch safety checks automatically. The PA hardware stage executes MAC operations (5–10 cycles per PAC/AUT instruction); BTI checks occur as a single pipeline cycle without explicit instruction overhead (Cirne et al., 14 Dec 2025).
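The sign-then-authenticate flow and the landing-pad check described above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the hardware behavior: truncated HMAC-SHA256 stands in for the hardware MAC, and the names `pac`, `aut`, `indirect_branch`, and `UsageFault` are hypothetical helpers introduced only for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the PAC key loaded at boot.
KEY = secrets.token_bytes(16)


class UsageFault(Exception):
    """Models the exception raised on a PAC or BTI violation."""


def pac(pointer, modifier, key=KEY):
    """Return a 32-bit tag over (pointer, modifier).

    Assumption: HMAC-SHA256 truncated to 32 bits replaces the hardware MAC.
    """
    msg = pointer.to_bytes(4, "little") + modifier.to_bytes(4, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "little")


def aut(pointer, modifier, tag, key=KEY):
    """Recompute the tag and fault on mismatch before the pointer is used."""
    if pac(pointer, modifier, key) != tag:
        raise UsageFault("PAC mismatch")
    return pointer


def indirect_branch(target, landing_pads):
    """Allow an indirect branch only to an address marked as a landing pad."""
    if target not in landing_pads:
        raise UsageFault("BTI violation")
    return target
```

In this model, overwriting a saved return address without knowing the key leaves the stored tag stale, so `aut` raises; changing the modifier (for example, a different stack pointer value) has the same effect.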
3. End-to-End Runtime Attestation Workflow
RunPBA extends the ARM PSA Initial Attestation service (Trusted Firmware-M’s Secure Partition) to include real-time control-flow integrity status:
- Attestation Protocol:
  - The verifier sends a nonce challenge to the device.
  - The attestation service in the Secure Processing Environment queries the RunPBA Application Root-of-Trust for the current PACBTI status and fault history, and reads the NSPE control registers.
  - Claims are packed into a CBOR Web Token, including security lifecycle bits (e.g., NSPE compromised on PACBTI fault, RunPBA's own error status), PAC/BTI enablement flags, the received nonce, and static firmware measurements.
  - The token is signed with a device-unique key and returned to the verifier, which checks the signature, nonce, and security status bits.
- Fault Management:
  - Hardware faults (UsageFault) resulting from PAC/BTI violations in the NSPE are escalated via modified SPE HardFault handlers and logged by RunPBA’s partition for attestation reporting.
- Token Generation Example:

```
function GenerateAttestation(nonce N):
    runpba_state = RunPBA.get_state()
    ctlr = Read(NSPE.PACBTI_CTRL)
    claims = {
        "lifecycle": (plat_lc << 8) | runpba_state,
        "pac_enabled": ctlr.PA,
        "bti_enabled": ctlr.BTI,
        "nonce": N,
        ...
    }
    token = CBOR_Web_Token(claims)
    sig = Sign(device_priv_key, token)
    return (token, sig)
```
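The challenge-response exchange can also be sketched end to end, including the verifier-side checks on signature, nonce, and status bits. The sketch below makes several simplifying assumptions that are not from the source: JSON replaces CBOR, HMAC-SHA256 with a shared key replaces the device-unique signing key, and the state encodings, `plat_lc` value, and function names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical shared key; stands in for the device-unique signing key.
DEVICE_KEY = secrets.token_bytes(32)

# Hypothetical encodings of the RunPBA state byte.
RUNPBA_OK = 0x00
RUNPBA_NSPE_COMPROMISED = 0x01


def generate_attestation(nonce, plat_lc, runpba_state, pa, bti, key=DEVICE_KEY):
    """Device side: pack the claims and sign the serialized token."""
    claims = {
        "lifecycle": (plat_lc << 8) | runpba_state,
        "pac_enabled": pa,
        "bti_enabled": bti,
        "nonce": nonce.hex(),
    }
    token = json.dumps(claims, sort_keys=True).encode()  # JSON instead of CBOR
    sig = hmac.new(key, token, hashlib.sha256).digest()
    return token, sig


def verify_attestation(token, sig, expected_nonce, key=DEVICE_KEY):
    """Verifier side: check signature, nonce freshness, and status bits."""
    expected_sig = hmac.new(key, token, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected_sig):
        return False
    claims = json.loads(token)
    if claims["nonce"] != expected_nonce.hex():
        return False
    if (claims["lifecycle"] & 0xFF) != RUNPBA_OK:
        return False
    return claims["pac_enabled"] and claims["bti_enabled"]
```

A verifier would generate a fresh nonce per challenge (for example with `secrets.token_bytes(16)`) and reject any token whose nonce does not match, which prevents replay of an earlier healthy report.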
4. Implementation Architecture
The RunPBA proof-of-concept is implemented as an Application Root-of-Trust (RoT) partition in Trusted Firmware-M v1.5, evaluated using Arm Corstone-310 FVP and Renesas EK-RA8M1 (Cortex-M85):
- TF-M Integration: The TF-M core requires minor modifications: UsageFaults raised in the NSPE are escalated to SPE HardFaults, and the attestation code is extended to include PACBTI status obtained from the RunPBA partition.
- RunPBA Partition: Responds to PACBTI faults with a low-latency first-level interrupt handler (FLIH) that snapshots the fault context, and a deferred second-level handler (SLIH) that securely persists the fault evidence.
- Initialization: At boot, PACBTI keys are generated from the TF-M random number generator, PAC/BTI are enabled, and PAC tags are calculated as per ISA requirements.
- Toolchain: Builds employ LLVM 17.0.1 with appropriate CPU/mattr flags and ARM GNU Toolchain 13.2.
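The split between the first-level and second-level handlers in the RunPBA partition can be illustrated with a small model. This is a sketch of the general FLIH/SLIH pattern under assumptions, not the TF-M implementation: the class, field names, and state strings are hypothetical, and a Python list stands in for secure persistent storage.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultSnapshot:
    pc: int    # faulting program counter
    lr: int    # link register at the time of the fault
    kind: str  # "PAC" or "BTI"


class RunPBAPartitionModel:
    def __init__(self):
        self._pending = deque()  # snapshots captured but not yet persisted
        self.fault_log = []      # stands in for secure persistent storage

    def flih(self, pc, lr, kind):
        """First-level handler: capture context only, defer everything else."""
        self._pending.append(FaultSnapshot(pc, lr, kind))

    def slih(self):
        """Second-level handler: persist pending snapshots, return the count."""
        count = 0
        while self._pending:
            self.fault_log.append(self._pending.popleft())
            count += 1
        return count

    def get_state(self):
        """Report a compromised NSPE once any fault has been observed."""
        if self.fault_log or self._pending:
            return "NSPE_COMPROMISED"
        return "OK"
```

Keeping the first-level handler to a single queue append reflects the design goal of minimal time in interrupt context; the slower persistence work runs later in the deferred handler.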
The incremental memory footprint due to PACBTI code transformations is approximately 6–7% (including RunPBA’s 16 kB TF-M API stubs) (Cirne et al., 14 Dec 2025).
5. Performance Evaluation
The efficiency and practicality of RunPBA are demonstrated through detailed benchmarking:
- Benchmarks: BEEBS (30 integer/memory/control microbenchmarks) and CoreMark-Pro (industry standard for embedded CPUs) on Renesas EK-RA8M1 with Nordic Power Profiler Kit II.
- Measurement Methodology: Both suites are compiled with optimizations disabled (-O0) to maximize branch diversity; every benchmark is run with PACBTI enabled and with PACBTI disabled.
- Calculation of Overhead: Time and energy overheads computed as geometric means of the ratio of PACBTI-enabled to baseline execution for each test, excluding any artificial decreases.
- Results:
  - BEEBS: 4.7% time and 4.8% energy overhead (max test: 28% for heavy recursion), code size increase +6.8%.
  - CoreMark-Pro: 1.1% time and 1.0% energy overhead (max: 2.6% for large FFT), code size increase +4.8%.
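The overhead calculation described above reduces to a geometric mean over per-test ratios. The sketch below shows one possible reading of that method; treating a ratio below 1.0 as a measurement artifact and clamping it to 1.0 is an assumption about what "excluding artificial decreases" means, and the function name and input format are hypothetical.

```python
from statistics import geometric_mean


def overhead_percent(baseline, protected):
    """Geometric-mean overhead in percent of protected over baseline runs.

    Both arguments map a benchmark name to a measurement (time or energy).
    """
    ratios = []
    for name, base in baseline.items():
        ratio = protected[name] / base
        # Assumption: an apparent speed-up is an artifact, so clamp to 1.0.
        ratios.append(max(ratio, 1.0))
    return (geometric_mean(ratios) - 1.0) * 100.0
```

The geometric mean is the usual choice for averaging ratios because a single slow benchmark (such as the recursion-heavy outlier) does not dominate the result the way it would in an arithmetic mean.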
The following table compares RunPBA with prior hardware-assisted CFI approaches (Silhouette, SuM) in terms of runtime overhead, code size, and extra memory requirements:
| Solution | BEEBS Overhead | CoreMark-Pro Overhead | Code Size Overhead | Extra Memory |
|---|---|---|---|---|
| Silhouette [Zhou20] | 3.4% | 1.3% | 16.2% | parallel shadow stack |
| SuM [Choi24] | 2.8% | 2.6% | 8.8% | compact shadow stack + DWT |
| RunPBA (PACBTI) | 4.7% | 1.1% | 6.8% | none (uses r12 register) |
RunPBA eliminates the need for shadow stacks or extra regions, leveraging in-silicon hardware MACs for pointer authentication (Cirne et al., 14 Dec 2025).
6. Limitations and Future Work
Notable limitations include:
- Exposure to FOP-style attacks, as BTI permits indirect branches to any function prologue unless prevented by compiler hardening.
- Potential TOCTOU exploitation if privileged NSPE code disables PACBTI and reenables it before the next attestation, since RunPBA only checks status at attestation time.
- Absence of hardware watchpoints for PACBTI control registers, preventing real-time enforcement of register integrity.
- Necessity for TF-M modifications, particularly for correct fault escalation handling.
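The TOCTOU limitation listed above can be made concrete with a toy model of the control register. This is an illustrative sketch only: the class and function names are hypothetical, and the write history exists purely inside the model to show what point-in-time sampling misses; the source notes that the hardware offers no watchpoint to capture it.

```python
class PacbtiControlModel:
    """Toy model of the NSPE PACBTI enable bit."""

    def __init__(self):
        self.enabled = True
        self.history = [True]  # model-only ground truth, not available on hardware

    def write(self, enabled):
        self.enabled = enabled
        self.history.append(enabled)


def sample_status(ctrl):
    """What an attestation sees: the enable bit at the moment of sampling."""
    return ctrl.enabled


def was_ever_disabled(ctrl):
    """Ground truth in the model: whether protection was ever switched off."""
    return not all(ctrl.history)
```

If privileged NSPE code calls `write(False)`, runs unprotected, and then calls `write(True)` before the next challenge, `sample_status` still reports protection as enabled even though `was_ever_disabled` is true.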
Areas for future research involve:
- Selective PAC/BTI instrumentation via per-CFG analysis to further minimize code overhead.
- Automated FOP detection/mitigation.
- Extension to PAC-based data-flow attestation.
- Formal verification of RunPBA SPE handler and attestation APIs (Cirne et al., 14 Dec 2025).
7. Summary and Impact
RunPBA establishes that robust runtime attestation and CFI enforcement on MCUs are achievable with marginal overhead by leveraging the mainstream ARM PACBTI extension and industry-compliant attestation frameworks. This approach obviates the need for bespoke silicon or invasive memory modifications, advancing practical security guarantees for next-generation embedded systems (Cirne et al., 14 Dec 2025).