
Sandlock: Confining AI Agent Code with Unprivileged Linux Primitives

Published 25 May 2026 in cs.CR and cs.OS | (2605.26298v1)

Abstract: AI agents increasingly run untrusted code on developer machines: shell commands generated by LLMs, third-party scripts retrieved at runtime, and tool plugins of unknown provenance. Existing isolation mechanisms impose tradeoffs that fit this workload poorly: containers and microVMs add privilege, image-management, and startup costs, while ad-hoc process controls and wrappers (e.g. chroot, ulimit) provide weak guarantees and little syscall-level control. Sandlock is a lightweight Linux process sandbox organized around a simple split: static, input-independent policy is compiled into kernel-enforced rules, while a narrow supervisor handles runtime-dependent decisions and virtualized effects. This split lets Sandlock enforce filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces. It also supports dynamic network decisions, HTTP-level access control, TOCTOU-safe inspection of execve arguments, and reversible filesystem effects. On our workstation, Sandlock adds roughly 5 ms of startup overhead and runs Redis at bare-metal throughput (within measurement noise); its pipeline operator further supports per-stage confinement for separating data, network, and untrusted-content capabilities. Sandlock is available at https://github.com/multikernel/sandlock

Authors (2)

Summary

  • The paper introduces a novel unprivileged sandbox (Sandlock) that confines AI agent code using Linux primitives with kernel-enforced policy.
  • It employs a split policy model that combines static Landlock and seccomp rules with runtime supervisor mediation to control filesystem, network, and process access.
  • Benchmarks demonstrate near bare-metal performance, with approximately 5 ms startup latency and efficient copy-on-write operations achieving ~1900 forks/s.

Sandlock: Unprivileged Linux Primitives for Agentic Code Confinement

Motivation and Threat Model

The increasing prevalence of AI agent systems executing untrusted code (e.g., model-generated shell commands, runtime-fetched scripts, or plugin invocations) exposes developer workstations to supply-chain risks. The conventional defenses (root-dependent containers, microVMs, and ad-hoc UNIX process controls) fail to meet agent-centric requirements of low latency, strong syscall-level control, and fine-grained, programmable confinement. Sandlock is introduced to address these deficits with an enforceable, unprivileged policy substrate for confining agentic workloads. The threat model assumes frequent, short-lived invocations of potentially attacker-influenced code on developer machines lacking elevated privileges, with network access essential but hazardous, and dynamic multi-stage workflows where staged capability separation is critical (as in prompt injection countermeasures). Sandlock aims to shift confinement from static judgment (LLM policy) to kernel enforcement.

Policy Model: Static and Runtime Enforcement

Sandlock's policy model distinguishes static, input-independent rules (e.g., readable/writable path prefixes, unconditional syscall denials, TCP port/IPC constraints) from runtime-dependent policies (e.g., destination of network calls, execve argv inspection, HTTP method/path controls, and reversible filesystem effects). Static invariants are compiled into Landlock and seccomp-bpf rules, enforced directly by the kernel. Runtime-dependent decisions are mediated by a narrow supervisor using seccomp user notification with pidfd_getfd, enabling programmable per-syscall intervention (e.g., revoking network access after phase transitions detected via execve argv, dynamically inspecting HTTP requests, facilitating copy-on-write filesystem captures).
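The static/runtime split described above can be illustrated with a minimal sketch. The rule kinds, the `Rule` and `SplitPolicy` types, and `split_policy` are hypothetical names chosen for illustration; they are not Sandlock's API, and the sketch only classifies rules rather than installing any kernel policy.

```python
from dataclasses import dataclass, field

# Hypothetical rule kinds (illustrative only, not Sandlock's API).
# Static kinds are input-independent and could be compiled ahead of time.
STATIC_KINDS = {"fs_read_prefix", "fs_write_prefix", "deny_syscall",
                "tcp_port", "ipc_scope"}
# Runtime kinds depend on values only known when a syscall is made.
RUNTIME_KINDS = {"net_destination", "execve_argv", "http_rule",
                 "cow_workspace"}


@dataclass(frozen=True)
class Rule:
    kind: str
    value: str


@dataclass
class SplitPolicy:
    # Rules that would become Landlock / seccomp-bpf policy.
    kernel: list = field(default_factory=list)
    # Rules that would be decided per syscall by the supervisor.
    supervisor: list = field(default_factory=list)


def split_policy(rules):
    """Partition rules into kernel-enforced and supervisor-mediated sets."""
    out = SplitPolicy()
    for rule in rules:
        if rule.kind in STATIC_KINDS:
            out.kernel.append(rule)
        elif rule.kind in RUNTIME_KINDS:
            out.supervisor.append(rule)
        else:
            # Fail closed on anything the classifier does not recognize.
            raise ValueError(f"unknown rule kind: {rule.kind}")
    return out
```

Rejecting unknown rule kinds rather than silently dropping them is a design choice of this sketch: a policy that cannot be classified is treated as an error, not as an implicit allow.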

Sandlock supports multi-stage pipeline composition, allocating distinct policy confines per process pipeline stage, permitting explicit agentic separation (e.g., one stage with access to private data but no network, and another stage with network but no data access).

Implementation Architecture

Sandlock is instantiated as a Rust process sandbox leveraging Landlock (ABI 6+) for static filesystem, TCP port, and IPC policies, seccomp-bpf for unconditional syscall denials, and seccomp user notification for supervisor-mediated runtime decisions. The confinement pipeline applies after fork(), ensuring the supervisor is attached before user code executes. Static policy is strictly enforced at the earliest opportunity, and TOCTOU hazards are precluded by refusing to run below ABI requirements and freezing process aliases during inspection.

The supervisor intercepts syscalls including clone/fork/vfork/clone3 (enforcing process caps), mmap/brk (memory caps), connect/send (network ACLs), bind (on-behalf binding), openat/rename/unlinkat (copy-on-write filesystem effects), and execve (policy transition). For network, Sandlock distinguishes between kernel-enforced direct paths (port-only policies) and supervisor-mediated "on-behalf" paths (host-specific, HTTP-level policies), including opt-in TLS inspection with local proxy and sandbox CA for HTTP method/path granularity.
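To make HTTP method/path granularity concrete, here is a small default-deny allowlist check of the kind a supervisor-side proxy might perform. `HttpRule`, `http_allowed`, and the example host are assumptions made for illustration; the summary does not describe Sandlock's actual rule syntax or matching semantics.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class HttpRule:
    # Hypothetical rule shape: exact method and host, glob on the path.
    method: str
    host: str
    path_glob: str


def http_allowed(rules, method, host, path):
    """Default-deny check: allow only if some rule matches all three fields.

    Note: fnmatch's '*' also matches '/', and this sketch does not
    normalize paths (e.g. '..' segments); a real implementation would
    need to canonicalize the request before matching.
    """
    method = method.upper()
    host = host.lower()
    return any(
        rule.method == method
        and rule.host == host
        and fnmatchcase(path, rule.path_glob)
        for rule in rules
    )


rules = [HttpRule("GET", "api.example.com", "/v1/*")]
read_ok = http_allowed(rules, "get", "API.example.com", "/v1/models")
write_ok = http_allowed(rules, "POST", "api.example.com", "/v1/models")
```

With these rules, `read_ok` is true and `write_ok` is false, since no rule grants POST.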

Resource bounds are maintained via cooperative syscall interposition rather than cgroups. Memory caps are applied on address space allocation; process creation is strictly gated; CPU throttling uses signal cycling. These limits suffice for trusted-but-buggy code; stronger isolation for adversarial resource exhaustion requires cgroups or VM boundaries.
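The memory cap can be pictured as simple bookkeeping on intercepted allocation requests. The sketch below is a toy model under that assumption; `MemoryCap` and its methods are hypothetical names, and the real accounting performed on mmap/brk is not detailed in this summary.

```python
class MemoryCap:
    """Toy supervisor-side address-space budget (illustrative only)."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0

    def on_alloc(self, nbytes):
        """Return True to allow a growth request, False to deny it."""
        if nbytes < 0:
            raise ValueError("allocation size must be non-negative")
        if self.used + nbytes > self.limit:
            return False  # denying leaves the budget unchanged
        self.used += nbytes
        return True

    def on_free(self, nbytes):
        """Give bytes back to the budget, never dropping below zero."""
        self.used = max(0, self.used - nbytes)
```

Because this accounting relies on the sandboxed process's allocations passing through the interception point, it fits the trusted-but-buggy setting the text describes rather than adversarial resource exhaustion.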

Programmable Policy and TOCTOU Safety

Sandlock exposes a runtime callback (policy_fn) in Rust or Python, permitting host applications to observe syscall events (including execve argv) and adjust policies live. This mechanism is designed to be TOCTOU-safe: Sandlock freezes sibling tasks before inspection and denies the syscall if inspection cannot be done safely. Argv inspection is strictly an observation/gating signal; containment is enforced via kernel policy, not user-level matching. Runtime verdicts include allow/deny/audit, and policy tightening takes effect immediately, applying to all subsequent execution.
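A callback of this kind might look roughly like the following sketch, which revokes network access once a particular execve is observed. Only the name policy_fn comes from the text; the `Event`, `Verdict`, and `LivePolicy` types, the callback signature, and the `run-untrusted` command name are all assumptions for illustration and should not be read as Sandlock's actual interface.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    AUDIT = "audit"


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape: syscall name plus argv for execve.
    syscall: str
    argv: tuple = ()


class LivePolicy:
    """Mutable policy state that can only be tightened, never loosened."""

    def __init__(self):
        self.network_allowed = True

    def revoke_network(self):
        self.network_allowed = False  # no method re-grants access


def policy_fn(event, policy):
    """Observe a syscall event, optionally tighten policy, return a verdict."""
    if event.syscall == "execve":
        # argv is used only as a signal for a phase transition.
        if event.argv and event.argv[0] == "run-untrusted":
            policy.revoke_network()
        return Verdict.AUDIT
    if event.syscall == "connect":
        return Verdict.ALLOW if policy.network_allowed else Verdict.DENY
    return Verdict.ALLOW
```

In this sketch a connect before the phase transition is allowed and the same connect afterwards is denied, which mirrors the "tightening applies to subsequent execution" behavior described above.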

Copy-on-Write Workspaces and Stage Composition

Sandlock delivers two unprivileged COW backends: a seccomp-intercepted layer (userspace) redirecting writes and merging reads, and BranchFS (dedicated COW filesystem), both supporting commit/discard/dry-run operations without mount namespaces or root. The COW-fork primitive allows efficient map-reduce patterns, sharing initialized pages and sustaining ~1900 forks/s with supervisor-mediated safe registration.
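The commit/discard/dry-run semantics can be modeled with an in-memory overlay. This is a conceptual sketch only: `CowWorkspace` is a hypothetical class that uses a dict in place of a real directory tree, and it does not reflect how either the seccomp-intercepted layer or BranchFS is implemented.

```python
class CowWorkspace:
    """In-memory model of a copy-on-write layer over a base tree."""

    _DELETED = object()  # marker recording a pending deletion

    def __init__(self, base):
        self.base = base   # dict: path -> bytes, stands in for the real tree
        self.upper = {}    # pending writes and deletions

    def read(self, path):
        """Merged read: the upper layer shadows the base."""
        if path in self.upper:
            value = self.upper[path]
            if value is self._DELETED:
                raise FileNotFoundError(path)
            return value
        if path in self.base:
            return self.base[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.upper[path] = data  # redirected; base is untouched

    def unlink(self, path):
        self.read(path)  # raises FileNotFoundError if not visible
        self.upper[path] = self._DELETED

    def dry_run(self):
        """List pending effects as (operation, path) without applying them."""
        return sorted(
            ("delete" if value is self._DELETED else "write", path)
            for path, value in self.upper.items()
        )

    def commit(self):
        """Apply pending effects to the base, then clear the layer."""
        for path, value in self.upper.items():
            if value is self._DELETED:
                self.base.pop(path, None)
            else:
                self.base[path] = value
        self.upper.clear()

    def discard(self):
        """Drop pending effects; the base was never modified."""
        self.upper.clear()
```

Keeping deletions as explicit markers in the upper layer is what lets reads hide a base file before commit and lets discard restore it for free.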

Pipeline composition connects stages via UNIX pipes, enforcing per-stage confinement as kernel policy boundaries. This enables dual-LLM/CaMeL capability separation, where the stage exposed to untrusted content is denied network access, and vice versa. However, data provenance and sound decomposition remain the author's responsibility.
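One way to reason about per-stage separation is to check a pipeline description before running it. The sketch below flags stages that combine capabilities the text says to keep apart; `Stage`, its fields, and `separation_violations` are hypothetical, and the check is a planning aid that does not enforce anything by itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    # Hypothetical per-stage capability description.
    name: str
    reads_private_data: bool = False
    has_network: bool = False
    sees_untrusted_content: bool = False


def separation_violations(stages):
    """Return names of stages that pair network with data or untrusted input."""
    bad = []
    for stage in stages:
        mixes_data = stage.reads_private_data and stage.has_network
        mixes_untrusted = stage.sees_untrusted_content and stage.has_network
        if mixes_data or mixes_untrusted:
            bad.append(stage.name)
    return bad


pipeline = [
    Stage("summarize", reads_private_data=True, sees_untrusted_content=True),
    Stage("upload", has_network=True),
]
violations = separation_violations(pipeline)
```

For the two-stage pipeline above, `violations` is empty. As the text notes, a check like this says nothing about what flows through the pipe between stages, so data provenance still has to be handled by whoever designs the decomposition.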

Evaluation

Strong quantitative results: Sandlock achieves a startup overhead of ~5 ms (44× faster than Docker) and matches bare-metal Redis throughput (medians: 75.2k vs. 75.5k rps, p99 latency: 0.51 vs. 0.49 ms), while Docker suffers significant overhead (only ~76% throughput, 3× p99 tail latency relative to bare metal). The COW-fork primitive delivers ~1900 forks/s, with overhead dominated by policy-fn-safe child registration. Supervisor-mediated network adds ~35 μs round-trip cost for 256 B requests, immaterial for agentic API call workloads. Effectiveness checks confirm strict denial of out-of-scope reads/writes, enforce network and data separation per pipeline stage, and reliably gate process and memory caps.
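The reported figures imply a few derived numbers, worked out below. These are back-of-the-envelope values computed from the rounded figures quoted above, assuming the 44× factor applies to the ~5 ms startup overhead and the Docker percentages are relative to the bare-metal medians; they are not measurements from the paper.

```python
# Figures quoted in the text (approximate where the text says "~").
sandlock_startup_ms = 5.0
docker_slowdown = 44
sandlock_rps, bare_rps = 75_200, 75_500
sandlock_p99_ms, bare_p99_ms = 0.51, 0.49
forks_per_second = 1900

# Implied Docker startup cost if it is 44x the Sandlock figure: about 220 ms.
implied_docker_startup_ms = sandlock_startup_ms * docker_slowdown

# Sandlock throughput relative to bare metal: about 99.6%.
throughput_ratio = sandlock_rps / bare_rps

# Sandlock p99 latency relative to bare metal: about 4% higher.
p99_ratio = sandlock_p99_ms / bare_p99_ms

# Implied Docker figures from "~76% throughput" and "3x p99".
implied_docker_rps = 0.76 * bare_rps        # about 57.4k rps
implied_docker_p99_ms = 3 * bare_p99_ms     # about 1.47 ms

# Average time per COW fork at ~1900 forks/s: about 0.53 ms.
ms_per_fork = 1000 / forks_per_second
```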

(Figure 1)

Figure 1: Preliminary benchmark results. Sandlock preserves low startup latency and near-bare-metal Redis throughput while avoiding Docker's per-invocation and tail-latency costs on this workstation.

Practical and Theoretical Implications

Sandlock demonstrates the feasibility of unprivileged, programmable agentic confinement with near-native performance, supporting robust capability separation, filesystem effect reversibility, and HTTP-level network controls. The split enforcement model can serve as a substrate for prompt injection countermeasures, structural agent decomposition, and speculative agent rollbacks. The architecture suggests that strong security guarantees are achievable without privilege by maximizing kernel-enforceable static policy and restricting supervisor mediation to runtime-dependent actions. The pipeline-and-COW composition primitives pave the way for multi-stage, multi-branch agentic workloads with enforced provenance boundaries.

Future developments may include integration with agent gateways (e.g., Agentry), buffered output gating for HTTP effects, richer multi-branch pipelines for agentic exploration/rollback, and broader compatibility studies. However, kernel-level vulnerabilities, side channels, and resource exhaustion by adversarial tenants remain out of Sandlock's scope; in such cases, microVMs or cgroup-driven isolation should be preferred.

Conclusion

Sandlock provides an unprivileged Linux sandbox tailored to agentic code confinement, emphasizing a split enforcement model, programmable runtime policies, copy-on-write workspaces, pipeline composition, and minimal performance overhead. The architecture advances process-level capability separation for AI agent workloads, offering a substrate for structural prompt injection resistance and efficient speculative exploration. Sandlock is open source and suitable for workstation-level retrofitting in both wrap-the-child and self-confinement modes, signifying a substantial advance in agent-centric operating system security (2605.26298).
