Sandlock: Lightweight Linux Sandbox
- Sandlock is a lightweight unprivileged Linux sandbox designed to confine short-lived, untrusted AI-generated code workloads using a novel split-enforcement architecture.
- It enforces static policies via kernel mechanisms (Landlock and seccomp-bpf) and dynamic decisions through a minimal user-mode supervisor to achieve sub-10ms startup.
- Sandlock offers reversible filesystem effects, fine-grained syscall control, and dynamic network policies to mitigate supply chain exploits in developer environments.
Sandlock is a lightweight, unprivileged Linux sandbox designed to confine the bursty, short-lived, and often unvetted workloads spawned by AI agents, notably those generating, fetching, or executing code without direct human review. Sandlock addresses the security and workflow challenges present in running untrusted code on developer workstations—where traditional containers and microVMs impose prohibitive startup latency and privilege requirements, while ad-hoc wrappers fail to provide fine-grained, syscall-level isolation. Sandlock features a split-enforcement architecture: input-independent policy is compiled into kernel-level enforcement (using Landlock and seccomp-bpf), while runtime-dependent decisions are delegated to a minimal user-mode supervisor. This combination enables sub-10ms startup, fine-grained access control, reversible filesystem effects, and dynamic network controls, all without root, cgroups, or privileged namespaces (Wang et al., 25 May 2026).
1. Problem Space and Motivation
The proliferation of AI assistants and autonomous agents in developer workflows introduces the risk of executing untrusted code fragments, package scripts, or plugins—potential vectors for supply chain exploits (e.g., destructive shell commands or covert exfiltration payloads). Existing isolation primitives fall into two principal categories:
- Heavyweight virtual boundaries: Containers, microVMs, and user-mode kernels provide robust isolation but at the cost of complex image management, elevated privileges, and startup latencies on the order of hundreds of milliseconds. This latency is disruptive when handling high-frequency, short-lived agent invocations.
- Ad-hoc process wrappers: Tools like chroot, ulimit, firejail, and bubblewrap are fast and unprivileged but lack fine-grained syscall control, have inconsistent filesystem semantics, and cannot natively express granular network or HTTP-level access policies.
The core requirements are rapid startup (single-digit milliseconds), least-privilege enforcement at syscall and filesystem levels, network policy at IP/port and HTTP route granularity, runtime-tightenable policies, and reversible filesystem transactions—all within user space (Wang et al., 25 May 2026).
2. Split-Enforcement Architecture
Sandlock organizes confinement via a split between static and dynamic policy enforcement:
- Static, input-independent policy: All access control decisions that can be computed a priori (from command-line or configuration) are compiled into kernel-enforced rules. This includes filesystem access, TCP port controls, and unconditional syscall deny lists.
- Dynamic, runtime-dependent policy: Syscalls requiring context-specific validation (e.g., based on resolved network addresses, HTTP methods, execve arguments) are intercepted and handed off to a user-mode supervisor for decision.
Confinement pipeline after fork(2):
- The child process moves into its own process group (sets its PGID) and optionally changes its working directory (chdir).
- The `PR_SET_NO_NEW_PRIVS` flag is set to block execve from granting new privileges.
- Landlock rules are installed to enforce filesystem (`fs_readable`, `fs_writable`, `fs_denied`), TCP-port, and IPC restrictions.
- A seccomp-bpf filter blocks dangerous syscalls, dynamically manages static lists, and redirects dynamic syscall classes (e.g., connect, openat, execve, send, bind) to seccomp user-notification.
- An asynchronous supervisor dequeues and processes these notifications, executing allow/deny/continue callbacks.
- The child synchronizes with the supervisor before executing the target binary.
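The ordering of the pipeline above can be sketched in a few lines of Python. This is a minimal, Linux-only illustration and not Sandlock's implementation: the function name `spawn_confined` and the pipe-based "go" signal are assumptions made for the example, and the Landlock and seccomp-bpf steps are left as comments because the Python standard library has no bindings for them.

```python
import ctypes
import os
import sys

PR_SET_NO_NEW_PRIVS = 38  # value from <linux/prctl.h>


def spawn_confined(argv, cwd=None):
    """Fork, run the pre-exec steps in order, exec argv, return its exit code."""
    go_r, go_w = os.pipe()  # hypothetical supervisor -> child "go" channel
    pid = os.fork()
    if pid == 0:  # child side
        try:
            os.close(go_w)
            os.setpgid(0, 0)          # own process group
            if cwd is not None:
                os.chdir(cwd)         # optional chdir
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
                os._exit(126)         # refuse to run without no_new_privs
            # Landlock rules and the seccomp-bpf filter would be installed
            # here; both are omitted in this sketch.
            os.read(go_r, 1)          # wait until the supervisor is ready
            os.execvp(argv[0], argv)  # replace the child with the target
        finally:
            os._exit(127)             # only reached if setup or exec failed
    # parent side: a real supervisor would start its notification loop here
    os.close(go_r)
    os.write(go_w, b"1")
    os.close(go_w)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print("exit code:", spawn_confined([sys.executable, "-c", "pass"]))
```

Setting `PR_SET_NO_NEW_PRIVS` before installing any filter matters because an unprivileged process is only allowed to load a seccomp filter once that flag is set.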
Formally, denoting Σ as the universe of syscalls, the policy is a three-way partition of Σ:
- Allow set: Allowed outright by the kernel.
- Deny set: Denied outright (EACCES or EPERM) by the kernel.
- Dynamic set: Routed to the supervisor for a decision.
Landlock, seccomp-bpf, and seccomp user notification are used in tandem to realize this partition (Wang et al., 25 May 2026).
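As a rough illustration of the three-way partition, the sketch below classifies a syscall name into one of the three sets. The deny list is hypothetical (the source does not enumerate it); the dynamic set reuses the syscall classes named earlier in the text. The names `Verdict` and `classify` are assumptions for the example.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"    # handled entirely in the kernel
    DENY = "deny"      # rejected in the kernel with an errno
    NOTIFY = "notify"  # forwarded to the user-mode supervisor


# Hypothetical deny list, for illustration only.
DENY_SET = frozenset({"mount", "kexec_load", "init_module"})
# Dynamic classes named in the text above.
NOTIFY_SET = frozenset({"connect", "openat", "execve", "send", "bind"})

# A partition requires the sets to be disjoint; everything else is allowed.
assert DENY_SET.isdisjoint(NOTIFY_SET)


def classify(syscall: str) -> Verdict:
    """Return which of the three sets a syscall name falls into."""
    if syscall in DENY_SET:
        return Verdict.DENY
    if syscall in NOTIFY_SET:
        return Verdict.NOTIFY
    return Verdict.ALLOW
```

Because the allow set is defined as the complement of the other two, every syscall lands in exactly one set.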
3. Policy Language and Formalization
Sandlock presents a compact policy language spanning five domains: filesystem, network, IPC, resources, and runtime callbacks. Its grammar is formally defined in BNF-style notation.
Policy compilation proceeds as follows:
- Filesystem rules (FsRules) → Landlock LSM rules (ABI 6+)
- Network rules (NetRules) on ports → Landlock TCP-port rules
- Unconditional syscall denials → seccomp-bpf deny list
- Dynamic set (syscalls routed to the supervisor) → seccomp user-notification filter
This model enables concise, machine-enforceable, and statically analyzable policy specification (Wang et al., 25 May 2026).
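The compilation mapping above can be pictured as a function from one policy object to four backend rule sets. The sketch below is an assumption-laden model, not Sandlock's compiler: `fs_readable`, `fs_writable`, and `fs_denied` come from the text, while `tcp_ports`, `deny_syscalls`, `notify_syscalls`, and the output keys are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Policy:
    fs_readable: List[str] = field(default_factory=list)
    fs_writable: List[str] = field(default_factory=list)
    fs_denied: List[str] = field(default_factory=list)
    tcp_ports: List[int] = field(default_factory=list)        # hypothetical
    deny_syscalls: List[str] = field(default_factory=list)    # hypothetical
    notify_syscalls: List[str] = field(default_factory=list)  # hypothetical


def compile_policy(p: Policy) -> Dict[str, list]:
    """Split one policy into the four enforcement backends."""
    overlap = set(p.deny_syscalls) & set(p.notify_syscalls)
    if overlap:
        raise ValueError(f"syscalls both denied and dynamic: {sorted(overlap)}")
    return {
        # filesystem rules -> Landlock
        "landlock_fs": (
            [("read", path) for path in p.fs_readable]
            + [("write", path) for path in p.fs_writable]
            + [("deny", path) for path in p.fs_denied]
        ),
        # port rules -> Landlock TCP-port rules
        "landlock_tcp": sorted(set(p.tcp_ports)),
        # unconditional denials -> seccomp-bpf deny list
        "seccomp_deny": sorted(set(p.deny_syscalls)),
        # dynamic set -> seccomp user-notification filter
        "seccomp_notify": sorted(set(p.notify_syscalls)),
    }
```

Rejecting overlapping deny and dynamic lists at compile time is one way to keep the resulting policy statically analyzable.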
4. Enforcement Mechanisms
4.1 Filesystem, Network, IPC, and Syscall Controls
- Filesystem: Landlock restricts readable/writable directories. The seccomp COW backend redirects writes into an "upper" supervisor-managed directory, merging reads as necessary. BranchFS may be used for in-filesystem copy-on-write via ioctl.
- Network: Landlock manages direct TCP port access; all non-TCP or more specific network policies defer to the supervisor, which validates requests against a pinned DNS table to prevent DNS rebinding. The supervisor acts as a proxy for permitted connections.
- IPC: Landlock ABI 6 facilitates abstract socket and signal isolation.
- Syscall-level: Seccomp-bpf enforces unconditional deny policies on dangerous or otherwise forbidden syscalls.
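The DNS pinning check described in the network item can be sketched as follows. This is a simplified model under stated assumptions: the class name `PinnedDns`, its constructor arguments, and the example host and addresses (taken from documentation-reserved ranges) are hypothetical, and the real supervisor may store and match pins differently.

```python
import ipaddress
from typing import Dict, Iterable, Set, Tuple


class PinnedDns:
    """Allow a connect only if the target IP was pinned for an allowed host."""

    def __init__(self, pins: Dict[str, Iterable[str]],
                 allowed: Set[Tuple[str, int]]):
        # Parse addresses once so textual variants compare equal.
        self._pins = {
            host: {ipaddress.ip_address(ip) for ip in ips}
            for host, ips in pins.items()
        }
        self._allowed = allowed  # (host, port) pairs from the policy

    def allows(self, ip: str, port: int) -> bool:
        addr = ipaddress.ip_address(ip)
        return any(
            p == port and addr in self._pins.get(host, set())
            for host, p in self._allowed
        )


dns = PinnedDns(
    pins={"api.example.com": ["203.0.113.10", "2001:db8::1"]},
    allowed={("api.example.com", 443)},
)
```

Because decisions are made on the pinned addresses rather than on a fresh lookup, a later DNS answer pointing the same name at a different address does not widen access.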
4.2 Dynamic Network and HTTP Policies
Sandlock supports endpoint-level rules over HTTP by refining access control lists to (method, host, path) triples. HTTP requests are proxied through a local supervisor, which parses and enforces these allowlists. HTTPS inspection is possible via installation of a sandbox CA, or else restricted to pinned endpoints (Wang et al., 25 May 2026).
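A (method, host, path) allowlist of this kind can be modeled in a few lines. The sketch below assumes shell-style glob matching on the path via Python's `fnmatch`; whether Sandlock uses globs, prefixes, or another matcher is not stated in the source, and the rules and host name are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Tuple

Rule = Tuple[str, str, str]  # (method, host, path pattern)

# Hypothetical allowlist for illustration.
RULES = [
    ("GET", "api.example.com", "/v1/models/*"),
    ("POST", "api.example.com", "/v1/chat"),
]


def http_allowed(method: str, host: str, path: str,
                 rules: Iterable[Rule] = RULES) -> bool:
    """Return True if some rule matches the request; default is deny."""
    method, host = method.upper(), host.lower()
    return any(
        m == method and h == host and fnmatchcase(path, pattern)
        for m, h, pattern in rules
    )
```

Note that `fnmatch`'s `*` also matches `/`, so `/v1/models/*` covers nested paths; a stricter matcher would be needed if that is undesirable.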
4.3 TOCTOU-Safe Supervisor Handling
The user-mode supervisor, implemented as a minimal async task (Tokio in Rust), ensures race-free syscall handling:
- Filesystem write syscalls are intercepted for COW.
- Network connect/send syscalls can be conditionally allowed or denied based on runtime state.
- execve calls are held in a TOCTOU-safe state: the supervisor seizes all sibling threads (via ptrace seize), reads execve arguments, potentially tightens policy, and then permits continuation. If ptrace seize fails, Sandlock denies execution to maintain integrity.
This ensures that privileged operations are never permitted without comprehensive, race-free inspection (Wang et al., 25 May 2026).
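The fail-closed ordering of the execve path can be illustrated with a small model in which the kernel-facing operations are injected as callables. Everything here is hypothetical scaffolding: `seize`, `read_argv`, and `decide` stand in for the ptrace seize, argument read, and policy decision steps, and the string results are placeholders for the supervisor's continue and deny responses.

```python
from typing import Callable, Iterable, List


def handle_execve(thread_ids: Iterable[int],
                  seize: Callable[[int], bool],
                  read_argv: Callable[[], List[str]],
                  decide: Callable[[List[str]], bool]) -> str:
    """Seize every sibling thread first; deny if any seize fails."""
    for tid in thread_ids:
        if not seize(tid):
            return "deny"      # fail closed: never inspect unfrozen state
    argv = read_argv()         # safe to read: no sibling can rewrite it now
    return "continue" if decide(argv) else "deny"
```

Reading the arguments only after all siblings are stopped is what closes the time-of-check-to-time-of-use window: no other thread can modify the argument memory between inspection and continuation.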
4.4 Reversible Filesystem Effects
Each execution stage in Sandlock declares one of {COMMIT, ABORT, KEEP} for handling the effects of filesystem writes:
- COMMIT: Merge write-capture ("upper") layer into the underlying ("lower") filesystem.
- ABORT: Discard the "upper" layer; no writes are persisted.
- KEEP: Persist the "upper" alongside the "lower" for host diagnosis or replay.
Let $L$ denote the current on-disk tree, $U$ the write-capture layer, and $L'$ the new state post-action:
- COMMIT: $L' = \mathrm{merge}(L, U)$
- ABORT: $L' = L$
- KEEP: $U$ is retained adjacent to $L$
This transactional model operates without mount namespaces or privileged mounts, supporting dry-runs and atomic commit/rollback (Wang et al., 25 May 2026).
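The three outcomes can be mimicked with ordinary directories. The sketch below is a simplified model, assuming the write-capture layer is a plain directory tree; it does not handle deletions recorded in the upper layer, is not atomic, and is not how Sandlock's COW backend or BranchFS is implemented. The function name `finish_stage` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def finish_stage(lower: Path, upper: Path, action: str) -> None:
    """Apply COMMIT, ABORT, or KEEP to a captured write layer."""
    if action == "COMMIT":
        # Merge captured writes over the on-disk tree, then drop the layer.
        shutil.copytree(upper, lower, dirs_exist_ok=True)
        shutil.rmtree(upper)
    elif action == "ABORT":
        shutil.rmtree(upper)   # discard all captured writes
    elif action == "KEEP":
        pass                   # leave the layer next to the tree
    else:
        raise ValueError(f"unknown action: {action}")


with tempfile.TemporaryDirectory() as tmp:
    lower, upper = Path(tmp, "lower"), Path(tmp, "upper")
    lower.mkdir()
    upper.mkdir()
    (lower / "a.txt").write_text("old")
    (upper / "a.txt").write_text("new")
    finish_stage(lower, upper, "COMMIT")
    committed = (lower / "a.txt").read_text()
    upper_gone = not upper.exists()
```

Running the same setup with `"ABORT"` leaves `a.txt` unchanged, which is what makes dry runs possible.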
5. Performance Evaluation
Performance characterization was conducted on a Pop!_OS 24.04 system (Ryzen 5 5500U, NVMe SSD, Linux 6.18):
| Metric | Bare Metal | Sandlock | Docker (rootful) |
|---|---|---|---|
| Startup latency (ms) | 1 | 6 | ≈ 300 |
| Redis GET throughput (krps) | 73.9 | 74.2 | ≈ 56 |
| Redis throughput ratio (relative to native) | — | 1.004 | ≈ 0.76 |
| Redis p99 latency (rel. to native) | 1× | 1× | ≈ 3× |
| COW-fork rate (forks/s) | — | 1900 | — |
| On-behalf net roundtrip (μs) | 20 | 55 | — |
Sandlock incurs a median startup overhead of 5 ms relative to bare metal (6 ms versus 1 ms) and maintains bare-metal throughput for Redis. On-behalf supervisor mediation adds only ≈35 μs per roundtrip (55 μs versus 20 μs), which is negligible for API invocation workloads. By contrast, Docker introduces ≈300 ms startup latency and reduced throughput for comparable workloads (Wang et al., 25 May 2026).
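The derived figures follow directly from the table. The short check below recomputes them from the reported values; the variable names are chosen for this example, and the Docker inputs are the approximate values given in the table.

```python
# Values copied from the table above.
bare_start_ms, sandlock_start_ms = 1, 6
bare_krps, sandlock_krps, docker_krps = 73.9, 74.2, 56  # Docker is approximate
bare_rtt_us, sandlock_rtt_us = 20, 55

startup_overhead_ms = sandlock_start_ms - bare_start_ms   # 5 ms
sandlock_ratio = round(sandlock_krps / bare_krps, 3)      # 1.004
docker_ratio = round(docker_krps / bare_krps, 2)          # 0.76
mediation_cost_us = sandlock_rtt_us - bare_rtt_us         # 35 us
```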
6. Limitations and Future Directions
Sandlock operates entirely in the unprivileged Linux sandbox space and thus cannot defend against kernel exploits, side channels, or system-wide resource exhaustion—areas where microVMs and cgroups offer superior guarantees. Cooperative resource capping is achieved via user-land trapping and signal-based preemption but may be circumvented by malicious workloads.
Identified avenues for future work include:
- Outbound-HTTP buffering to allow coordinated rollback of network and filesystem effects (inspired by Speculator/TxOS).
- Deeper pipeline composition leveraging BranchFS for transactional multi-stage sandboxes.
- Integration with Agentry-style gateways to enable semantic, API-level access controls (Wang et al., 25 May 2026).
The architecture is deliberately engineered for use cases in which speed, unprivileged operation, and flexible, fine-grained policy enforcement outweigh absolute system containment. The intentional split between static kernel policy and a minimal TOCTOU-hardened supervisor yields an isolation primitive particularly well-suited to modern, agent-driven developer environments. Sandlock is open-source and available at https://github.com/multikernel/sandlock (Wang et al., 25 May 2026).