Contextual Memory Virtualisation (CMV)

Updated 4 July 2026

Contextual Memory Virtualisation (CMV) is a design paradigm that customizes memory operations based on explicit contextual cues to enforce invariants like deterministic latency and isolation.
It is applied across varied systems including real-time RISC-V virtualization, hypervisor-level segmentation, metadata-augmented memory traffic, and LLM session state management.
CMV achieves predictable access behavior and fine-grained sharing by dynamically binding memory views to system contexts while managing metadata overhead and preserving performance guarantees.

Contextual Memory Virtualisation (CMV) denotes a family of mechanisms that make memory behavior depend explicitly on context rather than treating memory as a uniform, context-blind substrate. In the current literature, the term and closely aligned formulations span deterministic address redirection for virtual machines in real-time RISC-V systems (Walluszik et al., 6 Apr 2025), hardware mechanisms that restore program context to memory-visible traffic (Roberts, 21 Aug 2025), hypervisor-level segmentation driven by datacenter allocation context (Teabe et al., 2020), capability-scoped isolation and sharing inside a single virtual address space (Sartakov et al., 2022), DAG-based state management and structurally lossless trimming for long-lived LLM sessions (Santoni, 25 Feb 2026), and policy-governed contextual memory fabrics for organizational reasoning and auditability (Wedel, 28 May 2025). Across these lines of work, CMV is not a single implementation but a design principle: memory layout, visibility, authority, persistence, or regeneration is specialized to the active context while preserving an explicit invariant such as deterministic timing, byte-granular isolation, schema correctness, or auditability.

1. Conceptual scope and recurring invariants

The literature uses CMV to solve different problems, but the recurring structure is stable. A context is first made explicit, then memory behavior is bound to that context, and finally a system-level invariant is enforced. In real-time hypervisors, the relevant context is the currently scheduled VM or privilege mode; in memory fabrics, it is the current region of interest, object, or function; in capability systems, it is the currently active compartment and its authority set; in LLM agents, it is the branch-specific conversational state; and in organizational systems, it is the decision trace, rationale, and lineage (Walluszik et al., 6 Apr 2025).

Domain	Mechanism	Preserved invariant
Real-time virtualization	Region-based GPA-to-PA redirection with hPMP offsets	Deterministic latency
Datacenter VM memory	DS-n segmentation instead of hypervisor paging	Near-native virtualization cost
Memory-side telemetry	Metadata packets in the read-address stream	Nondestructive observability
Capability systems	Per-context capability sets in one virtual address space	Fine-grained isolation and sharing
LLM agents	Snapshot/branch/trim over a context DAG	Structurally lossless reuse
Organizational memory	Insight Layer with capture, drift detection, regeneration	Traceability and governance

This breadth corrects a common misconception, namely that CMV is limited to virtual address translation. Some work does use the term in an address-space sense, but other work applies the same principle to observability, isolation, versioned conversational state, and audited institutional memory. A plausible implication is that CMV is best understood as a contextual control plane for memory, not as a synonym for paging or segmentation alone.

2. Deterministic spatial virtualization in processors and hypervisors

A hardware-oriented form of CMV appears in RISC-V real-time virtualization through hPMP-based address redirection. In that design, each VM executes with virtualization enabled and uses guest physical addresses; hPMP entries match GPA ranges and, on a hit, compute a physical address through a fixed per-region offset while enforcing R/W/X permissions (Walluszik et al., 6 Apr 2025). The core formalization is region-grained:

$R_k(v) := v \in [V_k^{low}, V_k^{high})$

$p(v) := v + O_k$

$allow(v, op) \Leftrightarrow (\exists k: R_k(v) \land perms_k(op))$

If $allow$ holds and $V=1$, the access uses $p(v)$; if $V=0$, offsets are ignored and addresses remain physical. The mechanism is implemented with PMP-like comparators plus a single adder, without page tables, TLBs, or access-history-dependent caches, and therefore eliminates page faults and preserves a constant-time path (Walluszik et al., 6 Apr 2025). The timing bound is expressed as:

$T_{access} \le T_{base} + T_{match} + T_{add}$

This design realizes context-specific virtual layouts because the hypervisor selects which hPMP regions are active for the scheduled VM via hpmpswitch and S bits, and can relocate a VM image by programming hpmpoffsetx without rebuilding the binary. The consequence is MPU-like determinism with virtualization semantics.
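The region match, offset addition, and permission check formalized above can be illustrated with a minimal Python sketch. The `Region` type, the `translate` helper, and the example address ranges are hypothetical names and values chosen for illustration; they model only the existential form of $allow$ given in the text, not the actual hPMP register layout or entry-priority rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """One hPMP-style entry (illustrative model, not the real register format)."""
    low: int      # V_k^low, inclusive
    high: int     # V_k^high, exclusive
    offset: int   # O_k, added to the guest physical address on a hit
    perms: str    # subset of "rwx"


def translate(regions: list[Region], v: int, op: str, virt_enabled: bool) -> Optional[int]:
    """Return the physical address for access `op` at `v`, or None if denied."""
    for r in regions:
        # allow(v, op) holds if some region contains v and grants op.
        if r.low <= v < r.high and op in r.perms:
            # V=1: redirect through the fixed offset; V=0: offset is ignored.
            return v + r.offset if virt_enabled else v
    return None


# Hypothetical layout for one VM: a code region and a data region.
vm_a = [
    Region(0x8000_0000, 0x8010_0000, 0x0100_0000, "rx"),
    Region(0x8010_0000, 0x8020_0000, 0x0100_0000, "rw"),
]

fetch = translate(vm_a, 0x8000_0040, "x", virt_enabled=True)
denied = translate(vm_a, 0x8000_0040, "w", virt_enabled=True)
```

In this model, relocating a VM image amounts to changing `offset` in its regions, which mirrors the claim that relocation needs no rebuild of the binary. The loop runs over a fixed number of entries, so the work per access does not depend on access history.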

A closely related but distinct line appears in CVA6-VMRT, which targets time-predictable virtual memory in a 64-bit application-class RISC-V processor. Here CMV is implemented by software-controlled partitioning and locking of fully associative I/D TLBs and by runtime partitioning of L1 instruction and data caches into cache and scratchpad regions (Reinwardt et al., 8 Apr 2025). For protected translations, the latency reduces to one cycle; for protected code and data placed in SPM, access time becomes constant. The paper formalizes address-translation latency for context $i$ as:

$L_{AT,i} = \begin{cases} 1 & \text{if VPN/ASID or VPN/VMID matches a locked entry and its partition is enabled} \\ L_{miss} & \text{otherwise} \end{cases}$

and for protected regions yields a constant memory-access bound (Reinwardt et al., 8 Apr 2025). Under virtualization, the mechanism reduces execution-time variability for a real application benchmark by 94% during interference from a non-critical guest, with a 3.7% whole-core area overhead and no frequency penalty (Reinwardt et al., 8 Apr 2025).
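The two-case translation latency can be sketched as a small Python model. The class name `PartitionedTLB`, the single `enabled` flag, and the miss latency of 40 cycles are assumptions made for illustration; the real design exposes its partitioning through software-controlled hardware state that is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedTLB:
    """Toy model of a TLB with a lockable, software-enabled partition."""
    miss_latency: int                         # L_miss in cycles (assumed value)
    enabled: bool = True                      # whether the locked partition is active
    locked: set = field(default_factory=set)  # (vpn, context_id) pairs

    def lock(self, vpn: int, context_id: int) -> None:
        """Pin a translation for a given ASID/VMID-style context identifier."""
        self.locked.add((vpn, context_id))

    def translation_latency(self, vpn: int, context_id: int) -> int:
        """One cycle for a locked entry in an enabled partition, else L_miss."""
        if self.enabled and (vpn, context_id) in self.locked:
            return 1
        return self.miss_latency


tlb = PartitionedTLB(miss_latency=40)  # 40 cycles is a placeholder, not a measured figure
tlb.lock(vpn=0x80, context_id=1)
critical = tlb.translation_latency(0x80, 1)
other = tlb.translation_latency(0x81, 1)
```

The model deliberately charges every unprotected access the miss latency, matching the worst-case reading of the formula rather than average-case TLB behavior.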

In cloud hypervisors, CMV appears again in Compromis, which replaces hypervisor-level paging with direct segments while leaving guest OS paging intact. The argument is empirical: VM memory allocation in virtualized datacenters is coarse-grained, infrequent, and concentrated into few sizes, so the fragmentation assumptions that justify paging at process granularity do not hold at VM granularity (Teabe et al., 2020). The DS-n mapping computes

$\mathit{HPA} = \mathit{GPA} + \mathit{Offset}_i \quad \text{for the segment } i \text{ that contains } \mathit{GPA}$

after a conventional guest 1D page-table walk, thereby eliminating the nested EPT walk entirely (Teabe et al., 2020). The paper reports that a single value of the segment count $n$ covers 99.99% of VMs and that a fixed $n$ suffices for 100% of VMs across evaluated traces; mean overhead is approximately 0.35% with 0.42 standard deviation, and VM startup latency is reduced by up to 80% (Teabe et al., 2020). In this setting, CMV is the context-driven choice of segmentation rather than paging for the GPA-to-HPA mapping.
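The composition of guest paging with host-side segments can be sketched in Python as follows. The base/limit/offset triple for each segment, the flat dictionary used as a guest page table, and all addresses are assumptions for illustration in the style of direct-segment designs; they are not taken from the Compromis hardware description.

```python
PAGE_SIZE = 4096


def guest_walk(page_table: dict[int, int], gva: int) -> int:
    """Guest-side paging: guest virtual address to guest physical address.

    A missing entry raises KeyError, which stands in for a guest page fault.
    """
    gpn = page_table[gva // PAGE_SIZE]
    return gpn * PAGE_SIZE + gva % PAGE_SIZE


def segment_translate(segments: list[tuple[int, int, int]], gpa: int) -> int:
    """Host-side step: add the offset of the segment that contains `gpa`.

    Each segment is a hypothetical (base, limit, offset) triple.
    """
    for base, limit, offset in segments:
        if base <= gpa < limit:
            return gpa + offset
    raise ValueError(f"GPA {gpa:#x} is outside every segment of this VM")


# Hypothetical VM backed by two host segments.
vm_segments = [(0x0000, 0x4000, 0x4000_0000), (0x4000, 0x8000, 0x9000_0000)]
guest_pt = {0x10: 0x2, 0x11: 0x5}

gpa = guest_walk(guest_pt, 0x10123)
hpa = segment_translate(vm_segments, gpa)
```

The host-side step is a bounded comparison and one addition per segment, so no second-dimension page walk appears anywhere in the path.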

Taken together, these works define a strong hardware and systems interpretation of CMV: context selects the active translation or protection regime, but the mechanism remains analyzable and bounded. The main trade-off is flexibility. hPMP is region-based rather than page-based; CVA6-VMRT protects only the regions that fit into locked TLB entries and SPM ways; Compromis depends on contiguous host segments and scheduler cooperation.

3. Restoring context to memory-visible traffic

A second branch of CMV does not virtualize addresses so much as it virtualizes memory policy space by restoring software intent to the memory side. In this formulation, the problem is that requests arriving at main memory are stripped of programmer-visible context; memory devices see commands, addresses, and data, but not functions, objects, priorities, or phases (Roberts, 21 Aug 2025). The proposed mechanism encodes metadata nondestructively into the read-address stream by reserving a mailbox window inside the physical address space and placing packet payload bits into the address bits above the cache-line offset.

Messages consist of two data packets followed by a checksum packet; CRC validation identifies both membership and ordering. The decoder uses a preamble-discovery phase and then a sliding 8-request window that tests all three-packet permutations via CRC (Roberts, 21 Aug 2025). For CRC-16, the model gives a false-positive estimate per window, while the prototype reports zero packet loss and zero false positives with 16-bit packets and CRC-16 in both simulation and hardware traces (Roberts, 21 Aug 2025). The mechanism requires no privileged driver, uses standard read requests to memory already allocated to the process, and overlays the mailbox onto existing application data ranges without modifying contents.
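The packet encoding and the permutation-testing decoder can be sketched in Python. Everything concrete here is an assumption: the mailbox base address, the 64-byte line size, the CRC-CCITT polynomial (via the standard library's `binascii.crc_hqx`), and the omission of the preamble-discovery phase. The final estimate is an idealized back-of-envelope figure that treats each wrong triple as passing a 16-bit CRC with probability $2^{-16}$; it is not the figure reported in the source.

```python
import binascii
import math
from itertools import permutations
from typing import Optional

LINE_BITS = 6                      # assumed 64-byte cache lines
PAYLOAD_BITS = 16                  # 16-bit packets
CRC_BITS = 16
WINDOW = 8                         # sliding window of read requests
MAILBOX_BASE = 0x4000_0000         # hypothetical, aligned to the window size
MAILBOX_SIZE = 1 << (LINE_BITS + PAYLOAD_BITS)


def crc16(d0: int, d1: int) -> int:
    """CRC-CCITT over the two data packets (one possible CRC-16 choice)."""
    return binascii.crc_hqx(d0.to_bytes(2, "big") + d1.to_bytes(2, "big"), 0xFFFF)


def encode(d0: int, d1: int) -> list[int]:
    """Return three read addresses: two data packets, then the checksum packet."""
    return [MAILBOX_BASE | (p << LINE_BITS) for p in (d0, d1, crc16(d0, d1))]


def payload(addr: int) -> Optional[int]:
    """Extract the payload bits if the read falls inside the mailbox window."""
    if MAILBOX_BASE <= addr < MAILBOX_BASE + MAILBOX_SIZE:
        return (addr - MAILBOX_BASE) >> LINE_BITS
    return None


def decode_window(addrs: list[int]) -> set[tuple[int, int]]:
    """Test every ordered packet triple in the last WINDOW requests against the CRC."""
    packets = [p for p in map(payload, addrs[-WINDOW:]) if p is not None]
    return {(d0, d1) for d0, d1, c in permutations(packets, 3) if crc16(d0, d1) == c}


# Packets arrive reordered and interleaved with unrelated reads.
a0, a1, ac = encode(0x1234, 0xBEEF)
trace = [0x1000, ac, 0x2040, a1, 0x7FFF_0000, a0, 0x3080, 0x10C0]
messages = decode_window(trace)

# Idealized estimate of spurious matches per full window of mailbox reads.
false_match_estimate = math.perm(WINDOW, 3) * 2 ** -CRC_BITS
```

Because the checksum covers the ordered pair of data packets, a match recovers both which requests belong to the message and their intended order, even when the trace delivers them shuffled.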

The significance for CMV is explicit in the policy layer. Function and loop markers enable precise segmentation of traces by region of interest; allocator wrappers map object IDs and virtual ranges into physical-space maps; and memory-side hardware can then prioritize, throttle, tier, or reconfigure according to the current object or function (Roberts, 21 Aug 2025). In this view, runtime memory behavior becomes programmable by application context and hints rather than by opaque traffic statistics alone.

This branch corrects another common misconception: CMV need not mean that each context sees a different address map. It can also mean that the memory system sees the context that produced the traffic and can therefore virtualize prioritization, telemetry, migration, or tiering per application, object, or phase.

Capability systems provide a more authority-centric interpretation of CMV. In CAP-VMs, cVMs are VM-like compartments that share a single virtual address space but are restricted by per-context capability sets installed into ddc and pcc; every load, store, and control transfer is thereby constrained to code and data bounds specific to that compartment (Sartakov et al., 2022). For an access of $n$ bytes at address $a$ through capability $c$, the enforcement condition is:

$allow(a, n, op) \Leftrightarrow (base_c \le a \land a + n \le base_c + length_c \land perms_c(op))$

and monotonicity of derived capabilities ensures that bounds and permissions can only be reduced (Sartakov et al., 2022). Sharing is performed by capability transfer rather than page-table modification. Two core primitives are provided: CAP_File, an asynchronous read-write interface to shared buffers with byte-granular access, and CAP_Call, a sealed cross-compartment control-transfer primitive. CAP_Stream is built atop them.
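Bounds checking and monotonic derivation can be illustrated with a short Python model. The `Capability` class and its `allows` and `derive` methods are hypothetical names; the sketch captures only the general CHERI-style rule that a derived capability may narrow bounds and permissions but never widen them, and it omits tags, sealing, and the CAP_File and CAP_Call primitives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Illustrative capability: a byte range plus a permission set."""
    base: int
    length: int
    perms: frozenset

    def allows(self, addr: int, size: int, op: str) -> bool:
        """An access is allowed only inside the bounds and with a held permission."""
        in_bounds = self.base <= addr and addr + size <= self.base + self.length
        return in_bounds and op in self.perms

    def derive(self, base: int, length: int, perms: str) -> "Capability":
        """Create a narrower capability; any attempt to widen authority fails."""
        new_perms = frozenset(perms)
        widens = (
            length < 0
            or base < self.base
            or base + length > self.base + self.length
            or not new_perms <= self.perms
        )
        if widens:
            raise PermissionError("derivation would widen authority")
        return Capability(base, length, new_perms)


# A compartment holds read-write authority over a region and shares a
# read-only, byte-granular view of a 256-byte buffer with another compartment.
compartment = Capability(0x1000, 0x1000, frozenset("rw"))
shared_view = compartment.derive(0x1800, 0x100, "r")
```

Sharing in this model is the act of handing `shared_view` to another context; no mapping structure is edited, which is the contrast with page-table-based sharing drawn in the text.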

The architectural consequence is that multiple mutually untrusted services can coexist in one address space while sharing arbitrary-sized buffers without repeated kernel or hypervisor mediation. The prototype reports median latency reductions of 20–40% versus Docker containers in an NGINX/Redis pipeline, up to 54% latency reduction for Redis validation, and CAP_File throughput within approximately 6% of memcpy peak throughput on FPGA and close to memcpy throughput on SiFive (Sartakov et al., 2022). Here CMV is not about translating the same address differently for different contexts; it is about giving each context a distinct authority projection over the same address space.

A lower-level hardware analogue appears in context-switching/dual-context memory based on standard 8T SRAM. In that design, each cell simultaneously stores one mutable RAM bit and one immutable ROM bit, the latter encoded by the threshold voltage choice of the read-port transistors (Kaiser et al., 2023). A two-phase dual-context read first senses RAM with SL = 0 V, then senses ROM by rebiasing SL according to the previously read RAM value. The paper reports robust operation across 5000 Monte Carlo runs per case, delay overheads from 1.25× to 1.95× depending on mode, read-energy overhead from 1.03× to 1.08×, leakage of 0.997× baseline, and up to 1.3× storage-density improvement relative to separate SRAM+ROM solutions (Kaiser et al., 2023).

These two examples differ radically in abstraction level, but both embody the same CMV idea: the same physical substrate is interpreted differently according to context. In cVMs, the interpretation is an authority set. In dual-context SRAM, it is the active read mode and bias regime. A plausible implication is that CMV can be formulated either as a control problem over permissions or as a mode problem over storage semantics.

5. Context as reusable state in LLM agents and organizational systems

The term CMV is also used directly for long-lived LLM-agent sessions. In this setting, the accumulated conversational understanding of architecture, trade-offs, conventions, and tool-derived state is treated as version-controlled memory rather than transient prompt text (Santoni, 25 Feb 2026). Session history is modeled as a directed acyclic graph $G = (V, E)$ of immutable snapshots and branch relations, with formally defined snapshot, branch, and trim operators. The crucial trim operator is a deterministic mapping over token sequences that is “structurally lossless” with respect to user and assistant messages: every user message and assistant response is preserved verbatim and in order, while mechanical bloat such as raw tool outputs, base64 images, queue operations, and non-portable thinking blocks is removed or stubbed (Santoni, 25 Feb 2026).
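The snapshot, branch, and trim operators can be sketched in Python over a simplified message model. The `ContextDAG` class, the `(role, content)` tuple format, the content-derived snapshot identifiers, and the stub text are all assumptions for illustration; the sketch stubs only messages with role `"tool"` and does not reproduce the three-pass streaming trimmer described in the source.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

DIALOGUE_ROLES = ("user", "assistant")


@dataclass(frozen=True)
class Snapshot:
    """Immutable node of the session graph."""
    id: str
    parent: Optional[str]
    messages: tuple  # tuple of (role, content) pairs


class ContextDAG:
    def __init__(self) -> None:
        self.snapshots: dict[str, Snapshot] = {}

    def snapshot(self, messages, parent: Optional[str] = None) -> str:
        """Store an immutable snapshot and return its content-derived id."""
        msgs = tuple(tuple(m) for m in messages)
        digest = hashlib.sha256(json.dumps([parent, msgs]).encode()).hexdigest()[:12]
        self.snapshots[digest] = Snapshot(digest, parent, msgs)
        return digest

    def branch(self, sid: str, new_messages) -> str:
        """Start a new line of work from an existing snapshot without mutating it."""
        base = self.snapshots[sid]
        return self.snapshot(base.messages + tuple(tuple(m) for m in new_messages), parent=sid)

    def trim(self, sid: str) -> str:
        """Keep user/assistant messages verbatim and in order; stub everything else."""
        base = self.snapshots[sid]
        trimmed = [
            m if m[0] in DIALOGUE_ROLES else (m[0], f"[trimmed {len(m[1])} chars]")
            for m in base.messages
        ]
        return self.snapshot(trimmed, parent=sid)


def dialogue(snapshot: Snapshot) -> list:
    """User and assistant messages only, in original order."""
    return [m for m in snapshot.messages if m[0] in DIALOGUE_ROLES]


dag = ContextDAG()
root = dag.snapshot([("user", "Summarise the log"), ("tool", "x" * 5000), ("assistant", "Done.")])
lean = dag.trim(root)
follow_up = dag.branch(lean, [("user", "Now refactor the parser")])
```

Because every operator returns a new snapshot whose parent already exists, the graph stays acyclic by construction, and the untrimmed `root` remains available if a stripped artifact becomes relevant again.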

The three-pass streaming trimmer yields a mean token reduction of 20% and up to 86%, with mixed tool-use sessions averaging 39% reduction and reaching break-even within 10 turns under prompt caching (Santoni, 25 Feb 2026). The economic analysis defines a steady-state per-turn cost and a break-even turn count, with concrete examples giving 10 turns for a 33% reduction and 6 turns for a 46% reduction under the stated pricing model (Santoni, 25 Feb 2026). This is a version of CMV in which context is neither address space nor protection state, but accumulated model understanding made branchable, reusable, and reproducible.

A broader institutional generalization appears in Contextual Memory Intelligence, where contextual memory is treated as adaptive infrastructure that captures rationale, assumptions, alternatives, and lineage across tools, roles, and time (Wedel, 28 May 2025). The proposed Insight Layer includes a Context Extractor, Insight Indexer, Drift Monitor, Regeneration Engine, and Reflection Interface. It operationalizes contextual capture, retention, regeneration, and human-in-the-loop reinterpretation, while governance is enforced through RBAC, retention policies, audit trails, and mechanisms to flag or redact sensitive insights (Wedel, 28 May 2025). Several formal quantities are introduced: contextual entropy, semantic drift, and alignment between current reasoning and prior context (Wedel, 28 May 2025).

These works extend CMV beyond computer architecture. The memory being virtualized is now conversational or organizational state, and the invariant is no longer worst-case latency but structural fidelity, regeneration fidelity, or auditability. The shift is substantial, but the underlying pattern remains: context is externalized, versioned, and given an explicit control plane.

Several related mechanisms illuminate the broader design space without using the same formulation in exactly the same way. Contextual Memory Trees learn a hierarchical routing structure over an unbounded key-value memory so that a query is mapped to a logarithmic-depth path and a leaf containing only $O(\log n)$ memories (Sun et al., 2018). If routers induce a $K$-balanced partition, worst-case complexity is logarithmic in the number of stored memories $n$ for both query and insertion (Sun et al., 2018). This is a learned, context-dependent addressing scheme over memory contents and can be read as an algorithmic precursor to CMV-style contextual routing.

Across the surveyed systems, the main trade-offs recur. Region-based real-time schemes gain analyzability but lose the fine-grained remapping flexibility of page-based MMUs (Walluszik et al., 6 Apr 2025). TLB locking and SPM partitioning give hard guarantees only for pages and code/data that fit into the protected budget (Reinwardt et al., 8 Apr 2025). Hypervisor segmentation depends on contiguous host memory and coordinated scheduling (Teabe et al., 2020). Memory-side metadata injection adds extra read traffic and requires mailbox alignment and robust decoding under prefetch noise and reordering (Roberts, 21 Aug 2025). Capability-based sharing relies on CHERI hardware support and restricts capability storage to simplify revocation (Sartakov et al., 2022). LLM-session CMV preserves structure rather than performing semantic compression, so trimmed branches may still require re-reading stripped raw artifacts when those artifacts become relevant again (Santoni, 25 Feb 2026). Organizational CMV introduces governance and reflection machinery, but also raises privacy, compliance, and context-overload concerns (Wedel, 28 May 2025).

A second recurrent issue is that CMV almost always exchanges generic flexibility for contextual guarantees. The guarantee may be constant translation latency, low and predictable access time, byte-granular sharing, exact lineage, or branchable session state. The price is typically bounded region counts, hardware specialization, stricter schemas, extra metadata plumbing, or explicit governance.

The literature therefore suggests a stable encyclopedic characterization of CMV. It is a design paradigm in which memory is not merely stored and accessed, but is selectively shaped, exposed, or regenerated according to context. The “virtualization” lies in decoupling what a context observes or can act upon from the underlying physical, protocol-level, or historical substrate; the “contextual” component lies in binding that decoupling to a formally recognized execution state, software phase, authority set, conversational lineage, or organizational trace. Across hardware, systems, and agentic AI, CMV is best understood as a family of context-aware memory-control mechanisms that replace one-size-fits-all memory behavior with explicitly governed, analyzable, or reusable context-specific views.