Opal: Private Memory for Personal AI

Published 2 Apr 2026 in cs.CR and cs.AI | (2604.02522v1)

Abstract: Personal AI systems increasingly retain long-term memory of user activity, including documents, emails, messages, meetings, and ambient recordings. Trusted hardware can keep this data private, but struggles to scale with a growing datastore. This pushes the data to external storage, which exposes retrieval access patterns that leak private information to the application provider. Oblivious RAM (ORAM) is a cryptographic primitive that can hide these patterns, but it requires a fixed access budget, precluding the query-dependent traversals that agentic memory systems rely on for accuracy. We present Opal, a private memory system for personal AI. Our key insight is to decouple all data-dependent reasoning from the bulk of personal data, confining it to the trusted enclave. Untrusted disk then sees only fixed, oblivious memory accesses. This enclave-resident component uses a lightweight knowledge graph to capture personal context that semantic search alone misses and handles continuous ingestion by piggybacking reindexing and capacity management on every ORAM access. Evaluated on a comprehensive synthetic personal-data pipeline driven by stochastic communication models, Opal improves retrieval accuracy by 13 percentage points over semantic search and achieves 29x higher throughput with 15x lower infrastructure cost than a secure baseline. Opal is under consideration for deployment to millions of users at a major AI provider.

Summary

  • The paper introduces Opal, a system that achieves access-pattern privacy by combining an enclave-resident knowledge graph with ORAM-backed storage.
  • It employs oblivious dreaming to integrate continuous data maintenance within standard ORAM accesses, significantly improving throughput and lowering costs.
  • Empirical results show enhanced retrieval accuracy, reduced latency, and efficient bandwidth use compared to traditional all-in-enclave architectures.

Motivation and Problem Statement

The proliferation of personal AI systems has transformed user interaction paradigms, intensifying the need for persistent, private, cloud-hosted long-term memory. Contemporary personal AI services capture sensitive logs of user activity across heterogeneous modalities including messaging, meetings, document editing, and ambient recordings. Disclosing the content or even the access patterns of such data to untrusted providers enables substantial inference attacks, including semantic or structural leakage, even when the content is encrypted. Traditional trusted execution environments (TEEs) fall short due to rigid enclave memory limits and the leakage introduced by storage access patterns. Prior cryptographic mechanisms such as oblivious RAM (ORAM) provide access-pattern privacy but are ill-suited for agentic memory systems, which rely on complex, query-adaptive traversals across knowledge graphs and embeddings.

System Overview

Opal provides a robust architecture for privacy-preserving, scalable personal AI memory storage by decoupling all data-dependent reasoning from bulk storage, confining it to an enclave.

The system model incorporates a TEE-based enclave cluster that orchestrates query and ingestion operations, with all personal data stored encrypted and obliviously on untrusted disk. The enclave maintains only compact metadata, a lightweight knowledge graph (KG), and index state, enabling query-dependent traversals and candidate filtration entirely within a trusted boundary.

Figure 1: System model showing how clients interact with TEEs for storage and computation, while all persistent state resides on untrusted, oblivious storage.

Figure 2: Opal system architecture, displaying enclave division of roles for controller, embedding, and LLM, and secure ORAM-backed storage flow.

The primary enclave (controller) manages the KG, ANN index metadata, and ORAM state. Dedicated enclaves serve embeddings and LLM inference, connected to the controller over authenticated, encrypted channels. The distributed controller employs two ORAMs, one for embeddings and one for data chunks, each supporting $O(\log N)$ oblivious accesses, which confers a significant bandwidth and cost advantage over prior all-in-enclave or constant-scan architectures.
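To make the $O(\log N)$ access cost concrete, here is a minimal sketch of a tree-based ORAM. The summary does not say which ORAM construction Opal uses, so a Path ORAM-style binary tree is assumed, and `ToyPathORAM`, `bucket_size`, and `trace` are hypothetical names. Encryption and the recursive position map are omitted.

```python
import random


class ToyPathORAM:
    """Toy Path ORAM-style store: each access reads and rewrites one root-to-leaf path."""

    def __init__(self, num_blocks, bucket_size=4, seed=0):
        self.rng = random.Random(seed)
        self.height = max(1, (num_blocks - 1).bit_length())  # levels below the root
        self.num_leaves = 1 << self.height
        self.bucket_size = bucket_size
        # Heap-ordered array of buckets; each bucket maps block_id -> value.
        self.tree = [dict() for _ in range(2 * self.num_leaves - 1)]
        self.position = {}  # block_id -> assigned leaf (enclave-side state)
        self.stash = {}     # blocks waiting to be written back (enclave-side state)
        self.trace = []     # bucket indices the untrusted disk sees, one tuple per access

    def _path(self, leaf):
        """Bucket indices from the root down to the given leaf."""
        node = leaf + self.num_leaves - 1
        path = [node]
        while node > 0:
            node = (node - 1) // 2
            path.append(node)
        return path[::-1]

    def access(self, block_id, new_value=None):
        # Use the block's current leaf, or a random one if it was never stored.
        leaf = self.position.get(block_id, self.rng.randrange(self.num_leaves))
        # Remap to a fresh random leaf so repeated accesses look unrelated.
        self.position[block_id] = self.rng.randrange(self.num_leaves)
        path = self._path(leaf)
        # Read the whole path into the stash.
        for node in path:
            self.stash.update(self.tree[node])
            self.tree[node] = {}
        value = self.stash.get(block_id)
        if new_value is not None:
            self.stash[block_id] = new_value
        # Write back from the leaf upward, placing each stash block as deep as
        # its newly assigned path allows.
        for node in reversed(path):
            fits = [b for b in self.stash if node in self._path(self.position[b])]
            for b in fits[: self.bucket_size]:
                self.tree[node][b] = self.stash.pop(b)
        self.trace.append(tuple(path))
        return value
```

Every access touches exactly `height + 1` buckets regardless of which block is requested, so the cost grows logarithmically with the number of stored blocks rather than linearly as in a full scan.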

Technical Innovations

Opal departs from conventional content-rich, disk-resident graph representations, introducing a compact, enclave-resident knowledge graph encapsulating only normalized identifiers (entity, temporal, modality, and project nodes) and relational structure, omitting full content. Queries are semantically parsed using an LLM to extract predicates, which are subsequently resolved via in-enclave graph traversal. This process yields a candidate set that is then scored with an ANN vector search constrained to these candidates. Critically, filtration is executed entirely within the enclave without incurring additional ORAM accesses or exposing adaptive traversal patterns.
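The filter-then-score flow can be sketched as follows. This is an illustration under stated assumptions, not Opal's implementation: the graph layout (`KG`), the toy two-dimensional `EMBEDDINGS`, and the function names are all hypothetical, and exact cosine scoring stands in for the ANN search. The embeddings are held in a plain dictionary here for brevity, whereas the summary describes them as ORAM-backed.

```python
import math

# Hypothetical enclave-resident graph: identifier node -> ids of chunks linked to it.
KG = {
    ("person", "alice"): {1, 2, 5},
    ("modality", "email"): {2, 3, 5},
    ("month", "2026-03"): {2, 4, 5, 6},
}

# Toy 2-d embeddings keyed by chunk id (illustrative values only).
EMBEDDINGS = {
    1: [0.5, 0.5],
    2: [0.8, 0.6],
    3: [1.0, 0.1],
    4: [-1.0, 0.0],
    5: [0.0, 1.0],
    6: [0.0, -1.0],
}


def resolve_candidates(predicates):
    """Intersect the chunk sets of all predicates; no predicates means no filtering."""
    if not predicates:
        return set(EMBEDDINGS)
    return set.intersection(*(KG.get(p, set()) for p in predicates))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve(query_vec, predicates, k=2):
    """Score only the candidates that survive the graph filter, best first."""
    candidates = resolve_candidates(predicates)
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(query_vec, EMBEDDINGS[cid]),
        reverse=True,
    )
    return ranked[:k]


query = [1.0, 0.1]
filtered = retrieve(query, [("person", "alice"), ("modality", "email")])
unfiltered = retrieve(query, [], k=1)
```

In this toy data, chunk 3 is the nearest neighbour of the query but is not linked to `alice`, so the filtered search returns chunks 2 and 5 instead. This is the kind of personal context that similarity search alone would miss.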

This mechanism enables Opal to match the retrieval accuracy of plaintext agentic memory systems such as Graphiti, while achieving full access-pattern privacy.

Oblivious Dreaming

Continuous ingestion and agentic maintenance (compaction, index repair, summarization) are generally incompatible with ORAM’s access constraints due to their adversarially observable, workload-dependent access patterns. Opal addresses this with oblivious dreaming: maintenance is piggybacked on incidental blocks arriving during standard ORAM accesses. Operations such as expiry, index repair, cluster splitting, and summarization are performed in-place within the enclave only on data already loaded for other operations.

Figure 3: Oblivious dreaming lifecycle, illustrating cluster growth, split, and in-stash maintenance under ORAM constraints.

TTL-based adaptive retention is enforced to maintain steady-state storage that matches personal information access distributions, yield bounded ORAM capacity, and prevent cache or stash overflow. Sleepy rebalancing propagates index corrections to avoid the access-pattern leakage of eager or scan-based repairs. Summarization compresses historical chunks into summary nodes using standard LLM calls and is handled in-band.
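The piggybacking idea can be illustrated with a short sketch. It assumes a hypothetical `Block` record and a `pending_moves` table of deferred cluster reassignments; neither name comes from the paper, and summarization and cluster splitting are left out.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    cluster: int        # ANN cluster the block currently belongs to
    expires_at: float   # TTL deadline


def dream(loaded_blocks, now, pending_moves):
    """Maintain only the blocks a regular ORAM access has already brought in.

    No block is fetched for the sake of maintenance, so the untrusted disk
    sees the same access sequence whether or not any work was pending.
    """
    kept = []
    for blk in loaded_blocks:
        if blk.expires_at <= now:
            continue  # TTL expiry: drop in place, freeing capacity
        if blk.block_id in pending_moves:
            # Deferred ("sleepy") index repair, applied when the block shows up.
            blk.cluster = pending_moves.pop(blk.block_id)
        kept.append(blk)
    return kept


loaded = [Block(1, 0, 10.0), Block(2, 0, 100.0), Block(3, 1, 100.0)]
pending = {2: 7, 9: 4}
survivors = dream(loaded, now=50.0, pending_moves=pending)
```

Block 1 has expired and is dropped, block 2 picks up its deferred move to cluster 7, and the correction for block 9 stays queued until some later access happens to load that block.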

Synthetic Personal-Data Benchmarking

Rigorous evaluation is enabled by a Hawkes-process-driven synthetic personal data pipeline, which generates temporally correlated, causally consistent, multi-modal user streams, drawn and calibrated from empirical workplace and communication studies. This allows for multi-year evaluation of both system correctness and retrieval metrics.
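For readers unfamiliar with Hawkes processes, the sketch below simulates a univariate self-exciting process with an exponential kernel, $\lambda(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)}$, using Ogata's thinning method. The function name and parameter values are illustrative assumptions. The summary does not give the pipeline's actual kernels or calibrated parameters, and the real pipeline is multi-modal rather than univariate.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Event times on [0, horizon) for a Hawkes process with an exponential kernel.

    mu is the baseline rate, alpha the jump in intensity after each event and
    beta the decay rate; alpha / beta < 1 keeps the process stable.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # alpha * sum(exp(-beta * (t - t_i))) evaluated at time t
    while True:
        # The intensity only decays between events, so its current value
        # bounds it from above until the next event.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        t += wait
        if t >= horizon:
            return events
        excitation *= math.exp(-beta * wait)
        # Thinning step: accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= mu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event raises the future intensity


bursty = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=50.0)
```

Because each event temporarily raises the rate of further events, the output clusters in bursts, which is the temporal correlation one expects in message threads and meeting follow-ups.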

Security Model and Guarantees

The formal threat model assumes a malicious provider and an uncompromised TEE boundary. Everything outside the enclave (network traffic, storage I/O, and inter-enclave communication) is visible to the adversary, except for the encrypted content itself. Opal's cryptographic protocol delivers two primary guarantees:

  • Data, queries, and access patterns remain semantically indistinguishable under a public leakage function ($\mathcal{L}$) exposing only operation type and fixed batch sizes.
  • Integrity and freshness for ORAM-backed storage are ensured via Merkle-style authentication, AEAD sealing, and a monotonic counter, protecting against tampering and rollback outside configurable checkpoint boundaries.

The protocol’s security is established via an indistinguishability game in the hybrid model of enclave attestation, leveraging authenticated encryption, MACs, and batched-access ORAM lower bounds.
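The freshness guarantee can be illustrated with a small sketch that binds each stored block to a version counter. It is a simplification under stated assumptions: HMAC-SHA256 from the Python standard library stands in for AEAD sealing (it authenticates but does not encrypt), the Merkle-style authentication is omitted, and `KEY`, `seal`, and `unseal` are hypothetical names.

```python
import hashlib
import hmac

KEY = b"\x00" * 32  # placeholder; a real key would never leave the enclave


def seal(block_id, counter, payload):
    """Bind the payload to its block id and version counter with a MAC."""
    header = block_id.to_bytes(8, "big") + counter.to_bytes(8, "big")
    tag = hmac.new(KEY, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def unseal(blob, block_id, expected_counter):
    """Reject blobs that were modified, swapped between blocks, or rolled back."""
    header, payload, tag = blob[:16], blob[16:-32], blob[-32:]
    expected_tag = hmac.new(KEY, header + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        raise ValueError("tampered block")
    expected_header = block_id.to_bytes(8, "big") + expected_counter.to_bytes(8, "big")
    if header != expected_header:
        raise ValueError("stale or misplaced block")
    return payload


old_blob = seal(7, 1, b"old contents")
new_blob = seal(7, 2, b"new contents")
```

A provider that replays `old_blob` after the enclave has advanced the counter to 2 presents a valid tag but the wrong counter, so `unseal(old_blob, 7, 2)` raises instead of returning stale data.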

Empirical Results

Retrieval Accuracy: On a one-year synthetic corpus (with 257K chunks), Opal achieves a 13pp accuracy improvement over ANN-only systems, matching or exceeding a state-of-the-art unsecured system (Graphiti). Particularly strong gains are recorded in person, temporal, and modality queries, reflecting the contribution of in-enclave candidate filtration via KG predicates.

Efficiency: Opal demonstrates a 29× higher throughput and 15× lower infrastructure cost than an all-in-enclave secure baseline. Query latency is $2.32$s (2.92× faster than enclave-only), and ingest latency $0.94$s (9.05× faster), with LLM inference dominating the remaining overheads.

Bandwidth: Opal incurs $O(\log N)$ bandwidth rather than full database scans. For a 524K-entry corpus, the system requires just $1.71$ MiB per query, compared to $4.55$ GiB for the in-enclave baseline, a reduction of between $12\times$ and $2{,}700\times$ across relevant scales.
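As a quick consistency check on the reported figures, the arithmetic below recomputes the bandwidth reduction at the 524K-entry scale and the baseline latencies implied by the stated speedups. It assumes binary units (MiB, GiB) and reads "N× faster" as a plain latency ratio; the variable names are illustrative.

```python
MIB = 2**20
GIB = 2**30

opal_bytes_per_query = 1.71 * MIB
baseline_bytes_per_query = 4.55 * GIB

# Ratio of per-query bandwidth at the 524K-entry scale: about 2,725.
bandwidth_reduction = baseline_bytes_per_query / opal_bytes_per_query

# Baseline latencies implied by the reported speedups.
baseline_query_latency_s = 2.32 * 2.92   # about 6.77 s
baseline_ingest_latency_s = 0.94 * 9.05  # about 8.51 s

print(f"bandwidth reduction: {bandwidth_reduction:,.0f}x")
print(f"implied baseline query latency: {baseline_query_latency_s:.2f} s")
print(f"implied baseline ingest latency: {baseline_ingest_latency_s:.2f} s")
```

The computed ratio of roughly 2,725 agrees with the upper end of the reported range, which suggests the $2{,}700\times$ figure corresponds to this largest corpus size.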

Oblivious Dreaming: Deferred index maintenance tracks eager LIRE-style baselines within $0.1$pp, confirming the efficacy of sleepy rebalancing. Summaries, though only a small fraction of the store, account for a much larger share of top-ranked retrievals. Expiry and repair operations never trigger extra ORAM accesses.

Implications and Future Directions

Opal closes significant gaps between privacy, scalability, and accuracy in personal AI memory systems. It advances the state of the art on several axes:

  • Enables deployment of privacy-preserving, cloud-native personal memory supporting millions of users, at competitive cost and latency.
  • Demonstrates that semantic and entity-aware retrieval accuracy is achievable without access-pattern leakage, transcending the limitations of prior oblivious ANN schemes.
  • Generalizes a modular enclave architecture that can support extensions such as recursive summarization, richer graph predicates, and GPU-backed confidential computing.

The design is robust to evolving hardware threat models and can absorb further advancements in TEE-resistant side channel defense, enclave hardening, and secure distributed attestation. Empirical evidence for scalability, maintenance correctness, and accuracy under continuous ingestion and maintenance further positions Opal for integration into production AI memory backends.

Conclusion

Opal establishes a technically substantiated architecture for access-pattern-private, enclave-backed personal AI memory at production scale. By confining all data-dependent logic to the enclave and leveraging ORAM for external storage, it achieves strong privacy without sacrificing agentic accuracy or efficiency. The architectural principles underlying Opal should inform future AI memory and personal agent deployments both in TEE-based and cryptographically hardened environments.
