
Verifiable Inference Protocols

Updated 3 July 2026
  • Verifiable inference is a protocol that provides cryptographic and hardware-backed assurance that machine learning outputs correspond to specific models and inputs.
  • It incorporates diverse methods including zkSNARKs, TEE attestations, lightweight commitments, and economic incentives to secure computations and ensure non-repudiation.
  • Applications range from cloud AI compliance and decentralized inference to privacy-preserving ML services, highlighting trade-offs in performance, cost, and security.

A verifiable inference protocol enables an external party to check, with cryptographic or hardware-backed assurance, that the output of a machine learning inference was computed from a specified model and input on an untrusted platform, so that tampering, unauthorized model substitution, and computational shortcuts cannot go undetected. The literature offers several complementary approaches (cryptographic, hardware-assisted, economic, and statistical), reflecting application demands that range from cloud AI compliance to decentralized agentic services.

1. Definitions, Trust Models, and Security Requirements

Verifiable inference seeks strong guarantees that an output O was correctly produced by applying a specific model M to a specific input I, even when the computation is performed by an adversarial party. The formal requirements, as articulated in (Duddu et al., 2024, Alves et al., 30 Jan 2026), are:

  • Integrity: The (M, I, O) triple cannot be changed or spoofed without detection.
  • Authenticity: Only a computation performed by an authorized, attested environment (e.g., TEE, cryptographic proof) can produce a valid certificate.
  • Binding: The cryptographic artifact binds output O to input I and model M.
  • Non-repudiation: The prover cannot later deny having produced O for I.
  • Freshness: The output is not a replay (an old or cached result masquerading as new).
  • Scalability: Each attestation or proof must be non-interactive and cheaply verifiable for many verifiers.

Threat models vary: some rely on secure hardware assumptions (e.g., trust in TEE attestation keys), others on strong cryptographic binding (e.g., Merkle tree commitments, zero-knowledge proofs), or on distributed rational adversarial models (e.g., optimistic re-execution with economic penalties).

2. Cryptographic and Hardware Mechanisms

Verifiable inference is realized through several design paradigms, each with distinct trade-offs.

A. Cryptographic Proofs of Inference:

  • Succinct arguments/zkSNARKs: Convert the inference computation into an arithmetic circuit or R1CS; the prover constructs a succinct (sometimes zero-knowledge) proof showing that the triple (O, I, M) satisfies the model computation (Wang, 25 Nov 2025, Gold et al., 23 Oct 2025). Advanced frameworks like JSTprove and ZK-DeepSeek employ recursive proof composition, lookup-table (LUT) gadgets for nonlinearities, and modular circuit representations for scalability and input privacy.
  • Sampling-based proofs: Rather than proving correctness at every internal activation, these protocols (e.g., (Anchuri et al., 19 Mar 2026)) commit to full execution traces and open only a small random sample, relying on trace separation properties between correct and adversarial executions for statistical soundness.
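
The statistical flavor of sampling-based checks can be illustrated with a small calculation. The sketch below is not taken from the cited protocol; it assumes a simplified model in which the trace has `total_steps` committed entries, `corrupted_steps` of them are wrong, the verifier opens `samples` entries uniformly without replacement, and opening a wrong entry always exposes it. All names are hypothetical.

```python
def detection_probability(total_steps: int, corrupted_steps: int, samples: int) -> float:
    """Chance that uniform sampling without replacement opens >= 1 corrupted entry.

    Simplifying assumption: opening a corrupted entry always reveals the fault.
    """
    if not 0 <= corrupted_steps <= total_steps:
        raise ValueError("corrupted_steps must lie in [0, total_steps]")
    if not 0 <= samples <= total_steps:
        raise ValueError("samples must lie in [0, total_steps]")

    miss = 1.0  # probability that every opened entry so far was clean
    for i in range(samples):
        clean_remaining = total_steps - corrupted_steps - i
        if clean_remaining <= 0:
            return 1.0  # no clean entries left, so the next opening must hit
        miss *= clean_remaining / (total_steps - i)
    return 1.0 - miss


if __name__ == "__main__":
    # Example: 1% of a 10,000-entry trace is corrupted, verifier opens 300 entries.
    print(f"{detection_probability(10_000, 100, 300):.3f}")
```

Under these assumptions, a few hundred openings already catch a 1% corruption rate with about 95% probability, which is why such protocols rely on correct and adversarial traces differing in many positions rather than in one.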

B. Hardware-backed (Trusted Execution Environment) Attestations:

  • TEE-based Attestation: Compute the full inference within an SGX or TrustZone enclave, then generate an attestation (e.g., Intel SGX quote) over cryptographic hashes of the model, input, and output (Duddu et al., 2024, Kiri et al., 5 Jun 2026). The quote's authenticity and integrity are verifiable by any party with the TEE manufacturer's root certificates, yielding strong assurances under hardware trust.
  • Hybrid split execution: Partition computation between TEE and untrusted accelerators (e.g., VeriAttn (Chen et al., 15 Jun 2026)), with TEE performing probabilistic verification of GPU-calculated results (e.g., using Freivalds’ algorithm, log-product checks for exponentials).
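
Freivalds' algorithm, mentioned above, checks a claimed matrix product without recomputing it. The following is a minimal sketch over exact integers, not the VeriAttn implementation: it tests A(Br) = Cr for random 0/1 vectors r, so each round costs three matrix-vector products, and a wrong C survives a round with probability at most 1/2. The function names and the default round count are assumptions.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def freivalds_check(a, b, c, rounds=20, rng=None):
    """Probabilistically verify that a @ b == c for integer matrices.

    A correct product is always accepted. An incorrect one is accepted
    with probability at most 2 ** -rounds.
    """
    rng = rng or random.Random()
    width = len(c[0])  # number of columns of b and c
    for _ in range(rounds):
        r = [rng.randrange(2) for _ in range(width)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False  # a single mismatch is conclusive
    return True


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    good = [[19, 22], [43, 50]]
    bad = [[19, 22], [43, 51]]
    print(freivalds_check(a, b, good), freivalds_check(a, b, bad, rounds=64))
```

With floating-point results from a GPU, the equality test would need a tolerance, and choosing that tolerance is part of the protocol design rather than something this sketch settles.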

C. Lightweight Proof-of-Activation/Fingerprint Protocols:

D. Economic and Optimistic Rollup Protocols:

  • EigenAI (Alves et al., 30 Jan 2026) combines deterministic inference on controlled hardware, cryptoeconomic optimistic rollup-style verification, threshold key release in TEEs, and slashing penalties, forming an incentive-compatible system that can enforce correctness through public challenge and replication.

3. Protocol Structures and End-to-End Workflows

Each design paradigm above corresponds to a characteristic end-to-end workflow, which determines its operational properties.

A. End-to-End Cryptographic Pipeline: (cf. JSTprove (Gold et al., 23 Oct 2025))

  1. Model is quantized and compiled into an arithmetic circuit.
  2. Prover runs inference and generates a witness (all intermediary values).
  3. Prover runs a ZK (or non-ZK) proof system (e.g., GKR, sumcheck, PLONK) over the circuit and witness, producing a succinct proof.
  4. Verifier checks the proof, which ensures that the claimed output O is as computed from input I per model M; in the zero-knowledge case, the verifier learns nothing else.
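
Steps 1 and 2 can be made concrete with a toy example. The sketch below is not JSTprove's compiler; it assumes an 8-bit fixed-point scale, a single linear layer followed by ReLU, and hypothetical helper names. It produces the witness (every intermediate integer) and then re-checks the kind of constraints a circuit would enforce, including the range check on the rescaling remainder, which is the sort of constraint that lookup arguments typically handle.

```python
SCALE_BITS = 8           # hypothetical fixed-point precision
SCALE = 1 << SCALE_BITS


def quantize(x: float) -> int:
    """Map a real number to a fixed-point integer."""
    return round(x * SCALE)


def make_witness(qw, qb, qx):
    """Integer-only forward pass that records every intermediate value."""
    acc = [sum(w * x for w, x in zip(row, qx)) + b for row, b in zip(qw, qb)]
    rescaled = [a >> SCALE_BITS for a in acc]   # floor-divide back to one scale factor
    output = [max(0, r) for r in rescaled]      # ReLU
    return {"input": qx, "acc": acc, "rescaled": rescaled, "output": output}


def check_witness(qw, qb, wit) -> bool:
    """Re-check the constraints a circuit for this layer would encode."""
    n = len(qw)
    if any(len(wit[k]) != n for k in ("acc", "rescaled", "output")):
        return False
    for row, b, a, r, o in zip(qw, qb, wit["acc"], wit["rescaled"], wit["output"]):
        if a != sum(w * x for w, x in zip(row, wit["input"])) + b:
            return False                         # linear constraint
        if not 0 <= a - r * SCALE < SCALE:
            return False                         # range check on the remainder
        if o != max(0, r):
            return False                         # ReLU constraint
    return True


if __name__ == "__main__":
    qw = [[quantize(0.5), quantize(-1.0)]]
    qb = [quantize(0.25) * SCALE]                # bias stored at scale squared
    qx = [quantize(1.0), quantize(0.5)]
    wit = make_witness(qw, qb, qx)
    print(wit["output"][0] / SCALE, check_witness(qw, qb, wit))
```

A real proof system replaces `check_witness` with a succinct proof that these constraints hold, so the verifier never sees the witness itself.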

B. TEE-based Attestation: (cf. Laminator (Duddu et al., 2024), VeCoDI (Kiri et al., 5 Jun 2026))

  1. Inference code and model are loaded into an attested enclave.
  2. Input I is loaded, output O is produced.
  3. Enclave generates a signature/quote over hashes of (M, I, O), possibly exposing only the output.
  4. Verifier checks quote integrity and identity.
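
The shape of steps 3 and 4 can be sketched as follows. This is a simulation, not an SGX or TrustZone API: a real quote is signed with a hardware-protected asymmetric key and verified against the manufacturer's certificate chain, whereas here an HMAC with a shared key stands in for that signature. The report layout, the function names, and the verifier-supplied nonce (added to illustrate freshness) are all assumptions.

```python
import hashlib
import hmac
import json


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_report(key: bytes, model: bytes, inp: bytes, out: bytes, nonce: bytes) -> dict:
    """Inside the (simulated) enclave: bind hashes of model, input, output, and nonce."""
    body = {
        "model": digest(model),
        "input": digest(inp),
        "output": digest(out),
        "nonce": nonce.hex(),
    }
    payload = json.dumps(body, sort_keys=True).encode()
    # HMAC is a stand-in for the enclave's attestation signature (assumption).
    return {"body": body, "tag": hmac.new(key, payload, hashlib.sha256).hexdigest()}


def verify_report(key, report, expected_model_digest, inp, out, nonce) -> bool:
    """Verifier side: check the tag, then check every bound field."""
    payload = json.dumps(report["body"], sort_keys=True).encode()
    expected_tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_tag, report["tag"]):
        return False
    body = report["body"]
    return (
        body["model"] == expected_model_digest
        and body["input"] == digest(inp)
        and body["output"] == digest(out)
        and body["nonce"] == nonce.hex()
    )


if __name__ == "__main__":
    key, model, nonce = b"k" * 32, b"weights-v1", b"\x01" * 16
    report = make_report(key, model, b"prompt", b"answer", nonce)
    print(verify_report(key, report, digest(model), b"prompt", b"answer", nonce))
```

Binding the nonce into the signed body is one simple way to address the freshness requirement, since a replayed report would carry a nonce the verifier did not issue.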

C. Commitment and Sampling Protocols: (cf. TensorCommitments (Baser et al., 13 Feb 2026, Anchuri et al., 19 Mar 2026))

  1. Prover produces a cryptographically binding Merkle (or polynomial) commitment to activations or traces.
  2. Verifier samples a (possibly adversarially chosen) path or subset of activations and requests the openings, checking algebraic consistency with outputs.

D. Optimistic/Economic Enforcement: (cf. EigenAI (Alves et al., 30 Jan 2026))

  1. Operator submits a signed/encrypted log of deterministic inference and receipt to a public data availability layer.
  2. During a challenge window, any party may request re-execution via EigenVerify, which deterministically reruns the inference using the same hardware/seed stack in a TEE.
  3. Economic penalties (slashing) enforce honest execution; determinism guarantees byte-level comparison.
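
The challenge logic in steps 2 and 3 reduces to a byte-level comparison once inference is deterministic. The sketch below is a toy model, not the EigenAI or EigenVerify interface: `deterministic_model` is a hypothetical stand-in for a deterministic inference engine, and the claim structure, stake accounting, and return format are assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Claim:
    operator: str
    input_data: bytes
    output_digest: str   # hex SHA-256 of the output the operator published
    stake: int           # amount at risk if the claim is wrong


def deterministic_model(input_data: bytes) -> bytes:
    """Hypothetical stand-in for a bit-exact, deterministic inference engine."""
    return hashlib.sha256(b"model-v1" + input_data).digest()


def resolve_challenge(claim: Claim, rerun=deterministic_model) -> dict:
    """Re-execute the inference and slash the stake on any byte-level mismatch."""
    recomputed = hashlib.sha256(rerun(claim.input_data)).hexdigest()
    if recomputed == claim.output_digest:
        return {"valid": True, "slashed": 0}
    return {"valid": False, "slashed": claim.stake}


if __name__ == "__main__":
    honest_output = deterministic_model(b"prompt")
    honest = Claim("op-1", b"prompt", hashlib.sha256(honest_output).hexdigest(), 100)
    cheater = Claim("op-2", b"prompt", hashlib.sha256(b"cheap guess").hexdigest(), 100)
    print(resolve_challenge(honest), resolve_challenge(cheater))
```

The comparison is only meaningful because the rerun is bit-exact; with nondeterministic kernels, an honest operator could be slashed for benign numerical differences.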

4. Performance, Empirical Evaluation, and Trade-offs

The spectrum of methods involves explicit trade-offs across trust assumptions, computational overhead, proof size, and detectability.

| Approach | Prover Overhead (vs. Inference) | Verifier Overhead | Proof/Attestation Size | Security Assumption | Empirical Success Rates |
|---|---|---|---|---|---|
| zkSNARK (ZK-DeepSeek) | 10,000× (per inference) | ~350 ms (fixed-size) | 32 KB (constant) | DLOG, circuit soundness | Zero knowledge, cryptographic soundness |
| JSTprove/Expander | 10–1,000 s (varies) | O(log n) | ~0.2 MB | Sumcheck, commitment soundness | Completeness/zero-knowledge |
| TEE Attestation | 21–71 ms/inference + 6 ms attn | <10 ms | ~2 KB | TEE trust, attestation key | Hardware-bound, scalable |
| TensorCommitments | 0.97% | 0.12% (CPU only) | 2 B/token | DLOG, trapdoor SRS | 96% attack detect., 0.12% verifier |
| TOPLOC | Negligible | 100× faster than inf. | 258 B/32 tokens | Polynomial hash, model/trust | 100% emp. detection accuracy |
| SVIP | <0.01 s/query | <0.01 s/query | 4 KB/query | Proxy-task, secret TTP | FNR<4.5%, FPR<2.5% |
| DiFR (Token/Activation) | Zero | <2 ms/1k tokens | 0–16 B/token | Seed sync, replayable sampling | AUC>0.999 for quantization/bugs |
| EigenAI | 1.8% (+determinism/perf.) | <1% (audits), 1×–2× (challenge) | Encrypted log | Determinism, economic security | Bit-exact, economic slashing |

Deterministic engines (EigenAI) achieve full bit-exact repeatability with minimal overhead, at the cost of requiring homogeneous hardware and a fixed software stack. TEE-based approaches such as Laminator or VeCoDI achieve small attestation sizes but depend on trust in enclave manufacturers and on correct attestation verification at scale. Lightweight commitment and fingerprinting systems such as TensorCommitments or DiFR greatly reduce bandwidth and CPU costs, which suits decentralized deployment, but their soundness is predicated on activation separation and may fall short of the guarantees of a full cryptographic proof.

5. Practical Applications and Deployment Contexts

Verifiable inference protocols have been deployed or proposed for a range of applications, each leveraging the distinct properties of the available proof or attestation method:

  • Sovereign agents (EigenAI): On-chain event resolution, prediction-market judges, scientific or regulatory autonomous agents that require public, unforgeable audit trails (Alves et al., 30 Jan 2026).
  • Model compliance and regulatory audit: Hardware-backed attestations for ML property and inference cards enable integrity certification in finance, healthcare, and enterprise AI governance (Duddu et al., 2024).
  • Open LLM APIs and decentralized inference: Lightweight proof systems and economic enforcement facilitate trustless hosting of large models in decentralized or permissionless compute ecosystems (VeriLLM (Wang et al., 29 Sep 2025)).
  • Privacy-preserving inference: SNARK+HE hybrid schemes (vPIN (Riasi et al., 2024)) and efficient masking protocols allow confidential data to be processed with formal proof of correct inference, applicable in MLaaS, federated learning, and sensitive domains.
  • Low-resource and edge compliance: Minimal SW TCB approaches (VeCoDI (Kiri et al., 5 Jun 2026)) enable efficient, confidential, and verifiable AI execution on constrained IoT devices, important for embedded inference with strong attestation needs.

6. Limitations and Future Research Directions

Each verifiable inference approach faces distinct limitations:

  • Cryptographic proofs (ZK, SNARK, plookup): Remain expensive for billion-parameter LLMs, though advances in parallelization, lookup-centric proofs, and sumcheck/GKR protocols are reducing costs (Gold et al., 23 Oct 2025, Benno et al., 19 Feb 2026).
  • TEE dependence: TEE approaches are vulnerable to side-channels, malware in the non-secure world, and hardware manufacturer trust assumptions (Duddu et al., 2024). Start-up costs and live-enclave management remain practical hurdles, though amortization can mitigate latency.
  • Lightweight fingerprints: Top-K/activation-based proofs are extremely efficient but may be vulnerable to adversarial manipulation in degenerate cases, depending on activation separation and model architecture (Baser et al., 13 Feb 2026, Karvonen et al., 25 Nov 2025).
  • Privacy/zero-knowledge: Most deployed SNARK-based systems operate only over quantized (fixed-point) circuits and struggle with floating-point fidelity, limiting their ability to support production models (Gold et al., 23 Oct 2025, Wang, 25 Nov 2025).
  • Generic zero-knowledge for LLMs: Remains infeasible for frontier-scale autoregressive models; hybrid protocols and statistical/fingerprinting approaches offer best-available compromises while GPU-accelerated proof systems are under active research.

Research frontiers include development of GPU-native provers, range-efficient lookup/commitment protocols (e.g., Jolt Atlas streaming (Benno et al., 19 Feb 2026)), partial probabilistic verification for LLM chains (Fisher guidance (Wang, 17 Mar 2026)), composable/proxy-task attestation for open-source models (SVIP (Sun et al., 2024)), and hybrid cryptoeconomic or multi-prover protocols for decentralized networks (Wang et al., 29 Sep 2025, Alves et al., 30 Jan 2026). Further work is necessary on post-quantum primitives and in extending proof frameworks to privacy-preserving training, continual learning, and edge deployment scenarios.
