MOZAIK Architecture: Secure IoT Analytics

Updated 12 January 2026

MOZAIK architecture is a privacy-preserving IoT-to-cloud platform that encrypts data at the source for secure analytics.
It employs MPC and FHE to execute distributed computations with verifiable correctness and user-centric access control.
The design supports scalable streaming analytics and minimizes cloud trust by ensuring data remains encrypted across all stages.

MOZAIK is an end-to-end privacy-preserving analytics platform designed for confidential data storage and distributed query processing in IoT-to-cloud systems. The architecture guarantees that sensitive IoT data is encrypted before transmission, remains encrypted across storage, transport, and computation, and can be processed with cryptographic proofs of correctness using secure multi-party computation (MPC) or fully homomorphic encryption (FHE). Users retain granular control over data access and computation, enabling secure analysis without trusting cloud providers or sharing raw data with external parties (Kenhove et al., 5 Jan 2026).

1. System Goals and Threat Model

MOZAIK adopts a strong privacy posture centered on the following objectives:

End-to-end data confidentiality: Data is encrypted on IoT devices or trusted gateways, never exposed as plaintext after leaving the user environment.
Cryptographically secure analytics: Computation on encrypted data is realized by two COED (computing on encrypted data) technologies—MPC and FHE—ensuring provable security guarantees.
User-centric access control: Data owners grant consent to compute on designated subsets of their data, scoped by time range or by query.
Cloud trust minimization: No single server is ever fully trusted; security holds even if portions of cloud infrastructure are compromised.

The threat model encompasses adversaries who can compromise central storage (MOZAIK-Obelisk), the underlying network, and either up to $t$ of the $n=3$ MPC parties (in MPC mode, where the honest-majority assumption implies $t=1$) or the single FHE server (in FHE mode). Model providers are assumed semi-honest. Adversaries may attempt to break confidentiality, compromise correctness, or violate user consent policies.

2. Architectural Components and Data Flow

MOZAIK is orchestrated as a set of coordinated microservices and cryptographic compute engines. The main architectural modules include:

IoT Sensor Devices & Gateway: Produce raw sensor data (e.g., heartbeats), encrypt each measurement under a symmetric key $k$ with authenticated encryption before dispatch.
MOZAIK-Obelisk: Stateless API gateway and microservices for authenticated ingestion, streaming, querying, permission tracking, and key-share management. Components include Traefik (API gateway), Keycloak (auth), Kafka (ingest), ClickHouse (storage), Redis (metadata and key-shares), and orchestration on Kubernetes.
Compute Engines:
- MPC servers (three party, replicated secret sharing).
- FHE server (single node running CKKS SIMD encryption).
Model Provider: Supplies ML inference models under strict confidentiality.

Data Flow:

Encryption at Source: Each data point $d$ is encrypted as $c_d^k=\mathsf{AEEnc}_k(d,N,\mathrm{user}\|N)$ with a nonce $N$.
Ingestion/Staging: Encrypted payload sent via API, published to Kafka, persisted in ClickHouse.
Analysis Request: The user (via the web UI) requests analytics for a subset of the data. For MPC, the browser splits the key into three XOR shares, $k = k_1 \oplus k_2 \oplus k_3$, encrypts each share for its designated MPC party, and the API deposits the encrypted shares in Redis.
Compute Engine Activation: API notifies relevant compute engines to queue computation.
Distributed Secure Computation: Compute engines read encrypted dataset and keys, execute COED protocols, re-encrypt results, and persist to storage.
Result Delivery: User retrieves encrypted computation result and decrypts locally.
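
The key-share split in the Analysis Request step can be illustrated with a minimal sketch. It assumes a 16-byte symmetric key and uses only the Python standard library; the function names `split_key` and `reconstruct_key` are hypothetical and do not come from the MOZAIK code base. The per-party encryption of each share is omitted.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key: bytes) -> tuple:
    """Split key into three shares with k = k1 XOR k2 XOR k3."""
    k1 = secrets.token_bytes(len(key))  # uniformly random share
    k2 = secrets.token_bytes(len(key))  # uniformly random share
    k3 = xor_bytes(xor_bytes(key, k1), k2)  # fixes the XOR of all three to key
    return k1, k2, k3

def reconstruct_key(k1: bytes, k2: bytes, k3: bytes) -> bytes:
    """Recombine all three shares into the original key."""
    return xor_bytes(xor_bytes(k1, k2), k3)

# Assumption for illustration: a 128-bit key, as used by AES-128.
key = secrets.token_bytes(16)
shares = split_key(key)
recovered = reconstruct_key(*shares)
```

Because two of the three shares are uniformly random, any single share (or any pair) is statistically independent of the key; only the XOR of all three reveals it.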

Streaming mode allows pre-authorized, continuous analytics over windowed, micro-batched data flows. This design decouples ingestion from compute latency using efficiently triggered microbatching.
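
One way to picture micro-batching is a buffer that releases a batch when either a size threshold or a time window is reached. The sketch below is an assumption about how such a trigger could look, not the MOZAIK implementation; the class `MicroBatcher` and its parameters `max_size` and `max_wait_s` are hypothetical. Timestamps are passed in explicitly so the behaviour is deterministic, and the time trigger is only checked when a new sample arrives.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MicroBatcher:
    max_size: int        # flush when this many samples are buffered
    max_wait_s: float    # flush when the oldest buffered sample is this old
    _buffer: List[bytes] = field(default_factory=list)
    _opened_at: Optional[float] = None

    def add(self, ciphertext: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer one encrypted sample; return a batch if a trigger fires."""
        if not self._buffer:
            self._opened_at = now  # the window starts with the first sample
        self._buffer.append(ciphertext)
        size_hit = len(self._buffer) >= self.max_size
        time_hit = now - self._opened_at >= self.max_wait_s
        return self.flush() if (size_hit or time_hit) else None

    def flush(self) -> List[bytes]:
        """Hand the buffered samples to the compute engine and reset."""
        batch, self._buffer = self._buffer, []
        self._opened_at = None
        return batch

# Simulated arrivals: (timestamp in seconds, opaque ciphertext).
arrivals = [(0.0, b"c0"), (1.0, b"c1"), (2.0, b"c2"), (3.0, b"c3"), (14.0, b"c4")]
batcher = MicroBatcher(max_size=3, max_wait_s=10.0)
emitted = []
for t, sample in arrivals:
    batch = batcher.add(sample, now=t)
    if batch is not None:
        emitted.append(batch)
```

In this run the first batch is released by the size trigger and the second by the time trigger, which is the sense in which ingestion proceeds independently of how long each computation takes.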

3. Cryptographic Protocols

MOZAIK implements dual cryptographic paradigms:

A. Secure Multi-Party Computation (MPC)

Replicated Secret Sharing: Shares over $\mathbb{Z}_{2^\ell}$ or $\mathrm{GF}(2^n)$, with $x = x_1 + x_2 + x_3\ (\bmod\ 2^\ell)$ or $x = x_1 \oplus x_2 \oplus x_3$; each party holds two of the three shares.
Reconstruction: Any two parties jointly hold all three shares, so any two parties suffice to reconstruct $x$.
Beaver Multiplication: A preprocessed triple $\{a\}, \{b\}, \{c=ab\}$ is used for the secure online product of $\{x\}$ and $\{y\}$: the parties open $d = x-a$ and $e = y-b$, then compute $\{z\} = d\{b\} + e\{a\} + \{c\} + de$.
Share Conversion: Arithmetic-to-Boolean and vice versa via replicated-carry adder.
Fixed-Point and Nonlinearity: Encodes real values, truncates after multiplication, and uses Chebyshev polynomial approximations for activation functions.
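
The sharing and multiplication rules above can be checked with a small single-process simulation. This is a sketch under several assumptions: $\ell = 64$, a locally simulated dealer for the Beaver triple, party $i$ holding the pair $(x_i, x_{i+1})$, and no malicious-security checks. All function names are hypothetical, and the Beaver step operates on the plain additive shares rather than on the replicated pairs.

```python
import secrets

ELL = 64              # assumed ring bit-width
MOD = 1 << ELL

def share(x: int) -> list:
    """Additive shares with x1 + x2 + x3 = x (mod 2^ELL)."""
    x1 = secrets.randbelow(MOD)
    x2 = secrets.randbelow(MOD)
    return [x1, x2, (x - x1 - x2) % MOD]

def replicate(shares: list) -> list:
    """Party i holds the pair (x_i, x_{i+1}), indices taken mod 3."""
    return [(shares[i], shares[(i + 1) % 3]) for i in range(3)]

def reconstruct_from_two(i: int, pair_i: tuple, j: int, pair_j: tuple) -> int:
    """Two distinct parties together hold all three shares."""
    known = {i: pair_i[0], (i + 1) % 3: pair_i[1],
             j: pair_j[0], (j + 1) % 3: pair_j[1]}
    assert len(known) == 3, "parties must be distinct"
    return sum(known.values()) % MOD

def beaver_multiply(x_sh: list, y_sh: list) -> list:
    """Return shares of x*y using a triple (a, b, c = a*b)."""
    # Preprocessing: a trusted dealer is simulated locally (assumption).
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    a_sh, b_sh, c_sh = share(a), share(b), share(a * b % MOD)
    # Online: open d = x - a and e = y - b; both are masked by random values.
    d = sum(xs - as_ for xs, as_ in zip(x_sh, a_sh)) % MOD
    e = sum(ys - bs for ys, bs in zip(y_sh, b_sh)) % MOD
    # {z} = d{b} + e{a} + {c} + de, with the public term de added once.
    z_sh = [(d * b_sh[i] + e * a_sh[i] + c_sh[i]) % MOD for i in range(3)]
    z_sh[0] = (z_sh[0] + d * e) % MOD
    return z_sh

x, y = 1234, 5678
pairs = replicate(share(x))
x_back = reconstruct_from_two(0, pairs[0], 2, pairs[2])
product = sum(beaver_multiply(share(x), share(y))) % MOD
```

Summing the output shares gives $db + ea + ab + de = (d + a)(e + b) = xy$, which is why opening $d$ and $e$ is enough to multiply without revealing $x$ or $y$.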

B. Fully Homomorphic Encryption (FHE - CKKS SIMD)

Key Generation: $(pk, sk, \text{eval keys})$.
Encryption/Decryption: $Enc_{CKKS}(\vec{m}, pk) \rightarrow ct$, $Dec(ct, sk) = \vec{m} + \vec{\epsilon}$.
Homomorphic Operations: $ct_+ = Add(ct_0, ct_1)$, $ct_\odot = Mul(ct_0, ct_1)$, $Rot(ct, r)$. Bootstrapping refreshes noise.
Polynomial Evaluation: Approximates nonlinearities $f(x)$ as $P(x) = \sum a_i x^i$ via Chebyshev nodes.
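
To make the polynomial approximation step concrete, the following plaintext sketch interpolates a function at Chebyshev nodes and evaluates the result with the Clenshaw recurrence. It is an illustration only: the choice of the sigmoid, the interval $[-5, 5]$, and degree 7 are assumptions, and the coefficients are kept in the Chebyshev basis rather than the monomial form $\sum a_i x^i$ (both describe the same polynomial). In an FHE setting the same additions and multiplications would be carried out on ciphertexts.

```python
import math

def chebyshev_coefficients(f, degree: int, a: float, b: float) -> list:
    """Coefficients c_k of the interpolant sum_k c_k T_k(t) on [a, b]."""
    n = degree + 1
    # Chebyshev nodes on [-1, 1], then mapped to [a, b] to sample f.
    nodes = [math.cos(math.pi * (j + 0.5) / n) for j in range(n)]
    values = [f(0.5 * (b - a) * t + 0.5 * (b + a)) for t in nodes]
    coeffs = []
    for k in range(n):
        s = sum(values[j] * math.cos(math.pi * k * (j + 0.5) / n)
                for j in range(n))
        coeffs.append(2.0 * s / n)
    coeffs[0] *= 0.5  # the constant term carries a factor 1/2
    return coeffs

def chebyshev_eval(coeffs: list, x: float, a: float, b: float) -> float:
    """Evaluate the Chebyshev series at x using the Clenshaw recurrence."""
    t = (2.0 * x - a - b) / (b - a)  # map x from [a, b] to [-1, 1]
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

LO, HI, DEGREE = -5.0, 5.0, 7   # assumed interval and degree
coeffs = chebyshev_coefficients(sigmoid, DEGREE, LO, HI)
grid = [LO + (HI - LO) * i / 1000 for i in range(1001)]
max_err = max(abs(chebyshev_eval(coeffs, x, LO, HI) - sigmoid(x)) for x in grid)
```

The degree controls a trade-off: a higher degree lowers `max_err` but increases the multiplicative depth that the homomorphic evaluation must support.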

Core Algorithms (see Algorithms 1–4 in the source):

Distributed decryption and encryption for MPC mode.
Secure inference: iterative linear and activation layers, with secure fixed-point arithmetic.
FHE inference: batched matrix multiplication, bias addition, and polynomial activation approximation.
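
The fixed-point arithmetic behind the secure inference item can be sketched in the clear. The example below encodes reals into $\mathbb{Z}_{2^{64}}$ with 16 fractional bits (both assumed values), applies one linear layer with a single truncation per output, and evaluates a low-degree polynomial activation by Horner's rule. The cubic used here is the Taylor expansion of the sigmoid, chosen as a stand-in for whatever Chebyshev coefficients a real model would use; all names are hypothetical. In MPC mode each of these operations would run on secret shares instead of plain integers.

```python
ELL = 64                 # assumed ring bit-width
FRAC_BITS = 16           # assumed number of fractional bits
MOD = 1 << ELL
SCALE = 1 << FRAC_BITS

def encode(x: float) -> int:
    """Real number -> ring element with FRAC_BITS fractional bits."""
    return round(x * SCALE) % MOD

def to_signed(v: int) -> int:
    """Interpret a ring element as a two's-complement signed integer."""
    return v - MOD if v >= MOD // 2 else v

def decode(v: int) -> float:
    return to_signed(v) / SCALE

def truncate(v: int) -> int:
    """Drop FRAC_BITS bits to return from scale 2^(2f) to scale 2^f."""
    return (to_signed(v) >> FRAC_BITS) % MOD

def fx_mul(a: int, b: int) -> int:
    """Fixed-point product: ring multiplication followed by truncation."""
    return truncate((a * b) % MOD)

def dense(weights: list, bias: list, x: list) -> list:
    """Linear layer; products are accumulated, then truncated once."""
    out = []
    for row, b in zip(weights, bias):
        acc = sum(w * xi for w, xi in zip(row, x)) % MOD
        out.append((truncate(acc) + b) % MOD)
    return out

def poly_activation(x: int, coeffs: list) -> int:
    """Horner evaluation of sum_i coeffs[i] * x^i in fixed point."""
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = (fx_mul(acc, x) + c) % MOD
    return acc

# Illustrative values only; not taken from the MOZAIK models.
W = [[0.5, -0.25], [1.0, 0.75]]
B = [0.1, -0.2]
X = [0.8, -0.4]
ACT = [0.5, 0.25, 0.0, -1.0 / 48.0]   # sigmoid(x) ~ 1/2 + x/4 - x^3/48

hidden = dense([[encode(w) for w in row] for row in W],
               [encode(b) for b in B], [encode(v) for v in X])
output = [decode(poly_activation(h, [encode(c) for c in ACT])) for h in hidden]

# Floating-point reference for comparison.
ref_hidden = [sum(w * v for w, v in zip(row, X)) + b for row, b in zip(W, B)]
reference = [sum(c * h ** i for i, c in enumerate(ACT)) for h in ref_hidden]
```

Truncating once per output neuron rather than once per product keeps the rounding error small and, in a real protocol, reduces the number of truncation sub-protocols that must be executed.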

4. Deployment Modes, Security, and Trust

MOZAIK supports two operational modes with distinct trust assumptions:

MPC Mode: Three-party honest-majority deployment with malicious security against up to $t<n/2$ corrupted parties (that is, $t=1$ for $n=3$). Users select their MPC servers and manage key-share distribution. An input-independent offline phase (triple generation) is followed by an input-dependent online phase.
FHE Mode: Single semi-honest server receives ciphertext and evaluation keys, no inter-server traffic, and processes queries independently. Defenses against malformed ciphertexts and side-channel attacks are achievable with supplementary ZKPs.

Key distinctions: MPC offers lower latency and higher throughput for realistic batch sizes. FHE requires substantial key material and a non-trivial load time, but scales more simply in cloud settings because it has no inter-server coordination overhead.

5. Implementation Details and Reproducibility

The proof-of-concept system integrates the following software stack:

Component	Technology Stack	Functionality
Microservices	Go, Kubernetes, Traefik, Keycloak, Kafka, ClickHouse	Ingestion, authentication, streaming, storage
MPC Engine	MP-SPDZ, MAESTRO extension (GF(2^128)), AES-GCM	Secure computation, Boolean MPC, share conversion
FHE Engine	OpenFHE (CKKS)	Batched FHE inference server
Frontend	React, WebCrypto API	Key share creation, browser decryption

Open-source repositories under the MOZAIK-SBO GitHub organization include: obelisk-api (API definitions), mpc-party (protocols and scripts), fhe-engine (server code), heartbeat-usecase (simulated data), and evaluation-bench (benchmarks, plots, results).

Reproduction instructions:

Clone all repositories, set up Kubernetes cluster.
Deploy all Obelisk HFS microservices.
Launch MPC Docker containers running MP-SPDZ/MAESTRO.
Deploy FHE engine with the necessary keys.
Configure user accounts and deploy frontend.
Simulate live IoT sensor data ingestion.
Execute analytic workflows and inspect encrypted logs/results.

6. Performance Characteristics and Trade-Offs

Empirical benchmarks compare MPC, FHE, and conventional plaintext inference engines:

For batch size 1, MPC (semi-honest) achieves ~0.082 s latency, MPC (malicious) ~0.153 s, FHE ~40 s.
Batch size 10,000: MPC (malicious) requires ~321 s, versus ~0.005 s for plaintext inference (roughly 64,000× faster, nearly five orders of magnitude).
FHE demonstrates stable inference latency up to ciphertext capacity, but individual batches still require ~179 s.
End-to-end MPC satisfies latency targets for batch ≤240 (within 10 s constraint); FHE fails this target for moderate batch sizes.
Streaming ingest rates (batch 64): ~13.84 samples/sec, supporting 138 concurrent streams within latency bounds.
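
The ratios implied by these figures can be recomputed directly. The snippet below uses only the numbers quoted above; the reading of the 138-stream figure as throughput multiplied by the 10 s latency bound (one sample per stream per window) is an assumption made here for illustration, not a statement from the source.

```python
import math

# Latencies in seconds, as quoted in the benchmark summary.
mpc_malicious_b1 = 0.153
fhe_b1 = 40.0
mpc_malicious_b10k = 321.0
plaintext_b10k = 0.005

# FHE vs. malicious-secure MPC at batch size 1.
fhe_over_mpc = fhe_b1 / mpc_malicious_b1

# Plaintext vs. malicious-secure MPC at batch size 10,000.
plain_speedup = mpc_malicious_b10k / plaintext_b10k
orders_of_magnitude = math.log10(plain_speedup)

# Streaming: samples processed within one latency window.
throughput_samples_per_s = 13.84
latency_bound_s = 10.0          # the end-to-end constraint quoted above
samples_per_window = throughput_samples_per_s * latency_bound_s

print(f"FHE / MPC at batch 1:        {fhe_over_mpc:.0f}x")
print(f"MPC / plaintext at batch 10k: {plain_speedup:.0f}x "
      f"({orders_of_magnitude:.1f} orders of magnitude)")
print(f"Samples per 10 s window:      {samples_per_window:.1f}")
```

Under that assumption, a sustained rate of about 13.84 samples per second corresponds to roughly 138 samples per 10 s window, which matches the reported number of concurrent streams.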

Trade-Offs:

MPC (malicious-secure) outpaces FHE by factors of 20–260× for practical analytics.
FHE requires substantial key material, slower CPU-bound execution, but enables simpler scaling.
MPC's online cost grows linearly with the number of multiplication gates, and its throughput is bounded by network bandwidth.
Hybrid architectures—MPC engines also running FHE for low-priority tasks—may present viable extensions for specific workloads.

7. Open Source Distribution and Reproducibility

All core MOZAIK components are released open-source. The GitHub organization contains:

API and service definitions.
MPC and FHE compute engines.
Example data pipelines and user-facing tools.
Evaluation scripts for publishing raw benchmarks, figures, and reproducibility instructions.

Users and researchers can replicate the platform by deploying the full stack on standard cloud infrastructure, running both streaming and ad hoc analytic pipelines, and validating encrypted protocol correctness by comparing outputs across different cryptographic backends (Kenhove et al., 5 Jan 2026).

In summary, MOZAIK constitutes a rigorously engineered IoT-to-cloud architecture that enforces end-to-end cryptographic privacy, supports secure distributed analytics, and provides reproducible open-source tools for high-fidelity evaluation of privacy-preserving machine learning and data pipeline protocols.


References

MOZAIK: A Privacy-Preserving Analytics Platform for IoT Data Using MPC and FHE (2026)