FHE-Agent: Automated FHE Configuration
- FHE-Agent is an LLM-driven framework that automates CKKS-based Fully Homomorphic Encryption configuration for secure, outsourced machine learning inference.
- It integrates a deterministic FHE toolchain with a hierarchical agent system to optimize global regimes and per-layer parameters while ensuring strict 128-bit security.
- Empirical results demonstrate enhanced precision and reduced latency, enabling feasible deployment for challenging models like AlexNet where traditional methods fail.
Fully Homomorphic Encryption (FHE) provides a cryptographic mechanism for computation on encrypted data without decryption, enabling secure outsourced inference in privacy-preserving machine learning as a service (MLaaS). However, configuring practical FHE inference using state-of-the-art schemes (notably CKKS) presents steep barriers due to the high-dimensional, coupled parameter space and the need for expert-level cryptographic reasoning. "FHE-Agent" denotes a class of LLM-driven agentic frameworks that automate this expert configuration process. By coupling deterministic FHE toolchains with a hierarchy of LLM agents, FHE-Agent systems systematically search and optimize FHE configuration, eliminating the guess-and-check loop and enabling practical deployment for deep neural networks, even where heuristic- and prompt-based approaches fail (Xu et al., 23 Nov 2025).
1. Motivation and Problem Landscape
Deployment of CKKS-based FHE, central to approximate arithmetic over encrypted vectors, is stymied by complex configuration requirements. Each setup must jointly determine the ring dimension, modulus chain, packing layout, scale schedule, and bootstrapping placement to balance depth support, security (typically ≥128 bits), throughput, and runtime. Existing compilers (e.g., Orion, CHET, FHeliPe) provide sophisticated static signals, but selecting feasible, efficient parameterizations has remained a manual process highly sensitive to network depth, layer type, and packing strategy. One-shot LLM or heuristic searches routinely over-provision resources (resulting in excessive latency) or fail to yield feasible configurations, especially for nontrivial models such as AlexNet. FHE-Agent frameworks address this by decomposing configuration into globally and locally guided repair workflows, pruning infeasible regimes early, and reserving encrypted computation for credible candidates (Xu et al., 23 Nov 2025).
2. Overall Architecture and Agents
FHE-Agent consists of a deterministic, backend-agnostic FHE tool suite and a multi-agent LLM controller.
- Tool Suite Components:
- StaticAnalyzer: Evaluates multiplicative depth versus modulus chain length, ensuring compliance with security level constraints (derived from the ring dimension and modulus chain).
- LayerProfiler: Conducts slot-accurate, cleartext simulations (CLEAR_ONLY) to collect, for each layer, its shape, performance primitives, and numeric diagnostics (including approximation error and noise margins).
- BootstrapScheduler: Annotates where bootstrapping is required and marks "depth-critical" layers incapable of tolerating added multiplicative depth without blowing noise budgets.
- CostModel: Models layerwise runtime with a parametric formula whose coefficients are initialized and calibrated by microbenchmarks and encrypted test runs.
- EncryptedEvaluator: Supports four evaluation fidelities: STATIC_ONLY, CLEAR_ONLY, FHE_LIGHT, and FHE_FULL.
- LLM Controller (Agent Hierarchy):
- InitAgent/RegimeAgent: Given the plaintext computation graph and global constraints, proposes a family of global regimes by varying the ring dimension, modulus chain, packing, and scale schedule.
- GlobalTradeoffAgent: Suggests global patches (chain tailoring, bootstrap insertion, scale schedule adjustments) to a selected baseline.
- LayerwiseAgent: Identifies bottleneck layers using a per-layer score function (whose inputs include slot utilization and a low-noise flag) and proposes local packing or degree adjustments.
- PatchGateAgent: Screens candidate patches under STATIC_ONLY and CLEAR_ONLY, admitting only those passing security and precision thresholds for further encrypted evaluation.
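The gating step can be illustrated with a minimal sketch. The four fidelity names come from the text; everything else here (the `CheapReport` fields, the `admit` function, and the 20-bit precision threshold) is a hypothetical stand-in, not the framework's actual interface.

```python
from dataclasses import dataclass
from enum import IntEnum


class Fidelity(IntEnum):
    """Evaluation fidelities named in the text, ordered from cheapest to most expensive."""
    STATIC_ONLY = 0
    CLEAR_ONLY = 1
    FHE_LIGHT = 2
    FHE_FULL = 3


@dataclass
class CheapReport:
    """Hypothetical summary of the two cheap checks for one candidate patch."""
    depth_ok: bool          # STATIC_ONLY: depth fits the modulus chain
    sec_bits: float         # STATIC_ONLY: estimated security level
    precision_bits: float   # CLEAR_ONLY: precision seen in cleartext simulation


def admit(report: CheapReport,
          min_sec_bits: float = 128.0,
          min_precision_bits: float = 20.0) -> bool:
    """Admit a patch to encrypted evaluation only if every cheap check passes.

    The precision threshold is an assumed value chosen for illustration.
    """
    return (report.depth_ok
            and report.sec_bits >= min_sec_bits
            and report.precision_bits >= min_precision_bits)


def next_fidelity(report: CheapReport) -> Fidelity:
    """Return the highest fidelity the candidate has earned so far."""
    return Fidelity.FHE_LIGHT if admit(report) else Fidelity.CLEAR_ONLY
```

Ordering the fidelities as an `IntEnum` makes "cheaper than" comparisons explicit, which is convenient when a controller must decide how far a candidate may advance.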
3. Hierarchical Search and Configuration Decomposition
FHE-Agent models configuration as a two-level object:
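- Global Regime: the network-wide choices of ring dimension, modulus chain, packing layout, and scale schedule.
- Per-Layer Local Parameters: layer-specific choices such as packing and approximation degree.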
Ring dimension is governed by the required total multiplicative depth: a deeper circuit needs a longer modulus chain, and a larger total modulus in turn requires a larger ring dimension to hold the same security level. Modulus chain length is set to accommodate the largest scale plus noise headroom. Packing layouts mediate the trade-off between parallelism and the cost of rotations and key switching.
The search proceeds from global exploration (limiting candidate regimes via STATIC_ONLY/CLEAR_ONLY pruning) to localized, layerwise patching (a bounded number of patches per iteration, respecting bootstrapping and depth masks).
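One way to picture the two-level object is as a pair of immutable records, with local patches producing a new configuration and leaving the baseline untouched. This is a sketch under assumptions: all class and field names (`GlobalRegime`, `LayerParams`, `log_n`, `chain_bits`, and so on) are illustrative and do not come from the paper or from Orion.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class GlobalRegime:
    """Network-wide settings (hypothetical field names)."""
    log_n: int                    # log2 of the ring dimension
    chain_bits: Tuple[int, ...]   # bit size of each modulus in the chain
    packing: str                  # global packing layout label
    scale_bits: int               # log2 of the CKKS scale


@dataclass(frozen=True)
class LayerParams:
    """Per-layer settings (hypothetical field names)."""
    packing: str
    activation_degree: int


@dataclass(frozen=True)
class FheConfig:
    regime: GlobalRegime
    layers: Dict[str, LayerParams]


def patch_layer(cfg: FheConfig, name: str, **changes) -> FheConfig:
    """Return a copy of cfg with one layer's parameters changed."""
    new_layers = dict(cfg.layers)
    new_layers[name] = replace(cfg.layers[name], **changes)
    return replace(cfg, layers=new_layers)


base = FheConfig(
    regime=GlobalRegime(log_n=15, chain_bits=(60, 40, 40, 40), packing="row", scale_bits=40),
    layers={"conv1": LayerParams("row", 7), "fc1": LayerParams("diag", 3)},
)
patched = patch_layer(base, "conv1", activation_degree=5)
```

Keeping the baseline immutable means a rejected patch costs nothing to discard, which suits a search that proposes many candidates and admits few.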
4. Multi-Fidelity Optimization Workflow
FHE-Agent introduces a structured three-phase search:
- Phase A (Structure Search):
- Evaluate candidate regimes using STATIC_ONLY and CLEAR_ONLY; prune those failing depth, security, or MAE/precision criteria.
- Rank survivors via proxy latency (CostModel); retain 1–2 as baselines.
- Phase B (Calibration):
- Perform FHE_LIGHT runs on validation subsets. Use the results to regress the CostModel coefficients.
- Select the best-performing regime.
- Phase C (Admitted Refinement):
- Iteratively generate and evaluate agent-suggested patches (GlobalTradeoffAgent → LayerwiseAgent), subject to gating by STATIC_ONLY and CLEAR_ONLY.
- Permit only the top-ranked patch to proceed to FHE_LIGHT per iteration, keeping encrypted evaluation within a small, fixed budget of light runs.
- Commit to FHE_FULL only for final verification.
This multi-fidelity pruning ensures that almost all infeasible or suboptimal designs are eliminated at minimal cost, reserving expensive encrypted inference for credible candidates.
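The three phases can be sketched as a small search loop. This is an illustrative outline only: the callables (`static_ok`, `clear_ok`, `proxy_latency`, `fhe_light`, `propose_patches`) are hypothetical stand-ins for the tool suite and agents, and charging the light-run budget to Phase C alone is an assumption.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

C = TypeVar("C")  # a candidate configuration of any type


def multi_fidelity_search(
    candidates: List[C],
    static_ok: Callable[[C], bool],        # STATIC_ONLY check
    clear_ok: Callable[[C], bool],         # CLEAR_ONLY check
    proxy_latency: Callable[[C], float],   # cost-model estimate
    fhe_light: Callable[[C], float],       # measured latency of a light encrypted run
    propose_patches: Callable[[C], List[C]],
    light_budget: int,
    keep: int = 2,
) -> Optional[Tuple[C, float]]:
    # Phase A: prune with the cheap checks, then rank survivors by proxy latency.
    survivors = [c for c in candidates if static_ok(c) and clear_ok(c)]
    if not survivors:
        return None
    baselines = sorted(survivors, key=proxy_latency)[:keep]

    # Phase B: measure each baseline with a light encrypted run and keep the best.
    measured = [(fhe_light(c), c) for c in baselines]
    best_time, best = min(measured, key=lambda pair: pair[0])

    # Phase C: gated refinement, one light encrypted run per iteration.
    for _ in range(light_budget):
        admitted = [p for p in propose_patches(best) if static_ok(p) and clear_ok(p)]
        if not admitted:
            break
        top = min(admitted, key=proxy_latency)
        t = fhe_light(top)
        if t < best_time:
            best, best_time = top, t

    # Final FHE_FULL verification is left to the caller.
    return best, best_time
```

The number of encrypted runs is bounded by `keep + light_budget` regardless of how many candidates or patches are proposed, which is the property the workflow relies on.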
5. Integration with Orion and CKKS Toolchains
FHE-Agent operates as an orchestration framework above the Orion compiler and Lattigo CKKS libraries. Its tool suite invokes Orion’s internal passes and Lattigo’s backend primitives for static depth and security validation, slot-accurate cleartext simulation, and full encrypted trials. The LLM controller generates only high-level, semantically validated configuration "patches," which are rendered as modifications against Orion’s JSON configuration files. The toolchain's shared semantic model (Orion IR and Lattigo's CKKS) eliminates simulation/execution discrepancies.
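Rendering a patch against a JSON configuration can be sketched as a whitelisted update. The key names below (`log_n`, `chain_bits`, `packing`, `scale_bits`) are hypothetical and are not Orion's actual schema; the point is only that a patch touching an unknown key is rejected before any file is rewritten.

```python
import json
from typing import Any, Dict

# Hypothetical set of keys a patch may modify; not Orion's real schema.
ALLOWED_KEYS = {"log_n", "chain_bits", "packing", "scale_bits"}


def apply_patch(config_text: str, patch: Dict[str, Any]) -> str:
    """Apply a high-level patch to a JSON config and return the new JSON text."""
    unknown = set(patch) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"patch touches unsupported keys: {sorted(unknown)}")
    config = json.loads(config_text)
    config.update(patch)
    return json.dumps(config, indent=2, sort_keys=True)


original = '{"log_n": 15, "chain_bits": [60, 40, 40, 40], "packing": "row", "scale_bits": 40}'
shortened = apply_patch(original, {"chain_bits": [60, 40, 40]})
```

Restricting the controller to a fixed set of keys keeps free-form LLM output away from the configuration file itself.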
6. Empirical Results and Comparative Performance
Under a strict 128-bit security threshold, FHE-Agent demonstrates substantial improvements over both one-shot LLM and heuristic-based approaches. The results below highlight key metrics across standard benchmarks:
| Model | Precision in bits (Naive LLM / Agent) | MAE (Naive LLM / Agent) | FHE time in seconds (Naive LLM / Agent) | Security (bits) |
|---|---|---|---|---|
| MLP | 17.37 / 24.82 | 5.9e-6 / ≈0 | 1.31 / 0.91 | ≥128 |
| LeNet | 23.07 / 22.42 | 1.13e-7 / 1.79e-7 | 9.08 / 3.19 | ≥128 |
| LoLA | 19.54 / 21.16 | 1.31e-6 / ≈0 | 2.10 / 0.79 | ≥128 |
| AlexNet | no feasible / 21.81 | — / ≈0 | — / 262.5 | ≥128 |
Relative to the naive LLM baseline in the table, FHE-Agent improves precision on MLP and LoLA (by about 7.5 and 1.6 bits respectively, while LeNet is about 0.65 bits lower) and reduces inference latency by roughly 1.4–2.8×. Crucially, it automatically discovers a feasible 128-bit secure configuration for AlexNet, a model where baseline one-shot and heuristic methods fail entirely (Xu et al., 23 Nov 2025).
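The precision differences and latency ratios implied by the table can be recomputed directly. The values below are copied from the table rows; AlexNet is omitted because it has no naive baseline to compare against. The helper names are illustrative.

```python
# (naive, agent) pairs copied from the results table.
precision_bits = {
    "MLP": (17.37, 24.82),
    "LeNet": (23.07, 22.42),
    "LoLA": (19.54, 21.16),
}
fhe_seconds = {
    "MLP": (1.31, 0.91),
    "LeNet": (9.08, 3.19),
    "LoLA": (2.10, 0.79),
}


def precision_gain(model: str) -> float:
    """Agent precision minus naive precision, in bits."""
    naive, agent = precision_bits[model]
    return agent - naive


def speedup(model: str) -> float:
    """Naive FHE time divided by agent FHE time."""
    naive, agent = fhe_seconds[model]
    return naive / agent


for model in precision_bits:
    print(f"{model}: {precision_gain(model):+.2f} bits, {speedup(model):.2f}x faster")
```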
7. Design Principles Enabling Efficient High-Dimensional Search
The success of FHE-Agent is attributable to the following factors:
- Hierarchical Decomposition: Global regime selection and local repair prevent combinatorial explosion, restricting search to a bounded set of global and local candidates per iteration.
- Multi-Fidelity Pruning: Early elimination via STATIC_ONLY and CLEAR_ONLY ensures near-zero cost discarding of infeasible configurations.
- Discrete, Validated Patch Directions: Only semantically approved parameter changes (e.g., chain shortening, packing modification, activation degree reduction) are permitted, minimizing the risk of invalid states.
- Enforced Constraints: All modifications are gated by static analysis and bootstrapping masks, ensuring depth and security invariants are never violated.
In aggregate, these design features enable FHE-Agent to navigate the nonlinear, tightly coupled CKKS configuration space automatically, reliably emulating (and frequently surpassing) human expert configurations while efficiently using the expensive encrypted evaluation budget (Xu et al., 23 Nov 2025).