- The paper introduces WaterAdmin, a bi-level AI framework that combines LLM-driven context abstraction with ML optimization for dynamic water distribution control.
- It leverages hierarchical prompting and simulation-in-the-loop training to robustly forecast demand and set actionable operational targets.
- Empirical results on the EPANET NET3 benchmark show over 36% pressure MSE reduction and 71.2% energy savings relative to an ML-only controller, supporting its practical applicability.
Framework Overview and System Architecture
The paper introduces WaterAdmin, a bi-level AI-agent-based framework designed for community water distribution network (WDN) operations. WaterAdmin uses LLMs for high-level abstraction of heterogeneous community and environmental context and delegates real-time operational control to an ML optimizer. The framework targets the limitations of traditional optimization and LLM-centric solutions in accommodating dynamic, unstructured contextual factors that influence water demand, such as human activity, weather, and regional instructions.
At the upper level, LLM agents, guided by chain-of-thought (CoT) and in-context prompting, abstract unstructured community context (event schedules, weather variations, and human operational directives) into structured operational targets. These targets consist of multi-horizon water demand forecasts, zone-specific pressure ranges, and nominal tank levels. Continuous quantities such as forecasted demand are discretized into levels, so the LLMs only need to make robust qualitative judgments, which sidesteps their weakness in generating precise numerical values.
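The discretization step can be illustrated with a minimal sketch. The level names, the number of levels, and the demand range below are assumptions chosen for illustration; the summary only states that demand levels are discretized and log-scaled.

```python
import math

# Hypothetical label set; the paper's actual levels are not given here.
LEVELS = ["very_low", "low", "medium", "high", "very_high"]


def demand_to_level(demand, d_min=1.0, d_max=10000.0, levels=LEVELS):
    """Map a positive demand value to one of several log-spaced levels."""
    if demand <= 0:
        raise ValueError("demand must be positive for log scaling")
    # Clamp to the assumed range so outliers land in the end bins.
    clamped = min(max(demand, d_min), d_max)
    span = math.log(d_max) - math.log(d_min)
    frac = (math.log(clamped) - math.log(d_min)) / span  # position in [0, 1]
    index = min(int(frac * len(levels)), len(levels) - 1)
    return levels[index]


def level_to_demand(level, d_min=1.0, d_max=10000.0, levels=LEVELS):
    """Return the geometric midpoint of a level's bin as a representative value."""
    i = levels.index(level)
    span = math.log(d_max) - math.log(d_min)
    lo = math.log(d_min) + span * i / len(levels)
    hi = math.log(d_min) + span * (i + 1) / len(levels)
    return math.exp((lo + hi) / 2)
```

With these assumed bounds, `demand_to_level(100.0)` falls in the middle bin, and `level_to_demand` gives the lower-level controller a numeric stand-in for a label the LLM produced.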
At the lower level, an ML policy model receives both these high-level targets and the current WDN state (pressures, flows, tank levels) and directly outputs pump speeds and valve positions. It is trained via zeroth-order optimization over simulation episodes in EPANET. Because the simulator is not differentiable, this approach lets the optimizer learn from high-fidelity simulated trajectories using only evaluated episode costs, with penalties for violations of operational constraints.
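To make the idea of zeroth-order training concrete, here is a minimal sketch using a two-point random-direction gradient estimate. This is one common zeroth-order scheme, not necessarily the estimator the authors use. The function `simulate_episode` is a hypothetical stand-in for an EPANET rollout, and the step size, smoothing radius, and number of directions are arbitrary choices.

```python
import random


def simulate_episode(params):
    """Toy black-box cost standing in for a simulator rollout (hypothetical).

    A real setup would run the hydraulic simulator with the policy defined by
    `params` and return energy cost plus constraint-violation penalties.
    """
    target = [0.6, 0.3, 0.8]  # arbitrary optimum for this toy problem
    return sum((p - t) ** 2 for p, t in zip(params, target))


def zeroth_order_step(params, cost_fn, rng, sigma=0.05, lr=0.1, n_dirs=8):
    """One parameter update that uses only cost evaluations, no gradients."""
    grad = [0.0] * len(params)
    for _ in range(n_dirs):
        # Random Gaussian direction in parameter space.
        u = [rng.gauss(0.0, 1.0) for _ in params]
        plus = cost_fn([p + sigma * d for p, d in zip(params, u)])
        minus = cost_fn([p - sigma * d for p, d in zip(params, u)])
        # Finite difference along u, averaged over directions.
        scale = (plus - minus) / (2.0 * sigma * n_dirs)
        grad = [g + scale * d for g, d in zip(grad, u)]
    return [p - lr * g for p, g in zip(params, grad)]


def train(n_steps=200, seed=0):
    rng = random.Random(seed)
    params = [0.0, 0.0, 0.0]
    for _ in range(n_steps):
        params = zeroth_order_step(params, simulate_episode, rng)
    return params
```

Each update costs `2 * n_dirs` simulator episodes, which is the main trade-off of this approach: no differentiability is required, but the number of rollouts grows with the number of sampled directions.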
Figure 1: WaterAdmin system architecture highlighting the bi-level decomposition into LLM-powered contextual abstraction and ML-driven operational optimization.
The underlying control objective is multi-faceted: satisfy stochastic spatio-temporal demand at all nodes, maintain nodal pressures within prescribed operational ranges, and minimize total energy consumption (with allowance for time-of-use pricing). The discrete network structure is modeled as a directed graph G=(V,E); controls include variable-speed pump operations and actuator-driven valve positions.
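One way to fold these goals into a single scalar is an energy bill under time-of-use prices plus a soft penalty for pressures outside the allowed band. The sketch below assumes that form; the function name, argument layout, and penalty weight are illustrative and not taken from the paper.

```python
def episode_cost(pump_energy_kwh, price_per_kwh, node_pressures,
                 p_min, p_max, penalty_weight=100.0):
    """Time-of-use energy cost plus a penalty for out-of-range pressures.

    pump_energy_kwh: energy used in each time step
    price_per_kwh:   tariff in each time step (same length)
    node_pressures:  for each time step, a list of nodal pressures
    """
    if len(pump_energy_kwh) != len(price_per_kwh):
        raise ValueError("energy and price series must have equal length")
    energy_cost = sum(e * c for e, c in zip(pump_energy_kwh, price_per_kwh))
    violation = 0.0
    for step in node_pressures:
        for p in step:
            # Distance outside [p_min, p_max]; zero when inside the band.
            violation += max(p_min - p, 0.0) + max(p - p_max, 0.0)
    return energy_cost + penalty_weight * violation
```

A larger `penalty_weight` pushes the controller toward keeping pressures in range at the expense of energy, so in practice it would need tuning against the utility's priorities.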
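WaterAdmin's core methodological innovations are the bi-level decomposition into LLM-based context abstraction and ML-based control, the discretized operational targets that link the two levels, and simulation-in-the-loop zeroth-order training of the control policy.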
Prompt Engineering and LLM Integration
To maximize the informativeness and generalizability of the operational targets produced by the LLMs, the authors design hierarchical prompting templates that encode discrete event semantics, building typology, and temporal information. Log-scaled discretized demand levels and semantic rules allow the LLMs to output usage rankings and pressure targets based on historical precedent, each with an explicit justification, which is a critical element for deployment in eventful community settings. The LangChain framework manages prompt persistence and retrieval.
Figure 3: Template for generation of event descriptions, enabling systematic contextualization for LLM agents.
Figure 4: Water demand forecasting prompt structure provided to the LLM, designed to enforce the classification rules and a structured output.
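A layered prompt of this kind can be sketched with plain string formatting. The example below does not reproduce the authors' templates or use LangChain; the rule wording, field names, level labels, and JSON schema are all assumptions made for illustration.

```python
import json

# Hypothetical label set for the discretized demand levels.
ALLOWED_LEVELS = ["very_low", "low", "medium", "high", "very_high"]


def build_demand_prompt(event, building_type, start_hour, horizon_hours, examples):
    """Assemble a layered prompt: rules, in-context examples, then the query."""
    rules = (
        "You forecast water demand for one zone.\n"
        f"Answer with one level per hour from: {', '.join(ALLOWED_LEVELS)}.\n"
        "Reason step by step, then give JSON with keys "
        "'levels' and 'justification'."
    )
    shots = "\n".join(
        f"Context: {ex['context']}\nAnswer: {json.dumps(ex['answer'])}"
        for ex in examples
    )
    query = (
        f"Event: {event}\n"
        f"Building type: {building_type}\n"
        f"Forecast window: {horizon_hours} hours starting at {start_hour:02d}:00"
    )
    return "\n\n".join([rules, shots, query])


def parse_forecast(reply, horizon_hours):
    """Validate an LLM reply against the output rules; raise on any deviation."""
    data = json.loads(reply)
    levels = data["levels"]
    if len(levels) != horizon_hours:
        raise ValueError("wrong number of levels")
    bad = [lv for lv in levels if lv not in ALLOWED_LEVELS]
    if bad:
        raise ValueError(f"unknown levels: {bad}")
    return levels, data.get("justification", "")
```

Validating the reply in code, rather than trusting the prompt alone, is what lets the lower-level controller rely on the targets being well-formed.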
Evaluations are conducted on the EPANET NET3 benchmark (97 nodes, variable source/tank topology) with real demand data mapped to zoning and category-specific nodes. Three controller paradigms are benchmarked: classic simulator rule-based logic, pure ML-based optimization without LLM abstraction, and the full WaterAdmin stack (with varying prediction horizons).
Key results:
- Pressure and reliability: WaterAdmin_6 (6-hour LLM horizon) reduces normalized pressure MSE by over 36% compared to ML-only, and drastically lowers pressure violations (Max Viol. 7.57%, Min Viol. 14.95%) against both baselines.
- Energy efficiency: Energy consumption per hour is reduced by 71.2% (WaterAdmin_6 vs. ML-only) and is far below that of rule-based control (0.85 MWh/h for WaterAdmin_6 vs. 30.01 MWh/h).
- Prediction window sensitivity: Extending the LLM-informed target window yields monotonic improvements in stabilization and efficiency. Shorter horizons degrade performance due to lower informational lookahead.
- LLM ablation: Substituting state-of-the-art LLMs (ChatGPT-4o, Gemini-3, DeepSeek-V3) reveals negligible performance differences, reflecting the bottleneck introduced by the discretized abstraction and the effectiveness of in-context prompt engineering, rather than raw LLM scale.
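The metrics in these results can be computed along the following lines. The summary does not give exact definitions, so the sketch assumes that normalized MSE means squared error relative to a target pressure and that violation rates are the fraction of samples above or below the allowed band.

```python
def normalized_mse(pressures, target):
    """Mean squared error of pressures relative to a target (assumed definition)."""
    return sum(((p - target) / target) ** 2 for p in pressures) / len(pressures)


def violation_rates(pressures, p_min, p_max):
    """Fraction of samples above p_max and below p_min, in that order."""
    n = len(pressures)
    max_viol = sum(1 for p in pressures if p > p_max) / n
    min_viol = sum(1 for p in pressures if p < p_min) / n
    return max_viol, min_viol


def percent_reduction(baseline, value):
    """Relative improvement of `value` over `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline
```

For example, `percent_reduction(2.0, 0.5)` gives 75.0, which is the kind of relative comparison reported between WaterAdmin and the ML-only controller.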


Figure 5: (a) 24-hour pressure trace for Node 113 comparing control algorithms. (b) Pressure distribution across methods. (c) Energy consumption distributions; longer LLM prediction windows induce lower deviation and energy use.
Theoretical and Practical Implications
Formally decoupling context summarization from operational optimization with AI agents enables robust handling of the highly stochastic, nonstationary environments characteristic of future smart cities. This architecture supports integration of auxiliary data modalities (weather, traffic, smart grid signals) and facilitates explainable, auditable adaptation to emergent community events, advantages that are difficult to obtain with black-box ML or static optimization. By embedding simulation-in-the-loop learning, WaterAdmin is positioned for transfer to digital twin-based precinct management.
From a practical deployment perspective, the strong numerical results, with substantially lower energy use and fewer pressure violations than both baselines, suggest applicability for urban utilities seeking resilient, adaptive infrastructure management under increasing demand variability and decarbonization mandates.
Looking forward, research directions include: tighter integration with fully continuous demand prediction, multi-agent hierarchical coordination (interacting AIs at district and precinct levels), and enhanced robustness to adversarial or anomalous context signals.
Conclusion
WaterAdmin establishes an extensible, bi-level AI-agent framework for context-driven optimization of community water distribution networks, combining the contextual reasoning capacity of LLMs with the operational resilience of ML optimizers. Simulations in EPANET show substantial improvements in reliability and energy efficiency relative to both rule-based and ML-only baselines, with consistent results across LLM backbones and further gains at longer prediction horizons. This design paradigm opens new pathways for robust, explainable, and adaptable management of complex infrastructure systems in dynamic, heterogeneous community environments.