TeraAgent: Extreme-Scale Simulation Engine
- TeraAgent is a distributed agent-based simulation engine designed to model extreme-scale systems by simulating up to 501.51 billion agents.
- It employs spatial domain decomposition, nonblocking MPI communication, and custom serialization, achieving a median 110× speedup (up to 296×) in agent serialization.
- The engine supports modular extensions and diverse applications in biomedicine, epidemiology, and urban simulations, while integrating efficiently with visualization pipelines.
TeraAgent is a modern, distributed agent-based simulation engine purpose-built to support extreme-scale simulations, enabling the modeling and analysis of complex systems that require simulating up to half a trillion agents on large-scale computing infrastructures. The platform overcomes key architectural and performance bottlenecks present in traditional shared-memory solutions by introducing novel mechanisms for efficient inter-node communication, agent migration, and memory management. TeraAgent’s design supports modular extensibility, high interoperability, and flexible utilization of computational resources, positioning it as a key tool for diverse applications in biomedicine, epidemiology, neuroscience, and policy modeling.
1. Rationale and Evolution
TeraAgent addresses the limitations inherent to established agent-based modeling platforms such as BioDynaMo, which previously scaled to billions of agents using shared-memory parallelism (OpenMP) but could not efficiently distribute simulation workloads across multiple nodes. The shift to a distributed, scale-out architecture in TeraAgent enables simulations to transcend single-server computational and memory constraints, supporting monolithic models on clusters, supercomputers, or hybrid environments. The resulting engine allows for unprecedented model sizes (up to 501.51 billion agents across 84,096 CPU cores, an 84× improvement over legacy approaches) and improved time-to-solution (Breitwieser et al., 28 Sep 2025, Breitwieser, 13 Mar 2025).
2. Distributed Architecture and Execution Model
2.1 Spatial Domain Decomposition
TeraAgent partitions the simulation space into disjoint sub-volumes (partitioning boxes) via a user-configurable grid. Each partition is assigned to an MPI rank, which is "authoritative" over agents inside its sub-volume. Boundary regions (aura/halo) are constructed to enable local agent interactions spanning partition borders.
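A minimal sketch of how a position could be mapped to a partitioning box and an owning rank. It assumes a uniform grid with cubic boxes, one box per rank, and row-major rank numbering; the names `box_index` and `owner_rank` are hypothetical and the mapping TeraAgent actually uses may differ.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Dims = Tuple[int, int, int]


def box_index(pos: Vec3, origin: Vec3, box_len: float, dims: Dims) -> Dims:
    """Return the (ix, iy, iz) index of the partitioning box containing pos."""
    idx = []
    for p, o, n in zip(pos, origin, dims):
        i = int((p - o) // box_len)
        # Clamp so positions on or beyond the domain edge fall into an edge box.
        idx.append(min(max(i, 0), n - 1))
    return (idx[0], idx[1], idx[2])


def owner_rank(pos: Vec3, origin: Vec3, box_len: float, dims: Dims) -> int:
    """Assumed mapping: one box per rank, numbered in row-major order."""
    ix, iy, iz = box_index(pos, origin, box_len, dims)
    nx, ny, _ = dims
    return ix + nx * (iy + ny * iz)
```

For a 2×2×2 grid of unit boxes, `owner_rank((1.5, 0.5, 0.5), (0, 0, 0), 1.0, (2, 2, 2))` returns 1, so that rank would be authoritative for the agent.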
2.2 Aura Updates and Agent Migration
To maintain neighborhood consistency for agents near partition boundaries, TeraAgent implements aura updates: at each simulation step, copies of boundary agents ("ghosts") are exchanged between adjacent ranks. If an agent moves between partitions, the system automatically detects this change and transfers the agent to the correct rank, modifying simulation state to preserve global coherency.
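The per-step bookkeeping described above can be sketched as a classification pass over local agents. This is an illustration under stated assumptions, not the engine's implementation: boxes are axis-aligned, the aura is a fixed-width shell around each box, and `Box`, `classify`, and `aura_width` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    lo: Vec3
    hi: Vec3

    def contains(self, p: Vec3) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))

    def in_aura_of(self, p: Vec3, width: float) -> bool:
        """True if p is outside this box but inside the box grown by width."""
        if self.contains(p):
            return False
        return all(l - width <= x < h + width
                   for l, x, h in zip(self.lo, p, self.hi))


def classify(agents: Dict[int, Vec3], boxes: List[Box], my_rank: int,
             aura_width: float):
    """Return (migrate, aura): agent ids to send, keyed by destination rank."""
    migrate: Dict[int, List[int]] = {}
    aura: Dict[int, List[int]] = {}
    for uid, pos in agents.items():
        if not boxes[my_rank].contains(pos):
            # The agent left our sub-volume: hand ownership to the new rank.
            dest = next((r for r, b in enumerate(boxes) if b.contains(pos)), None)
            if dest is None:
                raise ValueError(f"agent {uid} left the global domain")
            migrate.setdefault(dest, []).append(uid)
            continue
        # Still ours; send a ghost copy to every neighbor whose aura it touches.
        for r, b in enumerate(boxes):
            if r != my_rank and b.in_aura_of(pos, aura_width):
                aura.setdefault(r, []).append(uid)
    return migrate, aura
```

Scanning every box per agent keeps the sketch short; a real engine would presumably restrict the check to adjacent partitions.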
2.3 Flexible Execution Strategies
Execution modes include:
- MPI-only: One rank per core, beneficial for large numbers of cores with low memory per rank.
- MPI+OpenMP hybrid: One rank per NUMA domain, using threading for intra-node parallelism.
Both modes are transparent to user model code: the same agents and behaviors can run distributed or in shared-memory mode without refactoring (Breitwieser, 13 Mar 2025).
3. Communication Optimizations and Serialization
3.1 Purpose-Built Serialization ("TeraAgent IO")
Generic serialization approaches (e.g., ROOT I/O) incur significant overhead from pointer deduplication and schema evolution, neither of which is needed for agent objects in TeraAgent. The dedicated TeraAgent IO system exploits the absence of pointer sharing: it traverses agent memory in order, replaces vtable pointers with identifiers, and lets agent state be mutated directly within received buffers. Reported speedups reach up to 296× in serialization and 73× in deserialization, with median speedups of 110× and 37×, respectively (Breitwieser et al., 28 Sep 2025).
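The core idea, a flat in-order layout with a small identifier in place of a vtable pointer, can be illustrated with Python's `struct` module. This is only a sketch: the agent classes, field layouts, the 16-bit type id, and the `REGISTRY` table are all assumptions made for the example, and it does not reproduce the in-place mutation of received buffers that the C++ engine performs.

```python
import struct
from dataclasses import astuple, dataclass


@dataclass
class Cell:  # hypothetical agent type
    uid: int
    x: float
    y: float
    z: float
    diameter: float


@dataclass
class Person:  # hypothetical agent type
    uid: int
    x: float
    y: float
    z: float
    state: int


# type id -> (class, fixed little-endian layout of its fields, no padding)
REGISTRY = {
    1: (Cell, struct.Struct("<Q4d")),
    2: (Person, struct.Struct("<Q3dB")),
}
TYPE_ID = {cls: tid for tid, (cls, _) in REGISTRY.items()}
HEADER = struct.Struct("<H")  # the type id stands in for the vtable pointer


def serialize(agents) -> bytes:
    out = bytearray()
    for agent in agents:
        tid = TYPE_ID[type(agent)]
        out += HEADER.pack(tid)
        # Fields are written in declaration order; no pointer tracking needed.
        out += REGISTRY[tid][1].pack(*astuple(agent))
    return bytes(out)


def deserialize(buf: bytes):
    agents, offset = [], 0
    while offset < len(buf):
        (tid,) = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        cls, layout = REGISTRY[tid]
        agents.append(cls(*layout.unpack_from(buf, offset)))
        offset += layout.size
    return agents
```

Because no object graph has to be walked and no schema metadata is written, each agent costs one header plus one fixed-size record.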
3.2 Delta Encoding and Compression
To minimize inter-node communication, TeraAgent employs delta encoding. At every timestep, only the differences (deltas) between the current and previous agent states are computed, reordered, and transmitted. On the receiver, deltas are applied to stored references to reconstruct updated states. Compression (LZ4 or similar) is applied after delta encoding, reducing message size by up to 3.5× in large-scale runs. This substantially reduces bandwidth usage for agent migrations and aura updates.
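A small sketch of the encode and decode round trip follows. The text does not specify how deltas are computed, so a byte-wise XOR against the previous state is assumed here; `zlib` from the standard library stands in for LZ4, and the reordering step is omitted. The function names are hypothetical.

```python
import zlib


def encode_delta(current: bytes, reference: bytes) -> bytes:
    """XOR the current state against the reference, then compress the result."""
    if len(current) != len(reference):
        raise ValueError("sketch assumes equal-length state buffers")
    # Unchanged bytes become zero, which compresses very well.
    delta = bytes(c ^ r for c, r in zip(current, reference))
    return zlib.compress(delta)


def decode_delta(message: bytes, reference: bytes) -> bytes:
    """Decompress the delta and apply it to the receiver's stored reference."""
    delta = zlib.decompress(message)
    if len(delta) != len(reference):
        raise ValueError("delta does not match the stored reference")
    return bytes(d ^ r for d, r in zip(delta, reference))
```

Both sides must hold the same reference buffer; after decoding, the receiver would store the reconstructed state as the reference for the next step.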
3.3 Nonblocking Communication
TeraAgent relies on nonblocking point-to-point MPI calls (e.g., MPI_Isend, MPI_Irecv), overlapping computation and communication. This design, in conjunction with aggressive batching and speculative receives, exploits modern high-throughput interconnects to further mask network latencies and improve wall-clock performance.
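The overlap pattern can be shown without MPI. The sketch below swaps in a thread-pool future for the nonblocking request: submitting the task plays the role of posting `MPI_Isend`/`MPI_Irecv`, and `result()` plays the role of `MPI_Wait`. The function names and the fake 50 ms latency are assumptions for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def exchange_aura(outgoing):
    """Stand-in for a network exchange; echoes the data back as 'ghosts'."""
    time.sleep(0.05)  # pretend network latency
    return list(outgoing)


def step(interior, boundary, pool):
    # 1. Post the exchange and return immediately (nonblocking).
    pending = pool.submit(exchange_aura, boundary)
    # 2. Overlap: update agents that need no remote data while it is in flight.
    interior_result = [v + 1 for v in interior]
    # 3. Wait for completion only when the ghost data is actually required.
    ghosts = pending.result()
    boundary_result = [v + g for v, g in zip(boundary, ghosts)]
    return interior_result, boundary_result


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(step([1, 2, 3], [10, 20], pool))
```

The ordering is the point: the wait is deferred until after the interior work, so communication latency is hidden behind computation whenever the interior update takes at least as long as the transfer.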
4. Scalability and Performance
TeraAgent achieves strong and weak scaling to thousands of compute nodes:
- Agent update throughput scales approximately linearly with the number of nodes, owing to efficient local memory access and minimal required communication.
- Reported experiments demonstrate agent update rates per CPU core exceeding those of prior state-of-the-art (e.g., 8× more efficient than Biocellion).
- TeraAgent completed simulations of 501.51 billion agents on 84,096 CPU cores.
- The engine’s approach to minimal data transfer and high serialization throughput is a key determinant of scaling efficiency, ensuring that network and serialization overheads do not dominate (Breitwieser et al., 28 Sep 2025).
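For readers comparing scaling results, the standard efficiency definitions can be written as two small helpers. The function names are hypothetical; the only figures taken from the text are the agent and core counts used for the per-core load.

```python
def strong_scaling_efficiency(t_base: float, n_base: int,
                              t_n: float, n: int) -> float:
    """Fixed total problem size: ideal runtime shrinks by n_base / n."""
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base: float, t_n: float) -> float:
    """Fixed problem size per node: ideal runtime stays constant."""
    return t_base / t_n


# Figures quoted in the text above.
TOTAL_AGENTS = 501.51e9
CPU_CORES = 84_096
agents_per_core = TOTAL_AGENTS / CPU_CORES  # about 5.96 million agents per core
```

An efficiency of 1.0 is ideal in both cases; values below 1.0 indicate that communication, serialization, or load imbalance is consuming part of the added resources.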
5. Application Domains and Integrations
TeraAgent supports a broad range of agent-based modeling scenarios, including:
- Biomedical simulations: Cell clustering, proliferation, cortical modeling (with single neuron granularity over large spans).
- Epidemiology: Simulation of epidemic spread (e.g., SIR models) over populations on the scale of millions to billions of agents.
- Oncology and tumor growth: Detailed, spatially resolved modeling of angiogenesis, tumor-stroma interactions, and large-scale cell biology.
- Urban and policy simulations: High-resolution modeling of city-scale social, economic, or transportation phenomena.
- Radiotherapy simulation: TeraAgent’s predecessor BioDynaMo, together with TeraAgent’s distributed extensions, underpinned the radiotherapy-induced lung injury simulation named one of Physics World’s top ten breakthroughs of 2024 (Breitwieser, 13 Mar 2025).
TeraAgent’s modularity supports seamless integration with visualization and analysis pipelines (e.g., ParaView), enabling in-situ, scalable post-processing (up to 39× faster than previous methods) even during ongoing distributed simulations (Breitwieser et al., 28 Sep 2025).
6. Interoperability, Modularity, and Extensibility
TeraAgent is engineered for minimal disruption to existing codebases written for BioDynaMo. Most agent and model code requires only changes to configuration files or minor macro additions to operate in distributed mode. Key extensibility features include:
- Modular plug-in architecture for serialization and delta-encoding.
- Multiple distribution strategies (MPI-only, hybrid, variable load-balancing approaches).
- Well-defined interfaces for domain-specific agent types, external visualization, and additional modalities.
- Flexibility for future extensions (e.g., improved load balancing, fault-tolerance, energy-awareness, code generation for serialization of user-defined agents) (Breitwieser et al., 28 Sep 2025).
This adaptability allows researchers to simulate new types of dynamic systems, integrate third-party libraries, or embed TeraAgent within multi-modal simulations (e.g., coupling with Monte Carlo methods).
7. Future Directions
Potential future research and development areas highlighted include:
- Automation of code generation for serialization routines.
- Adaptive delta encoding and advanced compression to further reduce communication volumes depending on system interconnect.
- Enhanced load balancing strategies (including global and diffusive) for more heterogeneous and dynamic scenarios.
- Improved integration with scientific pipelines for real-time analysis, visualization, or hybrid simulations.
- Advancements in distributed fault-tolerance and energy efficiency for operation in cloud and exascale supercomputing environments (Breitwieser et al., 28 Sep 2025).
Summary Table: Key Features and Performance Metrics
| Aspect | Description | Performance/Scale Example |
|---|---|---|
| Architecture | Distributed, MPI(+OpenMP), partitioned simulation grid | Up to 84,096 CPU cores |
| Serialization | Tailored in-order, zero pointer deduplication, direct update | 296× (max), 110× (median) speedup |
| Data Transfer Optimizations | Delta encoding, LZ4 compression | 3.5× reduction in message size |
| Strong/Weak Scaling | Near-ideal to thousands of nodes, minimal coordination required | 501.51 billion agents, 84× improvement |
| Application Domains | Biomedical, epidemiology, oncology, urban, radiotherapy | Radiotherapy simulation: Physics World top 10, 2024 |
TeraAgent thus provides a robust, high-performance, and extensible platform for extreme-scale agent-based simulations, effectively enabling new classes of scientific inquiry previously restricted by hardware and architectural bottlenecks (Breitwieser et al., 28 Sep 2025, Breitwieser, 13 Mar 2025).