Exascale Computing Project Overview
- The Exascale Computing Project (ECP) is a US DOE initiative that develops high-performance computing systems and software delivering at least 10^18 FLOPS for transformative scientific research.
- The project integrates advanced application codes, software libraries, and hardware technologies through co-design to achieve significant performance and energy efficiency gains.
- Key achievements include meeting milestone-driven targets, establishing a robust open-source ecosystem, and demonstrating up to 100× improvements in prototype performance.
Exascale Computing Project
The Exascale Computing Project (ECP) is a United States Department of Energy (DOE) initiative to realize a capable ecosystem of exascale computing—high-performance computing systems and software delivering at least 10^18 floating-point operations per second (FLOPS)—with sustained application performance, energy efficiency, programmability, and scientific impact. ECP coordinates integrated investments in application codes, software libraries and tools, mathematical algorithms, and hardware technologies to enable transformational advances in simulation, data analysis, and machine learning across scientific domains.
1. Mission, Organization, and Scope
ECP is tasked with delivering exascale systems that robustly support DOE’s science and security missions while meeting targets for energy consumption, resilience, and software sustainability. The project was launched in October 2016 as a joint effort between DOE’s Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA) (Brower et al., 2017). ECP’s mission mandates a minimum 50× performance improvement for priority scientific applications and end-to-end workflows relative to previous 20-petaflop “leadership”-class supercomputers, within a total power envelope of 20–30 MW (Brower et al., 2017). The organizational structure involves DOE national laboratories (Oak Ridge, Argonne, Lawrence Berkeley, Los Alamos, Lawrence Livermore, Sandia), numerous academic partners, and around 30 industry hardware vendors (Brower et al., 2017).
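As a back-of-envelope check using only the figures quoted above (not an official ECP metric), a 50× improvement over a 20-petaflop baseline corresponds to roughly 10^18 FLOPS, and dividing by the stated power envelope gives the implied energy-efficiency target:

```latex
% Implied efficiency target from the numbers above (illustrative arithmetic only).
\[
  50 \times 20\,\mathrm{PF} = 10^{18}\ \mathrm{FLOP/s}, \qquad
  \frac{10^{18}\ \mathrm{FLOP/s}}{20\text{--}30\,\mathrm{MW}} \approx 33\text{--}50\ \mathrm{GFLOP/s\ per\ watt}.
\]
```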
The project is governed with a construction-project management approach, utilizing an Earned-Value Management (EVM) system that tracks scope, schedule, and cost via standard EVM quantities (Planned Value, Earned Value, Actual Cost), with progress assessed through discrete technical milestones and Key Performance Parameters (KPPs) (Heroux, 2023). KPP-3, for example, requires demonstrated integration of deliverables into multiple real application codes, not just stand-alone libraries or tools. ECP delivered over 1,700 milestones and almost 300 documented integrations into production codes by project closeout in December 2023 (Heroux, 2023).
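For readers unfamiliar with EVM, the standard relationships among these quantities are as follows (general project-management definitions, not ECP-specific formulas); values of SPI and CPI at or above 1 indicate on-schedule, on-budget execution:

```latex
% Standard Earned-Value Management relations (general definitions, not ECP-specific):
% schedule variance, cost variance, and the corresponding performance indices.
\[
  \mathrm{SV} = \mathrm{EV} - \mathrm{PV}, \qquad
  \mathrm{CV} = \mathrm{EV} - \mathrm{AC}, \qquad
  \mathrm{SPI} = \frac{\mathrm{EV}}{\mathrm{PV}}, \qquad
  \mathrm{CPI} = \frac{\mathrm{EV}}{\mathrm{AC}}.
\]
```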
ECP’s activities are organized into three technical thrusts: Applications, Software Technology (including math libraries, I/O, performance tools), and Hardware Technology (processor architecture, interconnect, memory hierarchy, resilience) (Brower et al., 2017; Heroux, 2023). Cross-disciplinary “co-design centers”—such as CEED for discretizations and CoPA for particle applications—bridge these areas to accelerate domain-specific algorithmic innovation (Kolev et al., 2021; Mniszewski et al., 2021). Product teams are clustered into thematic Software Development Kit (SDK) teams to promote collaboration, and all deliverables are curated and distributed via the Extreme-scale Scientific Software Stack (E4S) (Heroux, 2023).
2. Software Ecosystem and Quality Assurance
ECP has resulted in a multi-layered, open-source software ecosystem with more than 70 portable, GPU-capable libraries and performance tools distributed through E4S (Heroux, 2023). This ecosystem includes performance-portable programming models (Kokkos, RAJA), communication runtimes (MPI, UPC++, Legion), math libraries (PETSc, Trilinos, MAGMA, libCEED), I/O and data management (HDF5, ADIOS2, openPMD), in situ analytics, and domain-specific frameworks (Abdelfattah et al., 2021; Kolev et al., 2021; Mills et al., 12 Jun 2024). E4S is distributed as a Spack meta-collection and as curated container images (Docker, Singularity), with rigorous reproducibility and cross-platform compatibility (Heroux, 2023).
All ECP software products adhere to the E4S Community Policies, which cover requirements such as mandatory unit/regression tests, coding standards, configuration and packaging guidelines, and formal support channels. Automated continuous integration (CI) systems ensure cross-platform correctness and performance regression detection, with targets of ≥80% code coverage (Heroux, 2023). Quarterly E4S releases are validated through system-level smoke tests and provide binary caches and container support for scalable installation. Software teams benefit from a competitive yet collaborative model (“co-opetition”) at the SDK layer, with benchmarks and bake-offs used to converge on best-in-class implementations (Heroux, 2023).
3. Co-Design and Performance Portability
A central ECP innovation is “co-design”—iterative, bidirectional interaction among application scientists, algorithm developers, software engineers, and hardware vendors to ensure all layers of the stack are optimized concurrently (Mniszewski et al., 2021; Brower et al., 2017; Kolev et al., 2021). This process is realized in multiple co-design centers (e.g., CEED, CoPA), which extract crosscutting motifs (high-order PDEs, particle simulations) and develop both proxy applications and reusable libraries. These proxies isolate essential computational kernels, expose algorithmic “knobs” (e.g., SIMD block sizes, interpolation order), and drive both hardware and software feature requirements (Mniszewski et al., 2021).
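As an illustration of the kind of knob a proxy kernel can expose, the hypothetical sketch below (not taken from any ECP proxy application) parameterizes a simple particle-push loop by a compile-time block size, which a co-design study could sweep to probe vector widths and compiler behavior:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical proxy kernel: the block size is a compile-time "knob" that
// co-design studies can sweep (e.g., 4, 8, 16) to probe SIMD widths.
template <std::size_t BlockSize>
void push_particles(std::vector<double>& x, const std::vector<double>& v, double dt) {
  const std::size_t n = x.size();
  const std::size_t nblk = n - n % BlockSize;
  for (std::size_t i = 0; i < nblk; i += BlockSize) {
    // Fixed trip-count inner loop, which compilers can readily vectorize.
    for (std::size_t j = 0; j < BlockSize; ++j) x[i + j] += dt * v[i + j];
  }
  for (std::size_t i = nblk; i < n; ++i) x[i] += dt * v[i];  // remainder
}

int main() {
  std::vector<double> x(1000, 0.0), v(1000, 1.0);
  push_particles<8>(x, v, 0.1);  // knob value chosen at compile time
  std::cout << x[0] << "\n";     // prints 0.1
  return 0;
}
```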
Performance portability is a core requirement: all applications and libraries must efficiently utilize both CPU and GPU architectures from multiple vendors (NVIDIA, AMD, Intel) (Abdelfattah et al., 2021; Huebl et al., 2023; Mills et al., 12 Jun 2024). This is achieved by leveraging model-agnostic abstraction layers (Kokkos, SYCL/DPC++, OCCA, OpenMP 5 offload), back-end plugins, and device-aware memory management. Fine-grained asynchronicity (overlapping communication and computation), GPU-native data layouts (structure-of-arrays, block interleaving), and advanced scheduling (PetscDeviceContext, NVSHMEM collectives) are utilized in core libraries such as PETSc/TAO and CEED (Mills et al., 12 Jun 2024; Kolev et al., 2021).
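A minimal sketch of the abstraction-layer approach using Kokkos (illustrative only; the problem size and kernel are made up) shows how a single loop body targets whichever backend (OpenMP, CUDA, HIP, or SYCL) the library was built with:

```cpp
#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;
    // Views allocate in the default execution space's memory (host or device).
    Kokkos::View<double*> x("x", n), y("y", n);
    Kokkos::deep_copy(x, 1.0);
    Kokkos::deep_copy(y, 2.0);

    // The same parallel_for runs on CPU threads or a GPU, depending on the
    // backend Kokkos was configured with; no source changes are required.
    Kokkos::parallel_for("axpy", n, KOKKOS_LAMBDA(const int i) {
      y(i) += 0.5 * x(i);
    });
    Kokkos::fence();
  }
  Kokkos::finalize();
  return 0;
}
```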
4. Major Application Codes and Algorithmic Advances
ECP supports a suite of flagship applications across domains, catalyzing architectural and numerical innovation. Examples include:
- WarpX: Application code for staged laser-plasma wakefield acceleration, originally mandated to achieve a 50× speedup over its pre-ECP baseline, a target it ultimately exceeded with gains of roughly 500×. WarpX is built atop AMReX (block-structured adaptive mesh refinement) and PICSAR (performance-portable PIC kernels), is fully portable across CPUs and GPUs, and has demonstrated strong scaling to very large MPI rank counts. It was the first laser–plasma code to run at full scale on exascale hardware (Frontier, Oak Ridge), sustaining high particle and cell update rates (Huebl et al., 2023; Vay et al., 2018).
- Lattice QCD: End-to-end workflows for precision quantum chromodynamics calculations, requiring exascale resources to reach physical quark masses on fine lattices. ECP-sponsored algorithmic developments include communication-avoiding Krylov solvers, multigrid for staggered fermions, and all-mode-averaging, with a strong emphasis on portable APIs (Grid, QEX) and efficient use of both CPUs and GPUs (Brower et al., 2017).
- CEED Applications: High-order finite element and spectral element methods, with extensive use of sum-factorization and matrix-free operator evaluation for increased arithmetic intensity on GPUs (a schematic sum-factorization sketch follows this list). CEED libraries (libCEED, MFEM, MAGMA, NekRS) use fused kernel strategies, just-in-time code generation, and memory-coherent data handling, achieving near-peak GFLOP/s on V100/MI100/A100 devices, and have been broadly deployed in application codes such as E3SM, WarpX, and ExaSMR (Abdelfattah et al., 2021; Kolev et al., 2021).
- Particle Motif Codes: Smoothed particle hydrodynamics (SPH-EXA), molecular dynamics (CabanaMD), PIC methods (CabanaPIC), and quantum molecular dynamics (ExaSP2/PROGRESS/BML) have been refactored for exascale execution, using custom octree or k-d tree domain decompositions with hybrid MPI+X backends, where X is OpenMP, OpenACC, CUDA, or Kokkos (Mniszewski et al., 2021; Cavelan et al., 2020).
- Cosmology and In Situ Analytics: ArborX geometric search library, co-developed for exascale, accelerates in-situ DBSCAN-based halo finding in HACC simulations, delivering up to a 10× reduction in stepper cost and supporting real-time, high-fidelity merger tree analysis (Prokopenko et al., 16 Sep 2024).
- General Relativistic MHD and AMR: GRaM-X extends the Einstein Toolkit to GPU-accelerated exascale systems, using AMReX-based AMR, the Z4c formulation, and Valencia GRMHD, with weak-scaling efficiency of 40–50% on more than 13,000 V100 GPUs (Shankar et al., 2022).
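To make the sum-factorization idea from the CEED item concrete, the schematic sketch below (not CEED/libCEED code) interpolates 2D tensor-product element degrees of freedom to quadrature points as two small contractions rather than one large dense matrix–vector product, which is the source of the higher arithmetic intensity noted above:

```cpp
#include <iostream>
#include <vector>

// Schematic 2D sum factorization (not CEED/libCEED code): apply (B (x) B) to
// element dofs u(i,j) via two small contractions, roughly O(p^3) work per
// element, instead of a single dense mat-vec costing roughly O(p^4).
// B is q-by-d, row-major: 1D basis values at q quadrature points for d dofs.
std::vector<double> interp2d(const std::vector<double>& B, int q, int d,
                             const std::vector<double>& u /* d*d, row-major */) {
  std::vector<double> T(q * d, 0.0);  // T(a,j) = sum_i B(a,i) u(i,j)
  for (int a = 0; a < q; ++a)
    for (int i = 0; i < d; ++i)
      for (int j = 0; j < d; ++j)
        T[a * d + j] += B[a * d + i] * u[i * d + j];

  std::vector<double> U(q * q, 0.0);  // U(a,b) = sum_j B(b,j) T(a,j)
  for (int a = 0; a < q; ++a)
    for (int b = 0; b < q; ++b)
      for (int j = 0; j < d; ++j)
        U[a * q + b] += B[b * d + j] * T[a * d + j];
  return U;
}

int main() {
  // Degree-1 element (d = 2 dofs/dim), q = 2 quadrature points/dim, with a
  // basis that averages the two dofs; constant data stays constant.
  const int d = 2, q = 2;
  std::vector<double> B = {0.5, 0.5, 0.5, 0.5};
  std::vector<double> u = {1.0, 1.0, 1.0, 1.0};
  for (double val : interp2d(B, q, d, u)) std::cout << val << " ";  // 1 1 1 1
  std::cout << "\n";
  return 0;
}
```

In production libraries the same contractions are fused into batched, matrix-free GPU kernels across many elements; the point of the sketch is only the reduction in per-element work that sum factorization provides.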
5. Hardware–Software Integration and Exascale Platform Readiness
ECP drove hardware–software co-design efforts directly into the architecture of flagship exascale platforms, notably Oak Ridge’s Frontier and Argonne’s Aurora (Allen et al., 10 Sep 2025). Architectural features such as high-bandwidth memory (HBM2e), high-radix dragonfly interconnects (HPE Slingshot-11), GPU-to-GPU coherence (Xe-Link, NVLink), energy-aware scheduling (GEOPM), and exascale storage layers (DAOS) were made tractable and productive by ECP’s early engagement with hardware vendors (Brower et al., 2017; Allen et al., 10 Sep 2025). Each node pairs x86_64 host CPUs with several high-throughput GPUs (e.g., AMD MI250X or Intel Ponte Vecchio) and HBM, yielding aggregate system memory of several petabytes and aggregate memory bandwidth on the order of a hundred petabytes per second (Allen et al., 10 Sep 2025).
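For scale, a back-of-envelope aggregate can be computed from publicly reported, Frontier-like node figures (≈9,400 nodes with four MI250X GPUs each, 128 GB of HBM2e and ≈3.2 TB/s of peak bandwidth per GPU); these per-node numbers are assumptions for illustration, not figures taken from the cited reports:

```latex
% Illustrative aggregates under the assumed Frontier-like node configuration.
\[
  9{,}408 \times 4 \times 128\ \mathrm{GB} \approx 4.8\ \mathrm{PB\ of\ HBM}, \qquad
  9{,}408 \times 4 \times 3.2\ \mathrm{TB/s} \approx 120\ \mathrm{PB/s\ aggregate\ HBM\ bandwidth}.
\]
```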
Software and applications were qualified on pre-production hardware, with ECP’s Application Development (AD) and Software Technology (ST) teams collaborating on scaling studies, performance validation, and readiness reviews (Allen et al., 10 Sep 2025). Key metrics include achieved exaflop/s rates on Linpack and application benchmarks, sustained weak and strong scaling at full-system MPI concurrency, and energy efficiency (e.g., 20–25 GFLOP/s per watt at cluster scale) (Goz et al., 2017; Allen et al., 10 Sep 2025).
A comprehensive, production-quality I/O stack (e.g., SAGE, DAOS) and open data standards (openPMD, HDF5, Clovis API) support in situ analytics and hybrid BDEC (Big Data Extreme Computing) workflows (Narasimhamurthy et al., 2018; Allen et al., 10 Sep 2025).
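As a minimal, hedged illustration of the HDF5 layer of that stack (a serial sketch only; ECP applications typically write through higher-level interfaces such as openPMD-api or ADIOS2 and use parallel HDF5 on the actual machines), the following writes a single field dataset:

```cpp
#include <hdf5.h>
#include <vector>

// Minimal serial HDF5 write (illustrative only, not ECP production I/O code):
// create a file, a 1D dataspace, and a double-precision dataset, then write it.
int main() {
  std::vector<double> field(64, 1.0);
  hsize_t dims[1] = {64};

  hid_t file  = H5Fcreate("fields.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
  hid_t space = H5Screate_simple(1, dims, nullptr);
  hid_t dset  = H5Dcreate2(file, "/density", H5T_NATIVE_DOUBLE, space,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, field.data());

  H5Dclose(dset);
  H5Sclose(space);
  H5Fclose(file);
  return 0;
}
```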
6. Lessons Learned and Impact
ECP’s key project management insights include the viability of EVM-based, milestone-driven oversight even in research-driven software efforts, and the value of balancing milestone completion (“doing things right”) with application integrations (“doing the right things”) (Heroux, 2023). Rigorous software quality assurance—encompassing automated CI, policy compliance, and sustainable integration—was essential to achieving up to 100× efficiency improvements on exascale prototypes and a sustainable ecosystem for ongoing post-ECP scientific computing work (Heroux, 2023).
The co-design approach, as illustrated in centers like CEED and CoPA, reduced code duplication, accelerated algorithmic and hardware tuning, and enabled rapid translation of research kernels to production environments, with demonstrated 3–22× speedups in key motifs (e.g., MD, QMD, FFT, FE kernels) (Mniszewski et al., 2021; Abdelfattah et al., 2021). ECP’s focus on portability ensured that the same ecosystem runs efficiently across architectures.
The project seeded open, modular, and standards-based software—application codes, libraries, and data engines—that underpins both DOE and broader community exascale science, encompassing machine learning, predictive simulation, and digital twins for feedback-driven, hybrid, virtual test stands (Huebl et al., 2023).
The ECP delivery model and technical outputs continue to influence large-scale software sustainability efforts, adoption of HPC/AI hybrid workflow stacks, and multi-institutional scientific governance, potentially serving as the blueprint for future exascale and post-exascale scientific initiatives (Heroux, 2023).