
SeerX: Scalable In-Transit Analysis

Updated 26 October 2025
  • SeerX is a scalable, in-transit service for real-time simulation data analysis, offering elastic resource provisioning and both lossy and lossless compression.
  • It alleviates storage and synchronization bottlenecks in extreme-scale simulations by decoupling computation from interactive post hoc visualization.
  • Its modular design integrates key frameworks to enable asynchronous data offload and concurrent multi-simulation analysis on high-performance computing platforms.

SeerX is a scalable, in-transit in situ service for comprehensive exploration of simulation data on high-performance computing platforms. Its architecture is designed to address the storage, resource allocation, and post hoc analysis bottlenecks arising from extreme-scale simulations, such as cosmological codes with particle counts far exceeding available disk and node-local storage. The system integrates elastic resource provisioning, lossy and lossless data reduction, asynchronous computation decoupling, and interactive visualization services to manage, compress, and analyze massive scientific datasets.

1. Motivation and Core Design Principles

SeerX was developed in response to the constraints of traditional in situ analysis workflows, which require explicit pre-selection of visualization parameters and resource allocation before the simulation begins. As simulations outpace storage capabilities, the need for dynamic analysis during execution, with minimal a priori decisions, becomes acute. SeerX provides a mechanism for simulations to offload data in transit to a remote or local service infrastructure—bypassing the need for MPI-based synchronization—while also enabling flexible, interactive analysis downstream. The architecture supports concurrent analysis of multiple simulations and elastic scaling of backend resources.

2. Infrastructure Components and Workflow

SeerX leverages the Mochi framework, incorporating key components listed below:

Mochi Component  Functionality                                         Role in SeerX
Mercury          Remote procedure call (RPC) layer                     Data communication and offload
Argobots         Task-based parallelism (user-level threads)           Efficient distributed tasking
Margo            Service abstraction and coordination                  Management of distributed nodes
Yokan            Key-value storage backend, instantiated via Bedrock   Storage of simulation data and metadata

Simulations configure their offloading targets through a JSON file, specifying a “sim-id,” one or more database addresses, and the desired analysis options. Dynamic resource allocation is realized by letting simulation processes register databases on demand, with backend analysis nodes scaling elastically. Resources are allocated and deallocated via standard TCP-based RPC calls, enabling runtime changes without requiring simulation-side MPI communicator coordination.
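Such an offload configuration might look like the following sketch. Only the “sim-id” field, the multiple database addresses, and the per-variable compression settings (SZ3 with an absolute error bound of 0.003 for positions, BLOSC for particle IDs) are described in the text; the exact field names and the Mercury-style address strings are illustrative assumptions, not the actual SeerX schema:

```json
{
  "sim-id": "hacc-run-01",
  "databases": [
    "ofi+tcp://analysis-node-01:1234",
    "ofi+tcp://analysis-node-02:1234"
  ],
  "analysis": {
    "variables": ["x", "y", "z", "id"],
    "compression": {
      "x": { "method": "SZ3", "errorBoundMode": "ABS", "absErrorBound": 0.003 },
      "id": { "method": "BLOSC" }
    }
  }
}
```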

3. Data Reduction: Lossy and Lossless Compression

To maximize I/O throughput and minimize storage, SeerX integrates both lossless and lossy compression algorithms:

  • Lossless Compression: BLOSC, typically for precise data such as particle IDs.
  • Lossy Compression: SZ3, with absolute error bounds (e.g., 0.003 for particle positions) to achieve ~4× reduction without sacrificing scientific integrity.
  • Compression Ratio Calculation:

\text{Compression Ratio} = \frac{\text{Original Data Size}}{\text{Compressed Data Size}}

  • Quality Validation: For lossy compression, SeerX employs the SSIM (Structural Similarity Index Measure), with values \geq 0.9 observed, confirming visual fidelity and feature preservation.

Configuration of compression parameters is accomplished through JSON, allowing precise control for each simulation variable and facilitating adjustments at runtime.
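As a concrete illustration of the ratio defined above, the following sketch compresses a buffer and computes the ratio. It uses zlib from the Python standard library as a stand-in for the BLOSC/SZ3 codecs SeerX actually integrates:

```python
import zlib

# Stand-in for a repetitive simulation field buffer (65,536 bytes).
original = bytes(range(64)) * 1024
compressed = zlib.compress(original, level=9)

# Compression Ratio = Original Data Size / Compressed Data Size
compression_ratio = len(original) / len(compressed)
assert compression_ratio > 1.0  # repetitive data compresses well
```

The same arithmetic applies regardless of codec: SeerX's reported ~4x reduction corresponds to a compressed size of roughly one quarter of the original.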

4. Asynchronous In-Transit Service and Visualization Decoupling

SeerX’s in-transit paradigm explicitly separates simulation computation from analysis, overcoming the conventional need to synchronize via MPI. Simulation data is transmitted asynchronously to one or more Yokan databases managed by Mochi, allowing simulations to continue without waiting for analysis completion. This separation reduces contention, supports simultaneous analysis from multiple simulations, and provides resilience against fluctuating analysis workloads.

Visualization and analysis are further decoupled: the simulation merely offloads compressed snapshots and required metadata. Analysts subsequently connect interactively (e.g., via a Trame-augmented Jupyter notebook) to these databases to explore, visualize, and post-process datasets without having committed to fixed visualization settings in advance.
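A minimal sketch of this producer/consumer decoupling, using only the Python standard library. A thread-safe queue stands in for the Yokan databases; Mercury RPCs, Argobots tasking, and the real network path are not modeled:

```python
import queue
import threading
import zlib

def simulate(out_q, steps):
    """Producer: emit compressed snapshots without waiting on analysis."""
    for t in range(steps):
        snapshot = bytes(range(256)) * 4  # stand-in for one timestep's data
        out_q.put((t, zlib.compress(snapshot)))
    out_q.put(None)  # sentinel: simulation finished

def analyze(in_q, results):
    """Consumer: decompress and process snapshots asynchronously."""
    while True:
        item = in_q.get()
        if item is None:
            break
        t, blob = item
        results[t] = len(zlib.decompress(blob))

q = queue.Queue()
results = {}
prod = threading.Thread(target=simulate, args=(q, 5))
cons = threading.Thread(target=analyze, args=(q, results))
prod.start(); cons.start()
prod.join(); cons.join()
```

The simulation thread never blocks on analysis completion, mirroring how SeerX lets the simulation proceed while offloaded data is handled downstream.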

5. Technical Implementation and Formulae

  • Data Retrieval Example Call:

getData(timestep, rank, variable, nElements, metadata)

This abstraction presents decompressed data to the analysis consumer through the in-transit service.
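A toy sketch of how such a call might behave on the client side. Apart from the getData signature quoted above, the class and helper names are hypothetical, and zlib stands in for the BLOSC/SZ3 codecs:

```python
import struct
import zlib

class ToyTransitStore:
    """Hypothetical stand-in for a Yokan-backed key-value store."""

    def __init__(self):
        self._kv = {}

    def put(self, timestep, rank, variable, values):
        # Pack doubles and compress before storing, as SeerX does on offload.
        raw = struct.pack(f"{len(values)}d", *values)
        self._kv[(timestep, rank, variable)] = zlib.compress(raw)

    def getData(self, timestep, rank, variable, nElements, metadata=None):
        # Decompress on retrieval so the consumer sees plain values.
        raw = zlib.decompress(self._kv[(timestep, rank, variable)])
        return list(struct.unpack(f"{nElements}d", raw))

store = ToyTransitStore()
store.put(0, 0, "x", [0.0, 1.5, 3.0])
xs = store.getData(0, 0, "x", 3, metadata=None)
```

The key point is that decompression happens inside the service abstraction, so analysis code sees ordinary arrays keyed by timestep, rank, and variable.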

  • SSIM Formula:

\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}

with \mu_x, \mu_y as means, \sigma_x^2, \sigma_y^2 as variances, \sigma_{xy} as the covariance, and C_1, C_2 as stability constants.
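The formula transcribes directly into code. The sketch below computes a global SSIM over flat sequences; image-style SSIM adds a sliding window over local patches, which is omitted here:

```python
def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Global SSIM between two equal-length sequences of values."""
    c1 = (k1 * data_range) ** 2  # stability constant C1
    c2 = (k2 * data_range) ** 2  # stability constant C2
    n = len(x)
    mx = sum(x) / n                                        # mean of x
    my = sum(y) / n                                        # mean of y
    vx = sum((a - mx) ** 2 for a in x) / n                 # variance of x
    vy = sum((b - my) ** 2 for b in y) / n                 # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n  # covariance
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

x = [0.1, 0.4, 0.35, 0.8]
identical = ssim_global(x, x)                    # identical signals -> 1.0
degraded = ssim_global(x, [0.1, 0.5, 0.3, 0.8])  # perturbed signal -> < 1.0
```

Identical inputs yield exactly 1.0; lossy reconstruction drives the score below 1, and SeerX's observed values of at least 0.9 indicate close structural agreement.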

The configuration (nodes, analysis options, compression schemes) is encoded in simple, reusable JSON files to promote ease of use across simulation codes.

6. Use Cases, Benchmarking, and Scalability

SeerX has been validated on cosmological simulations with the HACC code, handling up to 189 million particles per run. With SZ3 lossy compression at an absolute error of 0.003, the particle coordinate field data was reduced by a factor of approximately four while preserving high SSIM scores.

Benchmarks demonstrate:

  • Ability to run multiple concurrent simulations with shared analysis infrastructure.
  • Elastic backend scaling for analysis loads; nodes added as needed without static provisioning.
  • Decoupled simulation/analysis, minimizing blocking and latency, and supporting interactive post hoc exploration.
  • Applicability to HPC scenarios with dynamic compute allocations and simultaneous multi-task analysis requirements.

7. Significance, Implications, and Prospective Extensions

SeerX introduces a generalizable and technically robust solution for large-scale simulation analysis. By dynamically processing, compressing, and offloading data in transit, it relieves both the I/O bottlenecks and the inflexibility of traditional in situ workflows. The approach is well suited to future compute paradigms emphasizing asynchronous, service-oriented analysis and interactive, user-driven post hoc exploration.

A plausible implication is the applicability of SeerX’s architecture to other scientific domains where concurrent simulation, analysis, and visualization are needed and where resource requirements and visualization targets cannot be predetermined. The design supports extension to further data-reduction schemes and alternative post-processing modalities.

SeerX thus offers a modular, efficient, and scalable solution to the central challenges of data explosion in computational science. Its architecture and results show practical viability for HPC users requiring dynamic analysis infrastructure and interactive exploration for contemporary, data-intensive simulation workloads (Grosset et al., 19 Oct 2025).
