ArchGym: ML-Based Architecture Exploration

Updated 16 June 2026
  • ArchGym is an open-source, extensible framework for ML-driven architectural design space exploration, offering unified APIs and integrated baselines for fair benchmarking.
  • It employs an environment–agent loop supporting diverse search strategies like RL, BO, GA, ACO, and random sampling, highlighting the crucial role of hyperparameter tuning.
  • By standardizing experiment protocols and incorporating proxy models for simulation acceleration, ArchGym advances reproducibility and efficiency in architectural research.

ArchGym is an open-source, extensible framework designed to enable fair, reproducible, and objective comparison of ML-assisted algorithms for architectural design space exploration. Addressing the complexity stemming from the high dimensionality and combinatorial explosion of modern hardware architecture configuration spaces, ArchGym provides unified APIs and integrated baselines, thereby facilitating ML algorithm selection, hyperparameter tuning, benchmarking, and data collection for downstream research (Krishnan et al., 2023).

1. Motivation and Scope

Design-space exploration for domain-specific architectures, such as memory controllers, deep neural network (DNN) accelerators, and AR/VR system-on-chip (SoC) platforms, involves tuning dozens of discrete and continuous architectural "knobs," quickly leading to design spaces that surpass $10^{17}$ configurations. Brute-force search is infeasible due to the combinatorial explosion. Traditional performance evaluation using cycle-accurate or register-transfer-level (RTL) simulators is computationally expensive, imposing tight sample budgets for algorithmic search.

Multiple ML-based optimizers have been proposed: reinforcement learning (RL), Bayesian optimization (BO), genetic and ant-colony algorithms, and random baselines. However, the lack of a common experimental environment, detailed hyperparameter treatments, and standardized benchmarks significantly impedes objective algorithm selection and slows research progress (Krishnan et al., 2023).

2. Framework Overview and API Design

ArchGym abstracts each design-space exploration problem as a standard "environment–agent" loop, mirroring the OpenAI Gym interface. The two key entities are:

  • Environment: Wraps the target cost model, which can be an analytical model, high-fidelity simulator, ML-driven proxy, or physical hardware, alongside a suite of workloads.
  • Agent: Encapsulates the search strategy, parameterized by its own policy representation (e.g., neural networks, genomes, surrogate models, pheromone matrices) and hyperparameters.

Core API methods mirror the traditional RL paradigm:

  • reset() → returns initial observation
  • step(action) → returns (next_observation, reward, done, info)

Action spaces are flexible, supporting both discrete and continuous dimensions to encode architecture parameters (e.g., buffer_size ∈ {1,2,4,8}, PE_count ∈ [1,256]). Reward is scalar-valued, representing user-defined objectives such as minimizing $\alpha \cdot \text{latency} + \beta \cdot \text{energy} + \gamma \cdot \text{area}$; non-RL agents typically ignore observations but fully utilize rewards as fitness measures.
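As a rough illustration of the reset/step contract and the weighted reward described above, the following minimal sketch wraps an invented closed-form cost model in a Gym-style class. The names ToyArchEnv, toy_cost_model, and sample_action, as well as the cost formulas, are assumptions made for this example and are not taken from the ArchGym code base.

```python
import random

# Hypothetical knob space, borrowing the two examples from the text.
BUFFER_SIZES = (1, 2, 4, 8)   # discrete choice
PE_COUNT_RANGE = (1, 256)     # integer range, inclusive


def toy_cost_model(action):
    """Stand-in for a simulator: invented closed-form latency/energy/area."""
    pe, buf = action["PE_count"], action["buffer_size"]
    return {
        "latency": 1000.0 / pe + 50.0 / buf,
        "energy": 0.5 * pe + 2.0 * buf,
        "area": 0.1 * pe + 1.0 * buf,
    }


class ToyArchEnv:
    """Gym-style environment exposing reset() and step(action)."""

    def __init__(self, alpha=1.0, beta=1.0, gamma=1.0):
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def reset(self):
        # Initial observation: metrics of an arbitrary starting design.
        return toy_cost_model({"PE_count": 1, "buffer_size": 1})

    def step(self, action):
        metrics = toy_cost_model(action)
        cost = (self.alpha * metrics["latency"]
                + self.beta * metrics["energy"]
                + self.gamma * metrics["area"])
        reward = -cost   # minimizing the weighted cost maximizes the reward
        done = True      # this sketch evaluates one design per episode
        return metrics, reward, done, {"action": action}


def sample_action(rng):
    """Draw one design point uniformly from the hypothetical knob space."""
    return {"buffer_size": rng.choice(BUFFER_SIZES),
            "PE_count": rng.randint(*PE_COUNT_RANGE)}
```

Negating the weighted cost is one simple way to turn a minimization target into a reward; a real environment could equally use a distance-to-target formulation.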

ArchGym provides plug-in support for new agents (by subclassing and implementing select_action and update) and environments (by extending the environment base class with required API methods). All (observation, action, reward) trajectories are logged into extensible datasets (e.g., TFDS, RLDS), enabling offline RL or proxy modeling.
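The plug-in pattern and trajectory logging described above might look roughly like the sketch below. The Agent base class, RandomWalker, StubEnv, and run_search are hypothetical names written for this example; ArchGym's actual base classes and logging pipeline (TFDS/RLDS) may differ, and a plain Python list stands in for the dataset here.

```python
import random


class Agent:
    """Hypothetical base class with the two methods named in the text."""

    def select_action(self, observation):
        raise NotImplementedError

    def update(self, observation, action, reward):
        raise NotImplementedError


class RandomWalker(Agent):
    """Uniform random sampling; remembers the best design seen so far."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.best_action = None
        self.best_reward = float("-inf")

    def select_action(self, observation):
        # Observations are ignored, as is typical for non-RL agents.
        return {"buffer_size": self.rng.choice((1, 2, 4, 8)),
                "PE_count": self.rng.randint(1, 256)}

    def update(self, observation, action, reward):
        if reward > self.best_reward:
            self.best_action, self.best_reward = action, reward


class StubEnv:
    """Invented single-metric environment used only to drive the loop."""

    def reset(self):
        return {"latency": 0.0}

    def step(self, action):
        latency = 1000.0 / action["PE_count"] + 50.0 / action["buffer_size"]
        return {"latency": latency}, -latency, True, {}


def run_search(agent, env, budget):
    """Run `budget` environment calls and log every transition."""
    trajectory = []
    obs = env.reset()
    for _ in range(budget):
        action = agent.select_action(obs)
        next_obs, reward, done, _info = env.step(action)
        agent.update(obs, action, reward)
        trajectory.append({"observation": obs, "action": action, "reward": reward})
        obs = env.reset() if done else next_obs
    return trajectory
```

For example, `run_search(RandomWalker(seed=1), StubEnv(), budget=50)` returns 50 logged records that could later be used for offline training or proxy modeling.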

3. Integrated Search Algorithms and the Hyperparameter Lottery

ArchGym includes five representative agent types:

  • Reinforcement Learning (PPO, SAC, DDPG): Employs a neural policy $\pi_\theta(a|s)$ and the standard RL objective $J(\theta)=\mathbb{E}_{\tau\sim\pi_\theta}[R(\tau)]$, with mechanisms for exploration.
  • Bayesian Optimization (BO): Uses Gaussian Process surrogates, acquisition functions (e.g., UCB, Expected Improvement), and closed-loop optimization.
  • Genetic Algorithm (GA): Maintains a population of genomes with explicit exploitation/exploration balance via selection, crossover, and mutation.
  • Ant-Colony Optimization (ACO): Utilizes a pheromone matrix, stochastic design sampling, and pheromone decay for adaptive search.
  • Random Walker (RW): Baseline agent performing uniform random sampling.
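To make the selection, crossover, and mutation steps of the GA agent concrete, here is a minimal sketch using truncation selection and uniform crossover on a two-knob genome. The fitness function, genome layout, and default hyperparameters are invented for illustration and do not reproduce ArchGym's GA implementation.

```python
import random

BUFFER_SIZES = (1, 2, 4, 8)


def fitness(genome):
    """Invented objective (higher is better) over (PE_count, buffer_size)."""
    pe, buf = genome
    return -(1000.0 / pe + 50.0 / buf + 0.5 * pe + 2.0 * buf)


def random_genome(rng):
    return (rng.randint(1, 256), rng.choice(BUFFER_SIZES))


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from either parent."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))


def mutate(genome, rng, rate):
    """Perturb each gene independently with probability `rate`."""
    pe, buf = genome
    if rng.random() < rate:
        pe = min(256, max(1, pe + rng.randint(-16, 16)))
    if rng.random() < rate:
        buf = rng.choice(BUFFER_SIZES)
    return (pe, buf)


def genetic_search(generations=30, pop_size=20, elite=4,
                   mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the `elite` fittest genomes unchanged (exploitation).
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        # Crossover and mutation refill the population (exploration).
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rng, mutation_rate)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)
```

Every default argument of `genetic_search` is a hyperparameter in the sense discussed in this section, so a sweep would vary `pop_size`, `elite`, and `mutation_rate` rather than the algorithm itself.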

A central empirical finding is the hyperparameter lottery: with unlimited simulator samples and exhaustive hyperparameter sweeps (typically over 4,000 configurations per agent), any agent family can match or surpass others, indicating that performance is determined as much by hyperparameter tuning as by algorithmic class. Under practical (limited) sample budgets ($\leq 10^4$ calls), simpler methods (RW, GA) often match or outperform sophisticated ones (RL, BO), while with larger budgets ($\geq 10^5$ calls), RL's performance rises but other families remain competitive.

Experimentally, the interquartile range (IQR) of final rewards can be extremely broad, reaching 90% in DRAMGym and 40% in FARSIGym solely due to hyperparameter selection, underscoring the importance of transparent hyperparameter reporting and statistical rigor (Krishnan et al., 2023).
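The spread statistic mentioned above can be computed from a hyperparameter sweep along the following lines. The reward values are made up for illustration, and the source does not state how the reported IQR percentages are normalized, so this sketch only reports the raw IQR alongside the best and median outcomes.

```python
import statistics

# Made-up final rewards, one per hyperparameter configuration of a single
# agent family. These numbers are illustrative, not results from the paper.
FINAL_REWARDS = [0.12, 0.35, 0.40, 0.48, 0.55, 0.61, 0.70, 0.88, 0.93]


def interquartile_range(values):
    """Return Q3 - Q1 using the inclusive quartile method."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def summarize(values):
    """Report the best case together with the typical case and its spread."""
    return {
        "best": max(values),
        "median": statistics.median(values),
        "iqr": interquartile_range(values),
    }
```

Reporting the median and IQR next to the best run makes it visible when a headline result comes from one lucky configuration.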

4. Experimental Platforms, Metrics, and Results

ArchGym natively implements multiple architecture-specific environments and benchmarks:

  • DRAMGym: DRAM controller optimization using DRAMSys; space of $1.9\times10^7$ configurations.
  • TimeloopGym: DNN accelerator optimization via Timeloop; $2\times10^{14}$ configurations.
  • FARSIGym: Custom SoC (AR/VR) using FARSI; […] configurations.
  • MaestroGym: DNN mapping with Maestro; up to […] mappings per layer.

Reward functions allow for both single-objective and weighted multi-objective design targets. Sample budgets in published experiments spanned […] to […] simulator calls, with special attention to unlimited-sample (perfect tuning) scenarios.

Key findings:

  • No search algorithm dominates when unlimited samples and perfect tuning are allowed—each can reach user targets at least once.
  • Under constrained resources, random or evolutionary approaches compete strongly, while RL typically requires more samples to be effective.
  • Differences in agent strategy and hyperparameter treatment drive performance variance far more than algorithm family alone (Krishnan et al., 2023).

5. Proxy Modeling for Simulation Acceleration

ArchGym's data aggregation allows the construction of ML-based proxy models. In the DRAMGym case study, a Random Forest regressor (trained using […] data points from varied agents) yielded a normalized RMSE of approximately 0.61% across key targets (latency, dynamic power, energy), achieving a […] speedup over full simulation. Aggregating data from heterogeneous agent runs dramatically reduces proxy error, by up to […] compared to single-agent datasets of equivalent size (Krishnan et al., 2023).

Proxy models can thus serve as high-throughput surrogates for expensive simulators, enabling more efficient or broader design space exploration.
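The proxy workflow (pool logged samples, fit a regressor, score it on held-out designs) can be sketched with the standard library alone. Note that this example deliberately swaps in a k-nearest-neighbor regressor instead of the Random Forest used in the case study, so that it runs without third-party packages. The toy simulator, the sampling routine, and the choice to normalize RMSE by the range of the true values are all assumptions of this sketch.

```python
import math
import random


def toy_simulator(pe_count, buffer_size):
    """Invented stand-in for a slow simulator; returns a latency-like value."""
    return 1000.0 / pe_count + 50.0 / buffer_size


def collect_samples(n, seed):
    """Mimic one agent's logged (design, metric) pairs via random sampling."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        design = (rng.randint(1, 256), rng.choice((1, 2, 4, 8)))
        samples.append((design, toy_simulator(*design)))
    return samples


def knn_predict(train, query, k=3):
    """Average the targets of the k training designs closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(target for _design, target in nearest) / len(nearest)


def normalized_rmse(y_true, y_pred):
    """RMSE divided by the range of the true values (one common convention)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))


# Pool logs from several hypothetical agent runs, then score on held-out data.
train = [s for seed in (1, 2, 3) for s in collect_samples(200, seed)]
held_out = collect_samples(50, seed=99)
predictions = [knn_predict(train, design) for design, _ in held_out]
error = normalized_rmse([t for _, t in held_out], predictions)
```

Once such a proxy is accurate enough for the targets of interest, an agent can query it in place of the simulator for most steps and reserve full simulation for validating promising designs.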

6. Code Structure, Extensibility, and Integration

ArchGym's modular directory structure is organized as follows:

Directory    Purpose                          Example Files
envs/        Simulator environment bindings   DRAMGym.py, TimeloopGym.py
agents/      ML search algorithms             ppo_agent.py, bo_agent.py, ga_agent.py
datasets/    Data logging/export pipelines    loggers.py
utils/       Utility functions                […]

Adding a new environment involves subclassing archgym.Env, defining the action and observation spaces, and implementing the standard API. New agents are created by subclassing archgym.Agent and exposing all hyperparameters via YAML or JSON for systematic sweeping.
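Exposing hyperparameters through a JSON file makes systematic sweeping straightforward. The sketch below expands a small grid into individual run configurations; the JSON schema, the key names, and the expand_grid helper are hypothetical and do not reflect ArchGym's actual configuration format.

```python
import itertools
import json

# Hypothetical sweep specification; key names are illustrative only.
SWEEP_JSON = """
{
  "agent": "ga",
  "hyperparameters": {
    "population_size": [16, 32],
    "mutation_rate": [0.01, 0.1, 0.3]
  }
}
"""


def expand_grid(spec):
    """Yield one dict per combination in the Cartesian product of the lists."""
    names = sorted(spec)  # fixed ordering keeps the sweep reproducible
    for combo in itertools.product(*(spec[name] for name in names)):
        yield dict(zip(names, combo))


config = json.loads(SWEEP_JSON)
grid = list(expand_grid(config["hyperparameters"]))
```

Here `grid` holds six configurations (three mutation rates times two population sizes), each of which would be passed to a separate agent run and logged with its results.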

Every experiment records complete configuration, trajectory, and outcome information, supporting full reproducibility. Tie-ins to Weights & Biases and TensorBoard provide dashboard-style experiment monitoring.

By providing a unified interface, detailed dataset export, and baseline implementations, ArchGym enables:

  • Objectively fair benchmarking of ML search algorithms across diverse architectural application domains.
  • Systematic collection/sharing of design-space datasets, supporting both online agent training and offline proxy modeling.
  • Plug-and-play extension for emerging ML optimization algorithms or architectural simulators.

ArchGym's agent-environment abstraction is compatible with software architecture benchmarking frameworks such as ArchBench (Adnan et al., 18 Mar 2026), supporting RL/agent-in-the-loop evaluation where agent outputs are scored using external pipelines or custom metrics.

Together, these features position ArchGym as a cornerstone for reproducible, data-driven research in the application of ML to architecture design (Krishnan et al., 2023).
