ZeroSim: Transformer Analog Modeling
- ZeroSim is a transformer-based analog circuit performance modeling framework that provides zero-shot prediction of circuit metrics across diverse amplifier topologies.
- It employs a hierarchical graph attention encoder with progressive parameter injection and a global topology token to capture both local and global circuit behaviors.
- Integrated into RL-based device sizing loops, ZeroSim drastically accelerates design iterations, achieving up to a 13× speedup over traditional SPICE simulation.
ZeroSim is a transformer-based analog circuit performance modeling framework designed for robust, zero-shot prediction of circuit metrics across unseen amplifier topologies, eliminating the need for topology-specific retraining or manual fine-tuning. It addresses the major bottlenecks of circuit evaluation in analog design automation by replacing expensive SPICE simulations with a unified, data-driven surrogate that demonstrates both in-distribution and zero-shot generalization. The core innovation of ZeroSim lies in its architectural integration of hierarchical graph attention, progressive parameter injection, and a globally-conditioned topology embedding, trained on an extensive corpus spanning a wide variety of circuit structures and parameter configurations.
1. Model Architecture
ZeroSim operates by translating a circuit schematic into a pin-level undirected graph $G = (V, E)$, where each node $v \in V$ represents a pin (e.g., the drain, gate, source, and bulk of a MOSFET) and the edges $E$ represent both physical wiring and virtual device-centric connectivity. This graph representation captures both inter-pin and intra-device relationships. A learnable global token is introduced to aggregate and propagate the entire circuit's context.
Initial node embeddings consist of pin-type and device-type look-ups concatenated with the learnable global token.
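As a concrete illustration, a minimal PyTorch sketch of this pin-level representation is shown below; the `PIN_TYPES`/`DEVICE_TYPES` vocabularies, the embedding split, and the `structure_mask` helper are assumptions for illustration, not the published ZeroSim code.

```python
# Illustrative sketch only: vocabularies, dimensions, and helper names are assumptions.
import torch
import torch.nn as nn

PIN_TYPES = {"drain": 0, "gate": 1, "source": 2, "bulk": 3, "plus": 4, "minus": 5}
DEVICE_TYPES = {"nmos": 0, "pmos": 1, "cap": 2, "res": 3, "isrc": 4}

class PinGraphEmbedding(nn.Module):
    """Embed every pin of a schematic as a node and prepend a learnable global token."""
    def __init__(self, d_model: int = 128):
        super().__init__()
        self.pin_emb = nn.Embedding(len(PIN_TYPES), d_model // 2)
        self.dev_emb = nn.Embedding(len(DEVICE_TYPES), d_model // 2)
        self.global_token = nn.Parameter(torch.randn(1, d_model))  # circuit-level context token

    def forward(self, pin_type_ids, dev_type_ids):
        # Per-node embedding: pin-type look-up concatenated with device-type look-up.
        nodes = torch.cat([self.pin_emb(pin_type_ids), self.dev_emb(dev_type_ids)], dim=-1)
        # Token 0 is the global token; pin tokens follow.
        return torch.cat([self.global_token, nodes], dim=0)

def structure_mask(wire_edges, device_edges, num_pins):
    """Boolean matrix of allowed attention pairs: physical wires plus intra-device edges."""
    allowed = torch.eye(num_pins + 1, dtype=torch.bool)   # +1 for the global token
    allowed[0, :] = True                                   # assumption: the global token
    allowed[:, 0] = True                                   # participates in every layer
    for i, j in list(wire_edges) + list(device_edges):     # 0-based pin indices, offset by 1
        allowed[i + 1, j + 1] = True
        allowed[j + 1, i + 1] = True
    return allowed
```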
The core encoder alternates two modes across transformer layers:
- Structure-Level Refining: Attention is masked so that each pin interacts only with wire-connected and co-device pins, i.e. $\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\!\big(QK^{\top}/\sqrt{d} + M\big)V$, where $M_{ij} = 0$ if pins $i$ and $j$ are wire-connected or belong to the same device and $M_{ij} = -\infty$ otherwise.
- Context-Level Enhancing: Full attention is used, enabling all pins and the global token to interact without masking.
Parameter tokens, one per device parameter, are progressively injected in dedicated upper layers via device-masked cross-attention, in which each pin token attends only to the parameter tokens of its own device.
This preserves topology-agnostic structural encoding in the lower encoder layers and introduces parameter awareness only after sufficient structure extraction.
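The sketch below, again in PyTorch and under assumed hyperparameters (`d_model`, head counts, layer counts) and class names, shows one way the alternating structure-level/context-level attention and the device-masked parameter injection could be wired together.

```python
# Hedged sketch of the hierarchical encoder: structure-masked self-attention,
# full-context self-attention, and device-masked cross-attention for parameter
# injection. Dimensions, layer counts, and class names are illustrative guesses.
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Self-attention + feed-forward; an optional boolean 'allowed' matrix restricts attention."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x, allowed=None):
        # allowed[i, j] == True means token i may attend to token j; None = full attention.
        mask = None if allowed is None else ~allowed
        h, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))

class ParameterInjection(nn.Module):
    """Device-masked cross-attention: each pin token attends only to its device's parameter tokens."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tokens, param_tokens, same_device):
        # same_device[i, p] == True iff parameter token p belongs to token i's device;
        # the global-token row should be all True so its attention stays well defined.
        h, _ = self.cross(tokens, param_tokens, param_tokens, attn_mask=~same_device)
        return self.norm(tokens + h)

class ZeroSimEncoder(nn.Module):
    def __init__(self, d_model=128, n_pairs=3, n_inject=2):
        super().__init__()
        self.struct_layers = nn.ModuleList(AttentionBlock(d_model) for _ in range(n_pairs))
        self.context_layers = nn.ModuleList(AttentionBlock(d_model) for _ in range(n_pairs))
        self.inject_layers = nn.ModuleList(ParameterInjection(d_model) for _ in range(n_inject))

    def forward(self, x, allowed, param_tokens, same_device):
        # Lower layers: alternate structure-level (masked) and context-level (full) attention.
        for s_layer, c_layer in zip(self.struct_layers, self.context_layers):
            x = s_layer(x, allowed)   # wire- and co-device-restricted interaction
            x = c_layer(x)            # global interaction, including the global token
        # Upper layers: progressively inject device parameter tokens.
        for inj in self.inject_layers:
            x = inj(x, param_tokens, same_device)
        return x
```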
A query-based decoder comprises a set of independent learnable query tokens, one per target performance metric. These queries attend over the encoder outputs through two transformer layers, and each resulting token is individually mapped to its metric prediction via a linear head.
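A corresponding decoder sketch follows; the two-layer, query-based cross-attention and the 11 metric heads come from the description above, while the dimensions and the use of `nn.TransformerDecoder` are assumptions.

```python
# Sketch of the query-based decoder: one learnable query token per target metric
# attends to the encoder output over two decoder layers, then a per-metric linear
# head produces the prediction. Dimensions and head counts are assumptions.
import torch
import torch.nn as nn

class MetricQueryDecoder(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_metrics=11, n_layers=2):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_metrics, d_model))   # one token per metric
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.heads = nn.ModuleList(nn.Linear(d_model, 1) for _ in range(n_metrics))

    def forward(self, encoder_out):                  # encoder_out: (batch, num_tokens, d_model)
        q = self.queries.unsqueeze(0).expand(encoder_out.size(0), -1, -1)
        h = self.decoder(q, encoder_out)             # queries cross-attend to the encoder output
        # Each metric gets its own linear head; results are concatenated to (batch, n_metrics).
        return torch.cat([head(h[:, k]) for k, head in enumerate(self.heads)], dim=-1)
```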
2. Enabling Strategies
ZeroSim’s generalization and scalability arise from three principal strategies:
- Large-Scale Training Corpus: The framework is trained on 3.6 million circuit instances, representing over 60 amplifier topologies with device counts ranging from 6 to 39. Each topology is parameterized over ranges appropriate for the Sky130 PDK, covering transistor widths and lengths (in µm), capacitances (in pF), resistances (in kΩ), and bias currents (in µA). For each topology, 60,000 random parameter sets are simulated, generating ground truth for 11 key metrics (power, DC gain, GBW, phase margin, slew rate, settling time, CMRR, PSRR+, PSRR–, offset, temperature coefficient).
- Unified Topology Embeddings: The encoder uses a global-aware token and hierarchical attention that alternately restricts and enhances information flow, abstracting both local device/pin interactions and global circuit behaviors. This approach allows the model to embed any pin-level graph derived from a schematic, independent of topology.
- Topology-Conditioned Parameter Mapping: By strictly separating structure-only encoding from parameter fusion, ZeroSim ensures that the learned structural representation generalizes across topologies and is not specific to any single parameterization. Device-masked cross-attention maintains locality when injecting parameters, ensuring a consistent mapping for each device regardless of topology.
3. Training Procedure and Evaluation Metrics
ZeroSim is trained using mean absolute percentage error (MAPE) as the primary loss, averaged over all $N$ training samples and $K$ performance metrics:

$$\mathcal{L}_{\mathrm{MAPE}} = \frac{1}{NK}\sum_{i=1}^{N}\sum_{k=1}^{K}\frac{\left|\hat{y}_{i,k} - y_{i,k}\right|}{\left|y_{i,k}\right|},$$

where $\hat{y}_{i,k}$ and $y_{i,k}$ denote the predicted and simulated values of metric $k$ for sample $i$.
The Adam optimizer is used with a cosine learning-rate decay schedule and gradient clipping. The batch size is 256, and training runs for 200 epochs on dual A100 GPUs. Each metric is normalized using training-set statistics.
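A short sketch of this training configuration is given below; because the source does not state the base learning rate or the clipping norm, both are left as caller-supplied arguments.

```python
# Training-configuration sketch (Adam, cosine decay, gradient clipping, as stated above).
import torch

def make_optimizer(model, base_lr, num_epochs=200):
    optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)       # base_lr not given in the source
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
    return optimizer, scheduler

def train_step(model, inputs, targets, optimizer, loss_fn, max_norm):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)   # e.g. the MAPE loss defined above
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)        # gradient clipping
    optimizer.step()
    return loss.item()
```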
Performance is also reported with an accuracy metric $\mathrm{Acc}@\varepsilon$ (with $\varepsilon = 10\%$), which computes the fraction of predictions whose relative error falls within $\varepsilon$:

$$\mathrm{Acc}@\varepsilon = \frac{1}{NK}\sum_{i=1}^{N}\sum_{k=1}^{K}\mathbb{1}\!\left[\frac{\left|\hat{y}_{i,k} - y_{i,k}\right|}{\left|y_{i,k}\right|} \le \varepsilon\right].$$
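The two quantities can be computed as follows (NumPy sketch; assumes targets are nonzero):

```python
# Minimal sketch of the two reported quantities: the MAPE loss and Acc@eps,
# the fraction of relative errors within a tolerance (Acc@10 corresponds to eps = 0.10).
import numpy as np

def mape(y_pred, y_true):
    """Mean absolute percentage error over all samples and metrics."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def acc_at(y_pred, y_true, eps=0.10):
    """Fraction of predictions whose relative error is at most eps."""
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_err <= eps))

# Example: a prediction of 9.5 against a target of 10.0 has 5% relative error,
# so it counts as correct under Acc@10.
```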
4. Experimental Results and Comparative Analysis
Empirical results demonstrate that ZeroSim achieves superior performance for both in-distribution and zero-shot settings. The following table summarizes key quantitative results (from Table III):
| Model | MAPE↓ (Zero-shot) | Acc@10↑ (Zero-shot) |
|---|---|---|
| MLP | 0.451 | 0.015 |
| GCN | 0.256 | 0.367 |
| DeepGEN | 0.214 | 0.493 |
| GTN | 0.192 | 0.542 |
| ZeroSim | 0.143 | 0.645 |
ZeroSim achieves a 33% relative reduction in zero-shot MAPE versus the strongest GNN baseline (DeepGEN: 0.214 → ZeroSim: 0.143), and a roughly 20% relative improvement in zero-shot Acc@10 (0.542 → 0.645) over the graph transformer (GTN) baseline.
Limitations of prior work are evident: MLPs perform poorly because they do not model circuit structure; GCNs and DeepGEN improve on local feature aggregation but struggle to generalize to novel topologies; GTN's attention over graph nodes captures long-range dependencies but does not fully generalize to unseen configurations. ZeroSim's hierarchical encoder and global topology token enable substantive gains in robustness across topologies.
5. Integration in Reinforcement Learning-Based Device Sizing
ZeroSim functions as a surrogate evaluator within RL-based device-sizing loops, specifically the AnalogGym framework. The standard workflow is:
- At each episode, an RL policy samples a parameter set for the current topology.
- ZeroSim predicts the circuit metrics $\hat{y}$, providing rapid inference for the reward computation $r = \mathrm{FoM}(\hat{y})$.
- The RL agent (PPO/REINFORCE) updates the policy via the policy gradient computed on $r$.
- Periodically, ground-truth metrics are obtained from a full SPICE simulation for bias monitoring.
Pseudocode:
```
initialize policy π_θ
for episode = 1 … N do
    sample parameter set x ← π_θ(·)
    ŷ ← ZeroSim.predict(graph, x)
    r ← FoM(ŷ)
    update π_θ via PPO/REINFORCE on r
    occasionally (every M steps):
        y_spice ← SPICE.simulate(graph, x)
        r_spice ← FoM(y_spice)
        log |r − r_spice|
end
final x* validated by SPICE
```
In an evaluation on an unseen 10-device amplifier (NMCF), ZeroSim accelerates RL convergence (FoM → 0) by up to 13× versus SPICE-in-the-loop evaluation, where the FoM is a scalar summary of design-metric constraint satisfaction ($0$ is best).
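The source characterizes the FoM only as a scalar summary of constraint satisfaction with 0 as the best value, so the snippet below is a hypothetical stand-in that returns the negated sum of normalized spec violations, not the actual AnalogGym definition.

```python
# Hypothetical figure of merit: returns 0 when every spec is met and a negative
# penalty otherwise. The actual FoM used with ZeroSim/AnalogGym may differ.
def fom(metrics: dict, specs: dict) -> float:
    """Negated sum of normalized spec violations; 0 (the best value) when all specs are met."""
    penalty = 0.0
    for name, (target, sense) in specs.items():   # sense: "min" (value >= target) or "max" (value <= target)
        value = metrics[name]
        if sense == "min" and value < target:
            penalty += (target - value) / abs(target)
        elif sense == "max" and value > target:
            penalty += (value - target) / abs(target)
    return -penalty

# Hypothetical usage: reward for an amplifier that slightly misses its gain spec.
# fom({"dc_gain": 58.0, "power": 1.2e-3},
#     {"dc_gain": (60.0, "min"), "power": (1.5e-3, "max")})   # -> about -0.033
```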
6. Practical Implications and Future Directions
ZeroSim's approach, which combines a large heterogeneous training corpus, a unified hierarchical transformer graph encoder, a global circuit token, and progressive parameter injection, permits scalable and adaptable analog circuit evaluation, significantly reducing the computational cost of design iterations. It enables order-of-magnitude speedups in RL-based sizing workflows, with generalization supported by the strict separation of topology and parameter representations.
A plausible implication is that such transformer-based surrogates could extend to broader classes of circuit generative and optimization tasks, provided training corpora are sufficiently expressive. Conversely, generalization is constrained by the diversity and abstraction level of graph representations; circuit types with fundamentally new structural motifs may require retraining or encoder architecture refinement.
ZeroSim represents a current state-of-the-art model for circuit metric surrogates, outperforming MLP, GNN, and graph transformer baselines in both accuracy and adaptability for analog amplifier topologies. Its integration within RL design automation loops provides substantial efficiency improvements over traditional simulation-driven optimization, and its generalizing architecture suggests applicability to new circuits without manual substructure engineering or fine-tuning.