ChipKG: Graph-Based AMS Circuit Design

Updated 23 March 2026

ChipKG is a comprehensive knowledge graph that integrates annotated AMS circuit data, SPICE netlists, and multi-level annotations for automated design synthesis.
It employs a retrieval-augmented generation pipeline with LLMs and Cypher query indexing to accurately retrieve and assemble circuit topologies.
The framework couples Bayesian optimization with simulation feedback to ensure robust transistor sizing and iterative design regeneration.

ChipKG, exemplified in the AMSnet-KG system, is a knowledge graph-driven dataset and retrieval-augmented generation (RAG) framework for automating analog and mixed-signal (AMS) circuit design with LLMs. To mitigate LLM hallucinations in electronic design automation (EDA), ChipKG compiles AMS circuit netlists, schematics, and detailed multi-level annotations into a graph-structured database. As in knowledge graph modeling in other domains, its graph-centric approach combines curated data with automated reasoning to support topology synthesis, parameter optimization, and closed-loop design flows from human-provided specifications (Shi et al., 2024).

1. Knowledge Graph Construction and Dataset Organization

AMSnet-KG comprises 894 explicitly annotated AMS designs, including representative families such as five-transistor operational amplifiers, telescopic and cascode op-amps, strong-arm and double-tail latch comparators, bandgap references, low-dropout regulators (LDOs), and ADC front ends. Each design is supplied as a SPICE-compatible netlist (with .include statements, component lines, and SPICE commands such as .OP, .AC, .TRAN) capturing device names and connectivity prior to sizing. Extensions of the dataset include DC, AC, and transient testbenches covering common AMS metrics such as gain, gain-bandwidth product (GBW), phase margin (PM), common-mode rejection ratio (CMRR), power-supply rejection ratio (PSRR), offset, and propagation delay.

The annotation schema is partitioned into local (component-level) and global (circuit-level) descriptors:

Local annotation attaches pin functions, building-block labels (e.g., “differential pair,” “cascode current mirror”), expert sizing constraints (e.g., device symmetry, W/L ratios), and net labels for wiring automation.
Global annotation describes structural templates (“two-stage amplifier,” “cascode load”), qualitative performance tags (“high PSRR”), and pros/cons (e.g., “low phase margin”).

Entities within the knowledge graph comprise circuit nodes, testbench nodes, and string nodes (annotation keys/values). Relation types are implemented as simple, typed strings (e.g., uses, input, output, evaluates, hasConstraint), and data are stored and queried in a unified Neo4j instance. Triplets of the form ⟨entity₁, relation, entity₂⟩ serve as the fundamental edge representation. They are derived from annotated JSONs, which are programmatically collapsed into subgraphs to maximize structural consistency (see Shi et al., 2024, Fig. 9c).
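The derivation of triplets from an annotated record can be sketched as follows. This is a minimal illustration, not the AMSnet-KG code: the record layout, the circuit name `two_stage_opamp`, and the function `annotation_to_triplets` are assumptions, and only the relation names (`input`, `uses`, `hasConstraint`) come from the text above.

```python
from typing import Dict, List, Tuple, Union

Triplet = Tuple[str, str, str]
Annotation = Dict[str, Union[str, List[str]]]


def annotation_to_triplets(circuit: str, annotation: Annotation) -> List[Triplet]:
    """Flatten one annotated record into (entity1, relation, entity2) triplets."""
    triplets: List[Triplet] = []
    for relation, value in annotation.items():
        # A relation may point to one string node or to several.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triplets.append((circuit, relation, v))
    return triplets


# Hypothetical annotation record; the keys reuse relation names quoted in the text.
example: Annotation = {
    "input": "differential pair",
    "uses": ["cascode current mirror", "Miller compensation"],
    "hasConstraint": "W_M1 = W_M2",
}
triplets = annotation_to_triplets("two_stage_opamp", example)
for t in triplets:
    print(t)
```

Each printed tuple corresponds to one edge from a circuit node to a string node, which is the shape a graph loader would need.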

2. RAG-Based Retrieval and LLM-Driven Design Flow

The AMSgen framework operationalizes the graph via a RAG pipeline with four primary stages: (1) specification-driven prompting, (2) triplet extraction, (3) subgraph retrieval, and (4) netlist assembly. Instead of the vector embeddings used in standard RAG retrieval, the system indexes relation triplets directly. A Cypher query index targets properties such as “input=differential” and “load=current_mirror,” giving precise, deterministic lookups.

The LLM (GPT-4) receives in-context learning (ICL) and chain-of-thought (CoT) prompts to analyze input performance requirements, propose a circuit architecture at the building-block level, and output a set of relation triplets in a fixed format (see Figs. 11–12). The system then constructs a “query graph” from these triplets and executes a Neo4j pattern match (e.g., MATCH (c:circuit {input:'Differential'})-[:load]->…) to retrieve candidate subgraphs. Subsequent code-driven fusion yields a scaffolded, unsized SPICE netlist; the LLM is re-invoked only if sizing and simulation fail to meet the performance constraints.
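One way to turn LLM-emitted triplets into a deterministic graph query is sketched below. The graph schema here is an assumption made for illustration: the `StringNode` label, the `value` property, and the function `build_cypher` are hypothetical, and the real AMSnet-KG schema may differ. The sketch only builds the query text and its parameters; it does not connect to a database.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]


def build_cypher(triplets: List[Triplet]) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized Cypher pattern match from relation triplets.

    Assumed schema (hypothetical): circuit nodes carry the label `circuit`,
    annotation values are `StringNode` nodes with a `value` property.
    """
    clauses = ["MATCH (c:circuit)"]
    params: Dict[str, str] = {}
    for i, (_, relation, value) in enumerate(triplets):
        # Relation names are interpolated as text, so validate them first.
        if not relation.isidentifier():
            raise ValueError(f"unsafe relation name: {relation!r}")
        clauses.append(f"MATCH (c)-[:{relation}]->(n{i}:StringNode {{value: $v{i}}})")
        params[f"v{i}"] = value  # values travel as query parameters
    clauses.append("RETURN c")
    return "\n".join(clauses), params


query, params = build_cypher([
    ("query", "input", "differential"),
    ("query", "load", "current_mirror"),
])
print(query)
print(params)
```

Because every triplet becomes an exact pattern constraint, the same set of triplets always retrieves the same candidates, which is the determinism the text contrasts with embedding similarity search.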

3. Schematic Synthesis, Topology Assembly, and Testbench Integration

Netlist synthesis integrates modular building blocks (differential pairs, current mirrors, Miller compensation) according to graph-retrieved templates and local annotations. Wiring is programmatically directed by net labels, enabling correct linkage of stage outputs, bias nets, and compensation components. Testbenches are similarly mapped onto the assembled topology, with metrics and stimulus setups appropriately keyed to annotation tags.
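Label-directed wiring can be illustrated with a small sketch. The block templates, the port names, the prefixing scheme, and the `instantiate` helper below are all assumptions for illustration and are not taken from AMSgen. The idea shown is that ports named in a net-label map are tied to shared nets, while all other internal nodes stay private to their block.

```python
from typing import Dict, List

GLOBAL_NETS = {"vdd", "vss", "0"}  # supply and ground nets are never renamed

# Hypothetical unsized templates: name, drain, gate, source, bulk, model.
DIFF_PAIR = [
    "M1 outn inp tail vss nmos",
    "M2 outp inn tail vss nmos",
]
CS_STAGE = ["M6 out in vdd vdd pmos"]


def instantiate(block: List[str], net_labels: Dict[str, str], prefix: str) -> List[str]:
    """Copy a block template, wiring labeled ports to shared nets."""
    lines = []
    for line in block:
        name, *nodes, model = line.split()
        wired = []
        for node in nodes:
            if node in GLOBAL_NETS:
                wired.append(node)
            elif node in net_labels:
                wired.append(net_labels[node])   # port tied to a shared net
            else:
                wired.append(f"{prefix}_{node}")  # private internal net
        lines.append(" ".join([f"{name}_{prefix}", *wired, model]))
    return lines


# The first-stage output and the second-stage input share the net label "n1".
netlist = instantiate(DIFF_PAIR, {"outp": "n1"}, "s1") + instantiate(
    CS_STAGE, {"in": "n1", "out": "vout"}, "s2"
)
print("\n".join(netlist))
```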

The retrieved and assembled netlists remain unsized after topology synthesis. Subsequently, device parameters (length, width, fingers) are resolved via automated sizing methods downstream (see Section 4).

4. Transistor Sizing and Black-Box Bayesian Optimization

Device sizing is formalized as a black-box constrained optimization problem, with the objective of maximizing a figure-of-merit (FoM):

$\text{maximize} \ \mathrm{FoM}(x) \ \text{s.t.} \ g_j(x) \leq 0, \ j=1,\ldots,p$

where $x \in \mathbb{R}^d$ are device parameters (L, W, fingers, R, C). The FoM is a weighted sum:

$\mathrm{FoM}(x) = \sum_{i=1}^N w_i \frac{\min(f_i(x), f_i^{bnd}) - f_i^{min}}{f_i^{max} - f_i^{min}}$

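with $f_i(x)$ as simulated metrics and $w_i$ as metric importance weights. Each metric is clipped at its bound $f_i^{bnd}$, so exceeding the bound earns no extra credit, and is normalized by the range $f_i^{max} - f_i^{min}$.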
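The FoM formula can be evaluated directly, as in the sketch below. The weights, bounds, and normalization limits are hypothetical values chosen for illustration, and the metric names are assumptions; the sketch also assumes larger-is-better metrics, as the clipping by `min` implies.

```python
from typing import Dict


def fom(
    metrics: Dict[str, float],
    weights: Dict[str, float],
    bounds: Dict[str, float],
    f_min: Dict[str, float],
    f_max: Dict[str, float],
) -> float:
    """Weighted sum of clipped, range-normalized metrics."""
    total = 0.0
    for name, w in weights.items():
        clipped = min(metrics[name], bounds[name])  # no credit beyond the bound
        total += w * (clipped - f_min[name]) / (f_max[name] - f_min[name])
    return total


# Hypothetical normalization: both metrics scaled over 0..80 dB, unit weights.
value = fom(
    metrics={"gain_db": 66.2, "psrr_db": 91.8},
    weights={"gain_db": 1.0, "psrr_db": 1.0},
    bounds={"gain_db": 80.0, "psrr_db": 80.0},
    f_min={"gain_db": 0.0, "psrr_db": 0.0},
    f_max={"gain_db": 80.0, "psrr_db": 80.0},
)
print(value)
```

In this toy setting the PSRR term saturates at 1.0 because 91.8 dB exceeds its bound, while the gain term contributes 66.2/80.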

A Gaussian process (GP) prior is used for surrogate modeling (Equations 8 and 9 in Shi et al., 2024), and expected improvement (EI) over the best point observed so far, $x^+$, guides sequential sampling:

$a(x) = \mathbb{E}[\max(0, f(x) - f(x^+))]$
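When the surrogate's prediction at $x$ is Gaussian with mean $\mu$ and standard deviation $\sigma$, this expectation has the standard closed form $(\mu - f^+)\,\Phi(z) + \sigma\,\phi(z)$ with $z = (\mu - f^+)/\sigma$, where $f^+ = f(x^+)$. The sketch below evaluates it using only the standard library; the function name and arguments are illustrative, and the source does not state which form or library its implementation uses.

```python
import math


def expected_improvement(mu: float, sigma: float, f_best: float) -> float:
    """Closed-form EI for maximization under a Gaussian prediction."""
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(0.0, mu - f_best)
    z = (mu - f_best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return (mu - f_best) * cdf + sigma * pdf


# A candidate predicted to equal the incumbent still has positive EI
# because of its uncertainty.
print(expected_improvement(mu=1.0, sigma=1.0, f_best=1.0))
```

The first term rewards candidates whose predicted mean beats the incumbent, and the second rewards uncertainty, which is how EI balances exploitation against exploration.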

The process follows an initial random sampling phase, then alternates GP updates and EI-guided parameter selection (Algorithm 1 in Shi et al., 2024). Local annotation constraints (e.g., $W_{M1} = W_{M2}$, $L_{M3}/L_{M4} = 2$) are imposed to reduce dimensionality (typically from 19+ variables down to 10–15).
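The dimensionality reduction can be pictured as a mapping from a smaller vector of free variables to the full parameter set. The sketch below encodes the two example constraints from the text as ties of the form dependent = ratio × source; the `expand` helper, the dictionary layout, and the numeric sizes are assumptions for illustration.

```python
from typing import Dict, Tuple

# Each tie expresses: dependent = ratio * source (taken from local annotations).
TIES: Dict[str, Tuple[str, float]] = {
    "W_M2": ("W_M1", 1.0),  # W_M1 = W_M2 (device symmetry)
    "L_M3": ("L_M4", 2.0),  # L_M3 / L_M4 = 2
}


def expand(free: Dict[str, float], ties: Dict[str, Tuple[str, float]]) -> Dict[str, float]:
    """Map the optimizer's free variables to the full device-parameter set."""
    full = dict(free)
    for dependent, (source, ratio) in ties.items():
        full[dependent] = ratio * free[source]
    return full


# Hypothetical sizes in meters; the optimizer only searches over these two.
free_params = {"W_M1": 2.0e-6, "L_M4": 0.5e-6}
full_params = expand(free_params, TIES)
print(full_params)
```

The optimizer then searches only over `free_params`, so every tie removes one dimension from the GP's input space.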

Each BO iteration launches a Spectre simulation via a Python driver, returning metric values to be incorporated in GP updates. If no feasible solution is discovered after the budget is exhausted, the LLM is re-prompted for topology regeneration, and the process repeats (see Fig. 10 pipeline).

5. Simulation–Driven Feedback and Topology Regeneration

The design pipeline is inherently closed-loop: Bayesian optimization is tightly coupled with simulation, and performance feedback is used to trigger design revisions. If sizing fails, the LLM is prompted with the current (underperforming) topology and its achieved metrics and asked to generate an alternative topology more likely to yield a compliant solution. Building blocks are again retrieved from the knowledge graph, and full simulation–sizing–regeneration cycles proceed as needed. This bootstrapped loop improves robustness over prompt-only or dataset-only approaches.
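The control flow of this loop can be sketched with stand-ins for the expensive parts. In the sketch below, plain random search replaces Bayesian optimization, toy functions replace circuit simulation, and a fixed list of candidate topologies replaces LLM regeneration; all names and numbers are hypothetical and chosen only to make the loop runnable.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Metrics = Dict[str, float]
Simulator = Callable[[float], Metrics]


def meets_spec(metrics: Metrics, spec: Metrics) -> bool:
    """All metrics are treated as lower-bounded requirements."""
    return all(metrics[name] >= limit for name, limit in spec.items())


def size(simulate: Simulator, spec: Metrics, budget: int,
         rng: random.Random) -> Optional[Metrics]:
    """Random search standing in for BO: stop at the first feasible point."""
    for _ in range(budget):
        metrics = simulate(rng.random())
        if meets_spec(metrics, spec):
            return metrics
    return None  # budget exhausted without a feasible sizing


def design_loop(topologies: List[Tuple[str, Simulator]], spec: Metrics,
                budget: int, seed: int = 0) -> Optional[Tuple[str, Metrics]]:
    rng = random.Random(seed)
    for name, simulate in topologies:  # moving on = topology regeneration
        metrics = size(simulate, spec, budget, rng)
        if metrics is not None:
            return name, metrics
    return None


# Toy simulators: the first topology can never reach 80 dB, the second always does.
candidates = [
    ("two_stage", lambda x: {"gain_db": 60.0 + 10.0 * x}),
    ("telescopic_cascode", lambda x: {"gain_db": 80.5 + x}),
]
result = design_loop(candidates, spec={"gain_db": 80.0}, budget=20)
print(result[0] if result else "no feasible design")
```

The structure mirrors the text: an inner sizing loop bounded by a simulation budget, and an outer loop that changes topology only when that budget is exhausted.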

6. Case Studies and Metrics

Two canonical case studies illustrate typical ChipKG utility:

(i) Two-Stage Operational Amplifier

Initial specification: Gain >80 dB, CMRR >80 dB, PSRR >80 dB, GBW >10 MHz, PM >60°, CL = 100 pF.
Pipeline retrieves five-transistor differential pair op-amp, common-source gain stage, bias, and R–C compensation modules.
Free parameters: 19 (reduced to 15 via annotation-imposed constraints).
BO (with annotation) converges to FoM = 3.40: Gain = 66.2 dB, CMRR = 54.2 dB, PSRR = 69.8 dB.
Topology regeneration prompts the LLM to propose a telescopic cascode architecture, with subsequent BO meeting all specifications: Gain = 80.8 dB, CMRR = 99.0 dB, PSRR = 91.8 dB.
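The regeneration trigger in this case study can be checked by comparing the reported figures against the specification. The sketch below uses only the three metrics whose achieved values are quoted above (GBW and PM results are not given here); the `shortfalls` helper and the metric keys are illustrative names.

```python
from typing import Dict

SPEC = {"gain_db": 80.0, "cmrr_db": 80.0, "psrr_db": 80.0}  # lower bounds, in dB


def shortfalls(achieved: Dict[str, float], spec: Dict[str, float]) -> Dict[str, float]:
    """Return how far each failing metric falls below its requirement (dB)."""
    return {
        name: round(limit - achieved[name], 1)
        for name, limit in spec.items()
        if achieved[name] < limit
    }


first_pass = {"gain_db": 66.2, "cmrr_db": 54.2, "psrr_db": 69.8}   # two-stage result
regenerated = {"gain_db": 80.8, "cmrr_db": 99.0, "psrr_db": 91.8}  # telescopic cascode

print(shortfalls(first_pass, SPEC))
print(shortfalls(regenerated, SPEC))
```

The first sizing misses all three requirements, with CMRR furthest from target, while the regenerated topology leaves no shortfall on these metrics.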

(ii) Strong-Arm Latch Comparator

Specification: $f_{samp} = 1$ GHz, offset <100 μV, delay ≤1 ns, power ≤100 μW.
LLM differentiates between strong-arm and double-tail, choosing the former for optimal speed/power trade-off.
Sizing variables: 22 (reduced to 10 with symmetry constraints).
Final performance: offset = 35 μV, $t_{pd} = 11.9$ ps, power = [value garbled in source].

In both cases, convergence plots (Figs. 14, 16, 18) document accelerated and stabilized optimization trajectories when design and sizing constraints are enforced via multi-level annotation.

7. Open Access, Extensions, and Limitations

AMSnet-KG, the core instantiation of ChipKG in Shi et al. (2024), will be released under a permissive academic license upon publication. Proposed extensions include further expansion of circuit families (e.g., LDOs, ADC-SAR, filters), addition of corners and Monte Carlo simulations, and PDK parameter integration. Noted limitations include extraction bottlenecks due to reliance on printed schematics (96% recall) and simulation expense in BO (≈2000 SPICE runs per design). The LLM's quantitative, node-specific performance priors remain weak, necessitating iterative topology regeneration.

Key figures in Shi et al. (2024) highlight the ChipKG methodology and rationale:

Fig. 1: System overview (netlist, annotation, graph schema)
Figs. 13, 15, 17: Example block retrieval and assembled topologies
Figs. 14, 16, 18: BO convergence plots

The ChipKG paradigm, as realized in AMSnet-KG, demonstrates a unified pipeline that merges dense AMS circuit knowledge with LLM-driven synthesis and classic BO, closing the loop from high-level specifications to SPICE-verified, annotation-constrained netlists with minimal manual intervention (Shi et al., 2024).

References

Shi et al. (2024). AMSnet-KG: A Netlist Dataset for LLM-based AMS Circuit Auto-Design Using Knowledge Graph RAG.
