Core Refined Understanding eXpression (CRUX)
- CRUX is a grammar-based intermediate representation defined by three key components: Module Interface, Core Functions, and Key Considerations.
- It employs a two-stage learning process combining supervised and reinforcement methods to optimize Verilog generation accuracy and reduce semantic drift.
- CRUX’s modular design enables its use as a plug-in artifact for various models, improving specification-to-RTL tasks across hardware synthesis benchmarks.
Core Refined Understanding eXpression (CRUX) is a grammar-based, structured intermediate representation designed to bridge the semantic gap between open-ended natural language hardware specifications and domain-specific Verilog implementations. Developed as part of the QiMeng-CRUX framework, CRUX organizes essential design intent and constraints to enable precise, robust, and transferable code generation for hardware description languages, particularly Verilog (Huang et al., 25 Nov 2025).
1. Formal Structure and Components
CRUX is formally defined as a three-component template:
- $c_{\mathrm{MI}}$ (Module Interface): specifies all module ports and parameters.
- $c_{\mathrm{CF}}$ (Core Functions): states the essential state-transition and data-flow logic central to the hardware's function.
- $c_{\mathrm{KC}}$ (Key Considerations): enumerates critical, often subtle constraints, such as reset policies and timing assumptions.
A CRUX instance $c$ for a natural-language prompt $x$ factorizes into three contiguous spans:

$$c = (c_{\mathrm{MI}},\; c_{\mathrm{CF}},\; c_{\mathrm{KC}})$$

Each span $c_{\mathrm{MI}}$, $c_{\mathrm{CF}}$, $c_{\mathrm{KC}}$ is restricted by a context-free grammar. Tokens are drawn from a compact intermediate vocabulary tailored for expressiveness and slot-filling semantics.
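As an illustrative sketch (the class and field names are hypothetical, not from the paper), the three-span factorization can be represented as a simple ordered container:

```python
from dataclasses import dataclass

@dataclass
class CruxExpression:
    """Hypothetical container for the three contiguous CRUX spans."""
    module_interface: str    # ports and parameters
    core_functions: str      # state-transition / data-flow logic
    key_considerations: str  # reset policies, timing assumptions, etc.

    def serialize(self) -> str:
        # Concatenate the spans in their fixed order.
        return "\n".join(
            [self.module_interface, self.core_functions, self.key_considerations]
        )

crux = CruxExpression(
    module_interface="module counter (input clk, input rst, output count [3:0]);",
    core_functions="Core Functions: Compute count+1 given clk rising edge",
    key_considerations="Key Considerations: synchronous active-high reset",
)
print(crux.serialize().count("\n"))  # 2
```

The fixed span order matters: downstream decoding conditions the Verilog output on the serialized CRUX string as a whole.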
2. Syntax and Semantic Specification
CRUX expressions must conform to a concise BNF/EBNF, which ensures unambiguous, minimal, and fully structured encoding of hardware semantics. The top-level template is as follows:
<CRUX> ::= <ModuleInterface> <CoreFunctions> <KeyConsiderations>
<ModuleInterface> ::= "module" <ModuleName> "(" <PortList> ");"
<PortList> ::= <PortDecl> { "," <PortDecl> }
<PortDecl> ::= <Direction> <SignalName> [ "[" <Width> "]" ]
<Direction> ::= "input" | "output" | "inout"
<CoreFunctions> ::= "Core Functions:" <BehaviorList>
<BehaviorList> ::= <Behavior> { ";" <Behavior> }
<Behavior> ::= <StateClause> | <DataflowClause>
<StateClause> ::= "From" <StateName> ":" <ConditionActionList>
<ConditionActionList> ::= <ConditionAction> { "," <ConditionAction> }
<ConditionAction> ::= "On input" <Signal> "=" <Value> "," "transition to" <StateName>
<DataflowClause> ::= "Compute" <Expr> "given" <Inputs>
<KeyConsiderations> ::= "Key Considerations:" <ConstraintList>
<ConstraintList> ::= <Constraint> { ";" <Constraint> }
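A minimal sketch of how one production of this grammar could be checked mechanically; the regular expression below covers only the `<ModuleInterface>` rule (a full parser would follow the whole EBNF), and the function name is illustrative:

```python
import re

# Port declaration per the grammar: <Direction> <SignalName> [ "[" <Width> "]" ]
PORT = r"(?:input|output|inout)\s+\w+(?:\s*\[[^\]]+\])?"

# <ModuleInterface> ::= "module" <ModuleName> "(" <PortList> ");"
MODULE_IFACE = re.compile(
    rf"^module\s+\w+\s*\(\s*{PORT}(?:\s*,\s*{PORT})*\s*\);$"
)

def is_valid_module_interface(text: str) -> bool:
    """Return True if `text` matches the <ModuleInterface> production."""
    return MODULE_IFACE.match(text.strip()) is not None

print(is_valid_module_interface(
    "module counter (input clk, input rst, output count [3:0]);"
))  # True
print(is_valid_module_interface("module m (clk);"))  # False: direction missing
```

Note that the grammar places the optional width *after* the signal name, so the checker follows the CRUX production rather than concrete Verilog port syntax.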
3. Joint Supervised and Reinforcement Learning Framework
CRUX underlies a two-stage learning process:
Stage I: Joint Expression Modeling (JEM)
- Uses a dataset $\mathcal{D}$ of triples consisting of natural-language prompts $x$, CRUX expressions $c$, and reference Verilog $y$.
- Minimizes the supervised cross-entropy loss:

$$\mathcal{L}_{\mathrm{JEM}} = -\,\mathbb{E}_{(x,c,y)\sim\mathcal{D}}\left[\log p_\theta(c \mid x) + \log p_\theta(y \mid x, c)\right]$$
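The JEM objective above can be sketched numerically; the per-sequence log-probabilities here are synthetic stand-ins for real model outputs, and the function name is illustrative:

```python
# Stage-I JEM objective: a single cross-entropy loss over the CRUX span c
# and the Verilog span y, both conditioned (directly or indirectly) on x.

def jem_loss(logprob_c_given_x, logprob_y_given_xc):
    """Negative log-likelihood of (c, y) under the factorized model:
    L = -[log p(c|x) + log p(y|x,c)], averaged over a batch."""
    batch = list(zip(logprob_c_given_x, logprob_y_given_xc))
    return -sum(lc + ly for lc, ly in batch) / len(batch)

# Two synthetic examples with per-sequence log-probabilities.
loss = jem_loss([-2.0, -1.5], [-4.0, -3.5])
print(round(loss, 2))  # 5.5
```

Because the loss factorizes, the model is trained to emit the CRUX first and then the Verilog conditioned on it, matching the two-stage decode used at inference.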
Stage II: Dual-Space Optimization (DSO) via CRUX-Enhanced GRPO
- Further optimizes the policy via a policy-gradient method that rewards (a) the functional correctness of the generated Verilog and (b) the usefulness of the generated CRUX.
- Code reward $R_{\mathrm{code}}$: scores the functional correctness of the generated Verilog.
- CRUX reward $R_{\mathrm{crux}}$: scores the usefulness of the generated CRUX.
- Total reward: $R = \alpha\,R_{\mathrm{code}} + \beta\,R_{\mathrm{crux}}$, with weighting coefficients $\alpha$ and $\beta$.
- Optimization employs Group Relative Policy Optimization (GRPO) without a KL penalty, which increases output diversity.
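The reward combination and the group-relative normalization at the heart of GRPO can be sketched as follows; the weights and function names are illustrative assumptions, not values from the paper:

```python
# Sketch of the DSO reward and GRPO-style group-relative advantages:
# each sampled rollout's reward is normalized against the mean and std
# of its sampling group; no KL penalty term is added.

def total_reward(code_ok: bool, crux_useful: bool,
                 alpha: float = 1.0, beta: float = 0.5) -> float:
    # alpha, beta are illustrative weights, not values from the paper.
    return alpha * float(code_ok) + beta * float(crux_useful)

def group_relative_advantages(rewards):
    """Zero-mean, unit-scale advantages within one sampling group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

group = [total_reward(True, True), total_reward(True, False),
         total_reward(False, False), total_reward(False, True)]
print(group)  # [1.5, 1.0, 0.0, 0.5]
advs = group_relative_advantages(group)
print(round(sum(advs), 6))  # 0.0 (advantages are zero-mean by construction)
```

Normalizing within the group rather than against a learned value baseline is what makes GRPO critic-free; rollouts that beat their group mean are reinforced, the rest are suppressed.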
4. Inference and Decoding Procedure
At inference time, a two-stage decode is performed:
- CRUX Decoding: $\hat{c} = \arg\max_{c}\, p_\theta(c \mid x)$
- Verilog Decoding: $\hat{y} = \arg\max_{y}\, p_\theta(y \mid x, \hat{c})$
In practice, either beam search or sampling of multiple candidates per stage is employed for computing pass@k metrics. The structured, concise CRUX format produces a highly concentrated distribution for Verilog generation, reducing semantic drift and increasing correctness.
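The two-stage decode loop and the standard unbiased pass@k estimator (with $n$ samples of which $c$ pass, $\mathrm{pass@}k = 1 - \binom{n-c}{k}/\binom{n}{k}$) can be sketched as follows; the driver function and its parameters are hypothetical:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    draws (without replacement) from n samples is among the c passing."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def two_stage_decode(prompt, decode_crux, decode_verilog, num_samples=5):
    """Hypothetical driver: sample a CRUX first, then condition the
    Verilog candidate on it (stage 1: c ~ p(c|x); stage 2: y ~ p(y|x,c))."""
    candidates = []
    for _ in range(num_samples):
        crux = decode_crux(prompt)
        candidates.append(decode_verilog(prompt, crux))
    return candidates

print(round(pass_at_k(n=10, c=3, k=1), 2))  # 0.3
```

For pass@1 the estimator reduces to the pass fraction $c/n$, which is why single-sample accuracy is the headline metric in the benchmarks below.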
5. Benchmark Performance and Empirical Results
QiMeng-CRUX-V was evaluated on VerilogEval-V1, VerilogEval-V2 (both CC and SR tracks), RTLLM-V1, and RTLLM-V2 benchmarks, primarily using the pass@1 metric. Key results include:
| Model (Setting) | VerilogEval-V2 (SR) | RTLLM-V2 |
|---|---|---|
| Baseline (OriGen-7B) | 49.3% | — |
| +RealSpec only (robustness) | 53.2% | — |
| +CRUX (structure) | 59.6% | — |
| QiMeng-CRUX-V-SFT | 59.6% | — |
| QiMeng-CRUX-V-Final (full DSO) | 64.7% (+15.4%) | 63.8% (+12.9%) |
| Qwen2.5-Coder | — | 50.9% |
Ablations indicate that RealSpec and the structured CRUX representation provide orthogonal improvements (+8.4% and +6.4%, respectively) on VerilogEval-V2 SR (Spec-to-RTL) tasks. Stage II optimizations with CRUX-reward benefit both functionality and informativeness, increasing SR from 59.6% (no CRUX-reward) to 64.7% (with CRUX-reward).
CRUX also demonstrates notable transferability: appending the learned CRUX to arbitrary code models, such as Qwen2.5-Coder, boosts pass@1 for SR from 22.0% (“Only CRUX”) to 35.9%, and further to 37.8% when augmented with the original description (“Des+CRUX”). This suggests CRUX acts as a model-agnostic semantic scaffold.
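A minimal sketch of the two transfer settings above; the function name, mode strings, and prompt layout are illustrative assumptions:

```python
def build_prompt(description: str, crux: str, mode: str = "des+crux") -> str:
    """Assemble a prompt for an off-the-shelf code model, either from the
    CRUX alone ('only_crux') or from the original description augmented
    with the CRUX ('des+crux'). Names are illustrative."""
    if mode == "only_crux":
        return crux                       # CRUX replaces the description
    return f"{description}\n\n{crux}"     # CRUX augments the description

p = build_prompt("Design a 4-bit counter.", "module counter (...);")
print(p.startswith("Design"))  # True
```

Because the CRUX is plain structured text, no weights of the downstream model need to change: the transfer is purely at the prompt level.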
6. Significance and Reusability
CRUX offers a principled, grammar-based, three-slot scaffold that captures user intent, enables joint modeling of intermediate intent and code, and supports RL-based sharpening of code-generation distributions. Its main empirical effect is a marked reduction in semantic drift and a higher success rate on specification-to-RTL tasks. The modular structure of CRUX allows its use as a plug-in artifact for other LLMs without retraining, providing broad architectural transferability and immediate improvements in code-generation quality. The approach achieves state-of-the-art results across all major Verilog code-generation benchmarks and sets a new technical standard for prompt engineering and intermediate representation in hardware synthesis via LLMs (Huang et al., 25 Nov 2025).