Structural Validity Rate (SVR) Overview
- Structural Validity Rate (SVR) is a metric that quantifies the structural integrity of FlexScript code by assessing process graph connectivity and correct object declarations.
- SVR combines a Connection Score and an Object Score using fixed weights, penalizing missing or incorrect links and declarations in simulation scripts.
- Empirical evaluations demonstrate near-perfect SVR values in advanced models, emphasizing its key role in validating digital twin systems for industrial applications.
Structural Validity Rate (SVR) is a quantitative evaluation metric introduced by the authors of "Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems" (Hsu et al., 23 Dec 2025) to assess the structural integrity of FlexScript code generated by vision-language models for industrial simulation systems. SVR measures whether the topology and object declarations within a generated script faithfully reproduce the required process graph and its constituent components, supporting rigorous multimodal benchmarking across diverse simulation layouts.
1. Formal Definition and Mathematical Construction
SVR is defined as a weighted sum of two sub-scores: the Connection Score (CS) and the Object Score (OS). Each is formalized as follows:
- Let $N$ denote the total number of ground-truth “contextdragconnection(src, dst)” statements, and $M$ the number of those connections correctly reproduced in the generated script. The Connection Score is:

  $$\mathrm{CS} = \frac{M}{N}$$

- Let $K$ represent the total number of distinct objects required (e.g., sources, queues, processors), and $K'$ the number whose type and name match the ground truth exactly (case-sensitive). The Object Score is:

  $$\mathrm{OS} = \frac{K'}{K}$$

- SVR combines these via fixed weights, prioritizing connectivity:

  $$\mathrm{SVR} = 0.6 \cdot \mathrm{CS} + 0.4 \cdot \mathrm{OS}$$
This construction ensures higher penalization for incorrect or missing link topology, reflecting its criticality in process graph validity.
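As a worked illustration with made-up counts (not taken from the paper), suppose a reference script has 5 connections, of which 4 are reproduced, and 4 required objects, of which 3 are declared with the exact type and name. With the 0.6 and 0.4 weights, the scores work out as:

$$
\mathrm{CS} = \frac{4}{5} = 0.8, \qquad \mathrm{OS} = \frac{3}{4} = 0.75, \qquad \mathrm{SVR} = 0.6 \cdot 0.8 + 0.4 \cdot 0.75 = 0.48 + 0.30 = 0.78
$$

In this example the single missing connection costs $0.6/5 = 0.12$, while the single incorrect declaration costs $0.4/4 = 0.10$.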
2. Component Interpretation and Structural Constraints
A valid structure under SVR is assessed against two constraints:
- Connectivity: The model must reproduce every “contextdragconnection(src, dst)” present in the reference script, with exact source and destination identifiers. Missing, spurious, or permuted links count as errors and decrement $M$ accordingly.
- Declaration Correctness: Each required object must be declared with both the correct type (e.g., “/source,” “/processor”) and an exact (case-sensitive) match of its name. Deviations in type or naming, including case discrepancies, decrease $K'$.
Violation detection is explicit:
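- Missing or spurious connections lower the matched connection count $M$ relative to the required count $N$.
- Incorrect or absent object declarations lower the matched object count $K'$ relative to the required count $K$.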
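One way to surface these violations for error localization is to compare the predicted and ground-truth sets directly. The sketch below is illustrative only: it assumes connections are already available as (src, dst) pairs and objects as (type, name) pairs, and the helper name `diff_structure` and all sample identifiers are hypothetical rather than taken from the paper.

```python
def diff_structure(pred, gt):
    """Split two sets into (matched, missing, spurious) items."""
    matched = pred & gt    # present in both prediction and reference
    missing = gt - pred    # required by the reference but not predicted
    spurious = pred - gt   # predicted but absent from the reference
    return matched, missing, spurious


# Hypothetical reference and predicted connections; the second predicted
# link has its source and destination swapped (a permuted link).
gt_conn = {("Source1", "Queue1"), ("Queue1", "Processor1"), ("Processor1", "Sink1")}
pred_conn = {("Source1", "Queue1"), ("Processor1", "Queue1"), ("Processor1", "Sink1")}
conn_matched, conn_missing, conn_spurious = diff_structure(pred_conn, gt_conn)

# Hypothetical object declarations; the predicted queue name differs only in case.
gt_obj = {("/source", "Source1"), ("/queue", "Queue1")}
pred_obj = {("/source", "Source1"), ("/queue", "queue1")}
obj_matched, obj_missing, obj_spurious = diff_structure(pred_obj, gt_obj)

print("missing connections:", conn_missing)
print("spurious connections:", conn_spurious)
print("missing objects:", obj_missing)
```

A permuted link shows up twice in this view: the reference direction is reported as missing and the reversed direction as spurious. The case-mismatched name is likewise reported as a missing object, consistent with the case-sensitive matching rule.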
3. Evaluation Procedure and Implementation
SVR is computed per FlexScript instance using the ground-truth data structures GT_conn for connections and GT_obj for object declarations via the following procedure:
```
function compute_SVR(predicted_script, GT_conn, GT_obj):
    # 1. Parse predicted_script for all contextdragconnection calls
    Pred_conn = extract_connections(predicted_script)
    # 2. Parse predicted_script for object declarations
    Pred_obj = extract_objects(predicted_script)
    # 3. Connection score
    M = cardinality(intersection(Pred_conn, GT_conn))
    N = cardinality(GT_conn)
    CS = M / N if N > 0 else 1.0
    # 4. Object score
    Kp = cardinality(intersection(Pred_obj, GT_obj))
    K = cardinality(GT_obj)
    OS = Kp / K if K > 0 else 1.0
    # 5. Combine
    SVR = 0.6 * CS + 0.4 * OS
    return SVR
```
Parsing (“extract_connections” and “extract_objects”) may be implemented via regular expressions or a FlexScript-specific parser targeting the relevant DSL command patterns.
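A minimal Python sketch of the regular-expression route might look like the following. It covers connection extraction only and assumes each call has the shape contextdragconnection(src, dst), optionally with further arguments, where src and dst are plain tokens with no nested calls. Object declarations are passed in as pre-extracted (type, name) pairs because their concrete syntax is not given here. The regex, the sample script, and all identifiers are assumptions for illustration, not the authors' implementation.

```python
import re

# Assumption: connections look like contextdragconnection(src, dst[, ...]);
# src and dst are simple tokens. Nested expressions would need a real parser.
CONN_RE = re.compile(
    r"contextdragconnection\(\s*([^\s,()]+)\s*,\s*([^\s,()]+)\s*(?:,[^()]*)?\)"
)


def extract_connections(script):
    """Collect (src, dst) pairs from every matching call in the script text."""
    return {(m.group(1), m.group(2)) for m in CONN_RE.finditer(script)}


def compute_svr(pred_conn, gt_conn, pred_obj, gt_obj):
    """Return 0.6 * CS + 0.4 * OS; an empty ground-truth set scores 1.0."""
    cs = len(pred_conn & gt_conn) / len(gt_conn) if gt_conn else 1.0
    obj_score = len(pred_obj & gt_obj) / len(gt_obj) if gt_obj else 1.0
    return 0.6 * cs + 0.4 * obj_score


# Hypothetical predicted script: one two-argument call and one call with an
# extra argument, which the regex tolerates and ignores.
script = """
contextdragconnection(Source1, Queue1);
contextdragconnection(Queue1, Processor1, "A");
"""
gt_conn = {("Source1", "Queue1"), ("Queue1", "Processor1"), ("Processor1", "Sink1")}
gt_obj = {("/source", "Source1"), ("/queue", "Queue1")}
pred_obj = {("/source", "Source1"), ("/queue", "Queue1")}

pred_conn = extract_connections(script)
svr = compute_svr(pred_conn, gt_conn, pred_obj, gt_obj)
print(sorted(pred_conn), round(svr, 4))
```

In this example two of the three reference connections and both objects match, so the score is 0.6 · (2/3) + 0.4 · 1, or about 0.8. Using sets means duplicate calls in the generated script are counted once.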
4. Empirical Results and Model Comparisons
SVR was used extensively for ablation and benchmarking across model architectures. The following empirical results were reported for both text-only and multimodal model variants:
| Model Variant | Encoder/Connector | Best SVR Value |
|---|---|---|
| TinyLLaMA-1.1B | (Text-only) | 0.9444 |
| StarCoder2-7B | (Text-only) | 0.9905 |
| TinyLLaMA-1.1B+CLIP | Linear Projection | 0.8911 |
| TinyLLaMA-1.1B+OpenCLIP | Linear Projection | 0.9408 |
| StarCoder2-7B+CLIP | Linear Projection | 0.9958 |
| StarCoder2-7B+OpenCLIP | Two-Layer MLP | 0.9990 |
“Near-perfect SVR” (>0.99) denotes almost flawless reproduction of both connection topology and object declarations in FlexScript, resulting in no topological errors for the evaluated industrial layout scripts.
5. Contextualization: SVR vs. PMR and ESR
SVR serves a distinct role among the three core metrics proposed:
- SVR (Structural Validity Rate): Assesses the shape and topology of the process graph—connectivity and presence of required objects.
- PMR (Parameter Match Rate): Evaluates parameter fidelity, requiring exact matches of parameter names and of numerical or distributional values (e.g., exponential(10)).
- ESR (Execution Success Rate): Measures end-to-end executability—whether the generated script compiles and runs in FlexSim without errors.
SVR is the definitive metric for structural correctness; failures in topology (omitted or misplaced connections) render process simulation logic invalid irrespective of perfect parameter values.
6. Thresholds, Scaling Behaviors, and Usage Guidance
SVR is reported as a continuous value in the range $[0, 1]$ without an explicit pass/fail threshold. For practical deployment, a fixed SVR cutoff may be adopted to define “valid” outputs. The metric is normalized by layout size: each missing connection costs $0.6/N$ and each incorrect object declaration costs $0.4/K$, so the penalty for a single error shrinks as the counts of required connections ($N$) and objects ($K$) grow, while scores remain comparable across layouts of different complexity.
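The dependence of a single error's cost on layout size, and the use of a cutoff, can be sketched as follows. The function names and the cutoff value are hypothetical: no threshold is prescribed here, so the 0.95 below is purely illustrative.

```python
def single_error_cost(n_required, weight):
    """SVR lost when exactly one of n_required items is wrong."""
    return weight / n_required


def is_valid(svr, cutoff):
    """Apply a user-chosen cutoff to one SVR value."""
    return svr >= cutoff


# Cost of one missing connection (weight 0.6) for layouts of growing size.
for n in (5, 20, 100):
    print(n, round(single_error_cost(n, 0.6), 4))

HYPOTHETICAL_CUTOFF = 0.95  # illustrative only, not a value from the source
print(is_valid(0.96, HYPOTHETICAL_CUTOFF), is_valid(0.90, HYPOTHETICAL_CUTOFF))
```

Because one error on a small layout can cost more than one error on a large layout, a single global cutoff tolerates more absolute errors on larger layouts. That is a design choice to weigh when picking the threshold.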
In the referenced study, SVR was tracked across approximately 6,000 examples representing a 5% test split, covering a diverse array of industrial layouts (linear, U-shaped, parallel, etc.). Consistency of SVR scores throughout this range demonstrates metric robustness with scaling and dataset complexity.
7. Practical Implications for Generative Digital Twin Systems
SVR provides a direct, quantitative safeguard in generative simulation pipelines, ensuring that outputs exhibit high-fidelity reproduction of industrial process topologies and object inventories. As generative models increasingly supplant manual scripting in industrial settings, SVR’s discriminative sensitivity to both connectivity and declaration accuracy makes it indispensable for automatic validation and error localization in complex cross-modal generation tasks.