
Structural Validity Rate (SVR) Overview

Updated 28 December 2025
  • Structural Validity Rate (SVR) is a metric that quantifies the structural integrity of FlexScript code by assessing process graph connectivity and correct object declarations.
  • SVR combines a Connection Score and an Object Score using fixed weights, penalizing missing or incorrect links and declarations in simulation scripts.
  • Empirical evaluations demonstrate near-perfect SVR values in advanced models, emphasizing its key role in validating digital twin systems for industrial applications.

Structural Validity Rate (SVR) is a quantitative evaluation metric introduced by the authors of "Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems" (Hsu et al., 23 Dec 2025) to assess the structural integrity of FlexScript code generated by vision-language models for industrial simulation systems. SVR measures whether the topology and object declarations within a generated script faithfully reproduce the required process graph and its constituent components, supporting multimodal benchmarking across diverse simulation layouts.

1. Formal Definition and Mathematical Construction

SVR is defined as a weighted sum of two sub-scores: the Connection Score (CS) and the Object Score (OS). Each is formalized as follows:

  • Let $N$ denote the total number of ground-truth “contextdragconnection(src, dst)” statements, and $M$ the number of those connections correctly reproduced in the generated script. The Connection Score is:

$$CS = \frac{M}{N}$$

  • Let $K$ represent the total number of distinct objects required (e.g., sources, queues, processors), and $K'$ the number whose type and name match the ground truth exactly (case-sensitive). The Object Score is:

$$OS = \frac{K'}{K}$$

  • SVR combines these via fixed weights, prioritizing connectivity:

$$SVR = 0.6 \cdot CS + 0.4 \cdot OS$$

This construction ensures higher penalization for incorrect or missing link topology, reflecting its criticality in process graph validity.
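As a minimal worked example of the weighted sum above, the following sketch computes SVR directly from the four counts. The function name, its parameter names, and the example counts are hypothetical, and the convention of scoring an empty reference as 1.0 is borrowed from the procedure in Section 3.

```python
def svr_from_counts(m, n, k_prime, k, w_conn=0.6, w_obj=0.4):
    """Weighted SVR from raw counts (hypothetical helper).

    m / n:        matched / required connections
    k_prime / k:  matched / required object declarations
    """
    cs = m / n if n > 0 else 1.0        # Connection Score
    os_ = k_prime / k if k > 0 else 1.0  # Object Score
    return w_conn * cs + w_obj * os_


# Hypothetical layout: 9 of 10 links reproduced, all 5 objects declared correctly.
score = svr_from_counts(m=9, n=10, k_prime=5, k=5)
print(round(score, 4))  # 0.94
```

With the same counts swapped (all links correct, 4 of 5 objects correct) the score is 0.92, which shows the object term carrying the smaller weight.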

2. Component Interpretation and Structural Constraints

A valid structure under SVR is assessed against two constraints:

  • Connectivity: The model must reproduce every “contextdragconnection(src, dst)” present in the reference script, with exact source and destination identifiers. A missing or permuted link lowers $M$; a spurious link earns no credit, since it matches no ground-truth connection.
  • Declaration Correctness: Each required object must be declared by both the correct type (e.g., “/source,” “/processor”) and an exact (case-sensitive) match of its name. Deviations in type or naming, including case discrepancies, decrease $K'$.

Violation detection is explicit:

  • Missing or mismatched connections decrease $M$, which is equivalent to increasing $N - M$.
  • Incorrect or absent object declarations decrease $K'$, which is equivalent to increasing $K - K'$.
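The two constraints can be illustrated with plain set differences. This is a sketch under assumptions, not the authors' tooling: connections are represented as hypothetical (src, dst) tuples, objects as hypothetical (type, name) tuples, and the function name and example layout are invented for illustration.

```python
def structural_diff(pred_conn, gt_conn, pred_obj, gt_obj):
    """List which connections and objects differ from the ground truth."""
    return {
        "missing_connections": sorted(gt_conn - pred_conn),
        "spurious_connections": sorted(pred_conn - gt_conn),
        "missing_objects": sorted(gt_obj - pred_obj),
        "unexpected_objects": sorted(pred_obj - gt_obj),
    }


# Hypothetical three-object line: Source1 -> Queue1 -> Processor1
gt_conn = {("Source1", "Queue1"), ("Queue1", "Processor1")}
gt_obj = {("source", "Source1"), ("queue", "Queue1"), ("processor", "Processor1")}

# Prediction with one reversed link and one lowercase name
pred_conn = {("Source1", "Queue1"), ("Processor1", "Queue1")}
pred_obj = {("source", "Source1"), ("queue", "queue1"), ("processor", "Processor1")}

report = structural_diff(pred_conn, gt_conn, pred_obj, gt_obj)
for kind, items in report.items():
    print(kind, items)
```

Because tuple comparison is exact, the reversed link shows up as one missing and one spurious connection, and the case mismatch in "queue1" shows up as one missing and one unexpected object.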

3. Evaluation Procedure and Implementation

SVR is computed per FlexScript instance using the ground-truth data structures $GT_{conn}$ for connections and $GT_{obj}$ for object declarations via the following procedure:

def compute_SVR(predicted_script, GT_conn, GT_obj):
    # extract_connections / extract_objects are parser helpers (see note below)
    # 1. Parse predicted_script for all contextdragconnection calls
    Pred_conn = set(extract_connections(predicted_script))
    # 2. Parse predicted_script for object declarations
    Pred_obj = set(extract_objects(predicted_script))
    # 3. Connection score: fraction of ground-truth links reproduced
    M = len(Pred_conn & set(GT_conn))
    N = len(set(GT_conn))
    CS = M / N if N > 0 else 1.0  # no required links: treated as satisfied
    # 4. Object score: fraction of required objects declared exactly
    Kp = len(Pred_obj & set(GT_obj))
    K = len(set(GT_obj))
    OS = Kp / K if K > 0 else 1.0
    # 5. Combine with the fixed 0.6 / 0.4 weights
    return 0.6 * CS + 0.4 * OS

Parsing (“extract_connections” and “extract_objects”) may be implemented via regular expressions or a FlexScript-specific parser targeting the relevant DSL command patterns.
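A regular-expression version of the connection parser might look like the sketch below. It assumes each call is written literally as contextdragconnection(src, dst), optionally followed by further arguments, and that both arguments are plain identifiers; arguments that are themselves function calls would need a real parser. The pattern and the example script are illustrative assumptions. An object-declaration parser is omitted because the declaration syntax is not shown here.

```python
import re

# Assumed call shape: contextdragconnection(src, dst) or
# contextdragconnection(src, dst, <more args>), with plain identifiers.
_CONN_RE = re.compile(
    r'contextdragconnection\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*[,)]'
)


def extract_connections(script):
    """Return the set of (src, dst) pairs found in the script text."""
    return {(m.group(1), m.group(2)) for m in _CONN_RE.finditer(script)}


# Hypothetical script fragment
example = '''
contextdragconnection(Source1, Queue1, "A");
contextdragconnection( Queue1 , Processor1 );
'''
print(sorted(extract_connections(example)))
# [('Queue1', 'Processor1'), ('Source1', 'Queue1')]
```

Returning a set means a connection repeated in the script is counted once, which matches the set-intersection step in the procedure above.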

4. Empirical Results and Model Comparisons

SVR was used extensively for ablation and benchmarking across model architectures. The following empirical results were reported for both text-only and multimodal model variants:

| Model Variant | Encoder/Connector | Best SVR Value |
|---|---|---|
| TinyLLaMA-1.1B (Text-only) | - | 0.9444 |
| StarCoder2-7B (Text-only) | - | 0.9905 |
| TinyLLaMA-1.1B+CLIP | Linear Projection | 0.8911 |
| TinyLLaMA-1.1B+OpenCLIP | Linear Projection | 0.9408 |
| StarCoder2-7B+CLIP | Linear Projection | 0.9958 |
| StarCoder2-7B+OpenCLIP | Two-Layer MLP | 0.9990 |

“Near-perfect SVR” (>0.99) denotes almost flawless reproduction of both connection topology and object declarations in FlexScript, meaning topological errors are rare in the evaluated industrial layout scripts.

5. Contextualization: SVR vs. PMR and ESR

SVR serves a distinct role among the three core metrics proposed:

  • SVR (Structural Validity Rate): Assesses the shape and topology of the process graph—connectivity and presence of required objects.
  • PMR (Parameter Match Rate): Evaluates parameter fidelity, focusing on the exact match of names, numerical, or distributional values (e.g., exponential(10)).
  • ESR (Execution Success Rate): Measures end-to-end executability—whether the generated script compiles and runs in FlexSim without errors.

SVR is the definitive metric for structural correctness; failures in topology (omitted or misplaced connections) render process simulation logic invalid irrespective of perfect parameter values.

6. Thresholds, Scaling Behaviors, and Usage Guidance

SVR is reported as a continuous value in the range $[0, 1]$ without an explicit pass/fail threshold. For practical deployment, a cutoff such as $SVR \geq 0.95$ may be adopted to define “valid” outputs. The metric normalizes by layout size: because $CS$ and $OS$ are ratios, a single missed connection costs $0.6/N$ and a single incorrect object costs $0.4/K$, so each individual error weighs less as the count of required connections ($N$) and objects ($K$) grows, while larger layouts offer more opportunities for error.
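The interaction between a cutoff and layout size can be checked with a few lines of arithmetic. The sketch below uses the Section 1 weights and the illustrative 0.95 cutoff; the function names and layout sizes are hypothetical, and all objects are assumed correct so that only missed connections affect the score.

```python
def svr_after_missed_links(n_required, n_missed):
    """SVR when n_missed of n_required links are absent and all objects match."""
    cs = (n_required - n_missed) / n_required
    return 0.6 * cs + 0.4 * 1.0


def is_valid(svr, threshold=0.95):
    """Apply an illustrative pass/fail cutoff to a continuous SVR value."""
    return svr >= threshold


for n in (10, 40):
    score = svr_after_missed_links(n, 1)
    print(n, round(score, 4), is_valid(score))
# 10 0.94 False
# 40 0.985 True
```

Under these assumptions, one missed link fails the cutoff in a 10-link layout but passes in a 40-link layout, so a fixed threshold tolerates more absolute errors as layouts grow.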

In the referenced study, SVR was tracked across approximately 6,000 examples representing a 5% test split, covering a diverse array of industrial layouts (linear, U-shaped, parallel, etc.). Consistency of SVR scores throughout this range demonstrates metric robustness with scaling and dataset complexity.

7. Practical Implications for Generative Digital Twin Systems

SVR provides a direct, quantitative safeguard in generative simulation pipelines, ensuring that outputs exhibit high-fidelity reproduction of industrial process topologies and object inventories. As generative models increasingly supplant manual scripting in industrial settings, SVR’s discriminative sensitivity to both connectivity and declaration accuracy makes it indispensable for automatic validation and error localization in complex cross-modal generation tasks.
