Probability-Weighted Control Laws
- Probability-weighted control laws are mathematical strategies that blend probability measures with deterministic control to manage uncertainty in dynamic systems.
- They employ statistical models and probabilistic computations to optimize decision-making processes and improve system reliability in variable environments.
- Recent advancements demonstrate these laws can enhance test-case generation and fault tolerance, bridging theory with practical applications in system design.
Report: LLMCFG-TGen: Using LLM-Generated Control Flow Graphs to Automatically Create Test Cases from Use Cases
- Formal Definition of the CFG Representation
1.1 Control Flow Graph (CFG) as a Mathematical Object
We define a Control Flow Graph (CFG) as a directed graph G = (V, E, v₀, V_exit) where
* V is a finite set of nodes (basic steps),
* E ⊆ V × V is a set of directed edges (control transfers),
* v₀ ∈ V is the unique start node (root),
* V_exit ⊆ V is the set of exit (leaf) nodes.
Each node v ∈ V is labeled with a natural-language statement extracted from a use-case step (main, alternative, or exception). An edge (u, v) ∈ E represents a possible transition from step u to step v. Conditional edges are annotated with guard conditions g(u, v) ∈ {“true‐branch”, “false‐branch”} when they arise from an if-then-else or branching sentence.
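The tuple G = (V, E, v₀, V_exit) maps naturally onto a small data structure. A minimal Python sketch follows; the class name, field names, and the `successors` helper are illustrative, not the tool's actual representation:

```python
from dataclasses import dataclass

@dataclass
class CFG:
    """G = (V, E, v0, V_exit) with optional guard labels on edges."""
    nodes: dict   # node id -> natural-language statement (V)
    edges: list   # (from_id, to_id, guard) triples (E); guard is None, "true", or "false"
    root: str     # v0
    exits: set    # V_exit

    def successors(self, node_id):
        """Outgoing (target, guard) pairs of a node, in edge order."""
        return [(v, g) for (u, v, g) in self.edges if u == node_id]

# Example: the login check from the appendix figure
cfg = CFG(
    nodes={"n1": "Start: user opens menu",
           "n2": "Is user logged in?",
           "n3": "Show login screen",
           "n4": "Display dashboard"},
    edges=[("n1", "n2", None),
           ("n2", "n3", "false"),
           ("n2", "n4", "true")],
    root="n1",
    exits={"n3", "n4"},
)
```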
1.2 Correspondence of V and E to Use-Case Elements
- A main-flow sentence “System displays menu” becomes a node vᵢ ∈ V.
- A conditional sentence “If the user is authenticated, then proceed; otherwise show error” yields one node for the condition check and two outgoing edges: one to the “proceed” node (true‐branch) and one to the “show error” node (false‐branch).
- Alternative and exception flows are linked back into the main flow at specified join points.
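Assuming the JSON shape named in Prompt #1 (only the top-level 'nodes' and 'edges' fields are given by the method; the per-entry schema here is a guess), the conditional sentence above might serialize as:

```python
import json

# Guessed serialization of the authentication sentence in 1.2; only the
# top-level 'nodes' and 'edges' fields come from the method description.
raw = """
{
  "nodes": [
    {"id": "n1", "text": "Check whether the user is authenticated"},
    {"id": "n2", "text": "Proceed"},
    {"id": "n3", "text": "Show error"}
  ],
  "edges": [
    {"from": "n1", "to": "n2", "label": "true"},
    {"from": "n1", "to": "n3", "label": "false"}
  ]
}
"""
cfg_json = json.loads(raw)
```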
1.3 Formulas for Coverage and Path Enumeration
Given G, the set of all root-to-exit paths P can be enumerated. Let |P| denote the number of distinct paths. In the absence of loops, if the fan-outs at the k distinct decision nodes are d₁, d₂, …, d_k, then

|P| = ∏_{i=1}^{k} dᵢ.
Branch Coverage BC is defined as
BC = |E_exercised| / |E|.
Full path coverage implies BC = 1 and 100% node coverage.
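Both formulas check out numerically; in this sketch `path_count` and `branch_coverage` are hypothetical helper names:

```python
from math import prod

def path_count(fanouts):
    """|P| = d1 * d2 * ... * dk for a loop-free CFG with k decision nodes."""
    return prod(fanouts)

def branch_coverage(exercised_edges, all_edges):
    """BC = |E_exercised| / |E|."""
    return len(set(exercised_edges)) / len(set(all_edges))

# Three binary decisions -> 2 * 2 * 2 = 8 root-to-exit paths
all_edges = [("a", "b"), ("b", "c"), ("b", "d")]
bc = branch_coverage([("a", "b"), ("b", "c")], all_edges)  # 2 of 3 edges exercised
```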
- LLM-Based CFG Construction Algorithm
2.1 Prompt Engineering: Prompt #1
Prompt #1 to the LLM comprises five parts:
1. Role Definition (“You are a software-engineering expert.”)
2. Task Instructions (how to extract steps and branches)
3. Algorithm Specification (see Algorithm 1 below)
4. Output Specification (“Return valid JSON with fields ‘nodes’ and ‘edges’.”)
5. Input: the raw NL use-case description.
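The five parts might be assembled as a template like the following; the wording is a paraphrase, not the paper's actual prompt text:

```python
# Paraphrased five-part prompt template (role, task, algorithm, output
# spec, input); the exact wording used by the tool is not reproduced here.
PROMPT_1_TEMPLATE = """\
You are a software-engineering expert.

Task: extract the steps and branch points of the use case below and
connect them into a control flow graph, following the algorithm:
map each step to a node; for each conditional step, emit one
true-branch edge and one false-branch edge.

Output: return valid JSON with fields 'nodes' and 'edges', nothing else.

Use case:
{use_case}
"""

def build_prompt(use_case_text):
    return PROMPT_1_TEMPLATE.format(use_case=use_case_text)
```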
2.2 Pseudocode for CFG Generation (Algorithm 1) Algorithm 1 formalizes the mapping of use-case steps into CFG nodes and edges.
Algorithm 1: CFG Generation
Input: Ordered list of use-case steps S₁…Sₙ (main + alternative + exception flows)
Output: CFG G = (V, E), root v₀
- V ← {S₁, S₂, …, Sₙ}
- v₀ ← S₁
- E ← ∅
- For i from 1 to n−1 do
  4.1 If S_{i+1} is a conditional step then
      Add edge (Sᵢ → S_{i+1}) with label “true”,
      Add edge (Sᵢ → S_{i+2}) with label “false”,
      Skip the next index accordingly.
  4.2 Else
      Add edge (Sᵢ → S_{i+1}).
- Return G = (V, E, v₀).
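Algorithm 1 transcribes directly into Python; the function name is illustrative, and conditional steps are assumed to be identified by a set of 0-based indices:

```python
def build_cfg(steps, conditional):
    """Sketch of Algorithm 1. `steps` is the ordered list S1..Sn of step
    texts; `conditional` is a set of 0-based indices of conditional steps."""
    V = list(steps)
    v0 = steps[0]
    E = []
    i, n = 0, len(steps)
    while i < n - 1:
        if (i + 1) in conditional:
            # Step 4.1: true-branch edge to the conditional step,
            # false-branch edge to the step after it
            E.append((steps[i], steps[i + 1], "true"))
            if i + 2 < n:
                E.append((steps[i], steps[i + 2], "false"))
            i += 2  # "skip next index accordingly"
        else:
            # Step 4.2: plain sequential edge
            E.append((steps[i], steps[i + 1], None))
            i += 1
    return V, E, v0
```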
2.3 Post-Processing and Validation
After generation, the JSON is parsed and validated by checking:
* No isolated nodes (each v ∈ V appears in at least one edge).
* All non-root nodes have ≥ 1 incoming edge.
* Every edge’s “from” and “to” IDs exist in V.
On failure, the tool re-prompts the LLM until a valid CFG is returned.
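The three validation checks amount to a short function; a minimal sketch with illustrative names:

```python
def validate_cfg(nodes, edges, root):
    """Return the list of violated 2.3 checks (empty list = valid).
    `nodes` is a set of node IDs, `edges` a list of (from_id, to_id) pairs."""
    errors = []
    # Check 1: no isolated nodes
    endpoints = {u for u, v in edges} | {v for u, v in edges}
    for n in nodes:
        if n not in endpoints:
            errors.append(f"isolated node: {n}")
    # Check 2: every non-root node has at least one incoming edge
    incoming = {v for u, v in edges}
    for n in nodes:
        if n != root and n not in incoming:
            errors.append(f"non-root node without incoming edge: {n}")
    # Check 3: edge endpoints must exist in V
    for u, v in edges:
        if u not in nodes or v not in nodes:
            errors.append(f"edge references unknown node: ({u}, {v})")
    return errors
```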
- Path Enumeration Technique
3.1 Depth-First Search (DFS) with Cycle Pruning Starting at root v₀, a recursive DFS collects all simple paths to exit nodes in V_exit or until a node is revisited (to avoid infinite loops). Conditions on edges are recorded as separate path items.
3.2 Pseudocode for Test-Path Extraction (Algorithm 2)
Algorithm 2: Test-Path Extraction
Input: Verified CFG G = (V, E)
Output: Set of paths P, each a list of (node, condition)

procedure DFS(curr, path):
    if curr ∈ path then record path in P (cycle entry) and return
    append curr to path
    if curr ∈ V_exit then record path in P and return
    for each outgoing edge (curr → nbr) with label cond do
        DFS(nbr, path ∪ [cond])

Call DFS(v₀, []). Translate each vᵢ to its NL statement; inject conditions as statements.
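A runnable Python version of Algorithm 2's DFS, with the same cycle-pruning rule (revisiting a node records the partial path and stops); the names are illustrative:

```python
def extract_paths(edges, root, exits):
    """DFS from `root` collecting root-to-exit paths as (node, guard) lists;
    revisiting a node records the partial path and stops (cycle pruning).
    `edges` is a list of (from_id, to_id, guard) triples."""
    succ = {}
    for u, v, guard in edges:
        succ.setdefault(u, []).append((v, guard))
    paths = []

    def dfs(curr, path, guard_in):
        if any(n == curr for n, _ in path):
            paths.append(path)            # cycle entry: record and return
            return
        path = path + [(curr, guard_in)]
        if curr in exits or curr not in succ:
            paths.append(path)            # reached an exit (or dead end)
            return
        for nbr, guard in succ[curr]:
            dfs(nbr, path, guard)

    dfs(root, [], None)
    return paths
```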
3.3 Complexity In a DAG with branching factor d and depth h, the worst-case path count is O(d^h). DFS thus may be exponential in the number of decision nodes, but practical use cases remain tractable.
- Test Case Generation from Paths
4.1 Prompt #2 for Test-Case Creation Prompt #2 guides the LLM to format each path into an abstract test case with: Title, Preconditions, Step 1…n with expected result. It includes one illustrative example.
4.2 Mapping Rules and Heuristics
- Title derived from the use-case name plus a branch summary.
- Preconditions accumulate any “if” guards encountered.
- Steps list each NL statement in the path.
- Expected results are inferred from system responses in the statements.
No additional concrete inputs are generated; the result is an abstract test case.
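The mapping rules above admit a compact sketch; `path_to_test_case` and its title heuristic are illustrative, not the tool's exact heuristics:

```python
def path_to_test_case(use_case_name, path):
    """Apply the 4.2 mapping rules to one extracted path.
    `path` is a list of (statement, guard) pairs; guards become
    preconditions, statements become steps."""
    guards = [g for _, g in path if g is not None]
    branch_summary = "/".join(guards) if guards else "main flow"
    return {
        "title": f"{use_case_name} [{branch_summary}]",
        "preconditions": guards,
        "steps": [stmt for stmt, _ in path],
    }
```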
- Evaluation and Metrics
5.1 Metrics Definitions
Precision, Recall, and F1 for nodes/edges:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 · Precision · Recall / (Precision + Recall)
Normalized Graph Edit Distance (nGED): nGED(G₁, G₂) = 1 − GED(G₁, G₂) / (|V₁|+|E₁|+|V₂|+|E₂|)
Discrepancy Rate (DR): DR% = (N_diff / N_UC) * 100%
Avg.|Δ|: Avg.|Δ| = (1/N_UC) Σ |LLM_path_i − GT_path_i|
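The metric definitions translate directly into code; function names are illustrative:

```python
def prf1(tp, fp, fn):
    """Precision, Recall, F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

def nged(ged, size1, size2):
    """nGED = 1 - GED / (|V1|+|E1|+|V2|+|E2|); size_i = |V_i| + |E_i|."""
    return 1 - ged / (size1 + size2)

def discrepancy_rate(n_diff, n_uc):
    """DR% = N_diff / N_UC * 100."""
    return 100.0 * n_diff / n_uc
```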
5.2 Quantitative Results
RQ1 (CFG Accuracy, threshold = 0.75)
- Node F1_avg = 0.895; Edge F1_avg = 0.761; nGED_avg = 0.933.
RQ2 (Test-Case Coverage)
- LLMCFG-TGen DR = 2.38%; Avg.|Δ| = 0.02.
- Baselines: Direct LLM DR=57.0%, Avg.|Δ|=1.12; AGORA DR=33.3%, Avg.|Δ|=0.45.
RQ3 (Practitioner Ratings over 20 UCs, 113 cases)
Average Likert (5-point) scores:
- Relevance: AGORA = 4.25, LLMCFG-TGen = 4.75
- Completeness: AGORA = 3.84, LLMCFG-TGen = 4.64
- Correctness: AGORA = 3.74, LLMCFG-TGen = 4.51
- Clarity: AGORA = 3.73, LLMCFG-TGen = 4.48
RQ4 (LLM Comparison for CFGs)
| Model | Nodes vs. 304 | Paths vs. 103 | DR% | Avg. \|Δ\| | Node F1 | Edge F1 | nGED | Time (s) |
|--------------------|--------------:|--------------:|-------:|-----------:|--------:|--------:|------:|--------:|
| GPT-4o             | 308 | 103 |  2.30% | 0.02 | 0.895 | 0.761 | 0.933 |  336 |
| Gemini 2.5 Flash   | 314 | 110 | 16.67% | 0.17 | 0.862 | 0.683 | 0.912 | 1006 |
| LLaMA4 Scout-Inst. | 314 | 106 | 11.90% | 0.17 | 0.865 | 0.696 | 0.909 |  540 |
- Practitioners’ Assessment and Case Studies
6.1 Study Protocol Four senior practitioners compared side-by-side test suites from AGORA and LLMCFG-TGen, blinded to method. Each test case was scored on four 5-point Likert dimensions and qualitative feedback was collected.
6.2 Key Qualitative Findings
- LLMCFG-TGen test cases were praised for logical consistency (“Steps follow exactly the use-case flow”), comprehensiveness (“No missing edge cases”), and readability (“Clear titles and preconditions”).
- Senior engineers noted occasional verbosity and suggested a concise checklist mode.
- All agreed LLMCFG-TGen reduced manual modeling effort substantially.
- Discussion and Future Work
7.1 Limitations
- Single-use-case processing – no batch handling of related cases.
- No built-in test prioritization or ranking.
- Generates abstract, not executable, test scripts.
7.2 Future Directions
- Extend to batch and inter-use-case CFG generation (include «include»/«extend» relations).
- Add path-level priority annotations for test-case ranking.
- Integrate concrete data generation and script templates for end-to-end executable tests.
- Incorporate human-in-the-loop refinement and multimodal inputs (diagrams, images).
Appendix: Example CFG in TikZ (Sample)
\begin{verbatim}
\begin{tikzpicture}[->, node distance=1.5cm]
  \node (n1) [start]                       {Start: User opens menu};
  \node (n2) [decision, below of=n1]       {Is user logged in?};
  \node (n3) [action,   below left of=n2]  {Show login screen};
  \node (n4) [action,   below right of=n2] {Display dashboard};
  \node (n5) [terminal, below of=n4]       {End};
  \path (n1) edge (n2)
        (n2) edge node[left]{false}  (n3)
        (n2) edge node[right]{true}  (n4)
        (n3) edge (n2)  % alternative: user logs in, then back
        (n4) edge (n5);
\end{tikzpicture}
\end{verbatim}
This report synthesizes the method and its comprehensive evaluation, demonstrating that LLMCFG-TGen effectively bridges NL requirements and systematic test-case generation.