VeriPB Proof Checking Toolchain
- VeriPB is a double-chain verification toolchain that integrates independent primary analysis and secondary checking to ensure trustworthiness in safety-critical systems.
- It rigorously validates B-method models and pseudo-Boolean proofs by rederiving each inference through separate toolchains like ProB and pyB.
- The toolchain scales to advanced applications such as symmetry breaking and multi-objective MaxSAT, optimizing performance in complex combinatorial proofs.
The VeriPB proof checking toolchain constitutes an independent, double-chain framework for certifying the correctness of combinatorial reasoning, especially for high-integrity model checking, pseudo-Boolean (PB) proof logging, and advanced SAT-based optimization. By integrating a formally distinct secondary checker (pyB) alongside primary analyzers (such as ProB or symmetry-breaking SAT solvers), VeriPB implements the stringent double-chain principle mandated by safety-critical standards like EN 50128 (Class T3), and provides a scalable infrastructure for proof auditing across a range of domains including B-method models and multi-objective MaxSAT (Witulski et al., 2014, Jabs et al., 29 Jan 2025, Anders et al., 20 Nov 2025).
1. Double-Chain Architecture and Workflow
The VeriPB toolchain is architected around two fully independent components, with all formal models and results independently processed and mutually checked at multiple levels:
- Primary Analysis: A domain-specific tool (e.g., ProB for B, satsuma for SAT, Scuttle for MaxSAT) parses, type-checks, and computes properties or solutions over input models, outputting traces or proof scripts.
- Secondary Verification (pyB / VeriPBChecker): An independently implemented Python-based interpreter (pyB) or generic PB checker (VeriPBChecker) parses the same models and results, performing type assignment (via Hindley–Milner unification in B) and big-step semantic evaluation, or replaying proof scripts for PB reasoning.
- Result Comparison: If disagreement occurs between primary and secondary outputs, an alarm is raised; otherwise, solutions are accepted as “double-checked.”
This approach underpins trustworthiness for safety-critical deployments, as the independent toolchains drastically reduce the likelihood of coincident errors (Witulski et al., 2014).
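The result-comparison step can be sketched in a few lines of Python. This is a minimal illustration, not the toolchain's actual code: the names `double_check` and `VerificationAlarm`, and the idea of keying results by strings such as `"invariant@state0"`, are assumptions made for the example.

```python
from typing import Dict, Hashable, List


class VerificationAlarm(Exception):
    """Raised when the two chains disagree on at least one result."""


def double_check(primary: Dict[str, Hashable],
                 secondary: Dict[str, Hashable]) -> List[str]:
    """Return the keys confirmed by both chains; raise on any mismatch.

    A key missing from either side counts as a disagreement, because a
    result produced by only one chain has not been independently checked.
    """
    disagreements = sorted(
        key
        for key in primary.keys() | secondary.keys()
        if key not in primary
        or key not in secondary
        or primary[key] != secondary[key]
    )
    if disagreements:
        raise VerificationAlarm("chains disagree on: " + ", ".join(disagreements))
    return sorted(primary)


if __name__ == "__main__":
    # Hypothetical result keys; both chains agree, so both keys are returned.
    primary = {"invariant@state0": True, "invariant@state1": True}
    secondary = {"invariant@state0": True, "invariant@state1": True}
    print(double_check(primary, secondary))
```

Treating a missing key as a disagreement is a deliberate choice in this sketch: silently accepting results that only one chain produced would defeat the purpose of the second chain.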
2. Formal Underpinnings: Syntax, Typing, and Semantics
VeriPB’s correctness guarantees derive from a clear delineation of the formal systems being checked:
- B-Method Models: Input specifications comprise B predicates and expressions, with pyB reconstructing types using Hindley–Milner unification and evaluating semantics using big-step evaluation on reconstituted environments. Quantifiers are checked either by explicit enumeration or by offloading constraints to external solvers if the domain is large or infinite (Witulski et al., 2014).
- Pseudo-Boolean Proofs: For PB problems, the checker’s core is a replay engine that validates each step (axiom, linear combination, division, strengthening, redundancy elimination, and solution logging) using a normalized internal representation of all constraints. Preorders for advanced applications (e.g., symmetry breaking or multi-objective optimization) are encoded directly in the loaded problem order (Jabs et al., 29 Jan 2025, Anders et al., 20 Nov 2025).
Formally, a result is only “double-checked” if every inference or state produced by the primary chain is independently re-derived or verified by the secondary program under fresh, isolated semantics (e.g., Prolog/Java vs. Python).
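To make the big-step evaluation of B predicates with explicit quantifier enumeration more concrete, the following is a minimal sketch. The tuple-based syntax tree, the tag names, and the `evaluate` function are assumptions made for illustration; they are not pyB's internal representation, and type reconstruction is omitted.

```python
from typing import Any, Dict, Tuple

Env = Dict[str, Any]
Node = Tuple[Any, ...]


def evaluate(node: Node, env: Env) -> Any:
    """Big-step evaluation: each call returns the final value of `node` in `env`."""
    tag = node[0]
    if tag == "int":
        return node[1]
    if tag == "var":
        return env[node[1]]
    if tag == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if tag == "le":
        return evaluate(node[1], env) <= evaluate(node[2], env)
    if tag == "and":
        return evaluate(node[1], env) and evaluate(node[2], env)
    if tag == "not":
        return not evaluate(node[1], env)
    if tag in ("forall", "exists"):
        _, name, domain, body = node
        # Explicit enumeration: only viable for small finite domains.
        results = (evaluate(body, {**env, name: value}) for value in domain)
        return all(results) if tag == "forall" else any(results)
    raise ValueError(f"unknown node tag: {tag!r}")


if __name__ == "__main__":
    # Predicate: for all x in 0..3, x + y <= 10
    predicate = ("forall", "x", range(4),
                 ("le", ("add", ("var", "x"), ("var", "y")), ("int", 10)))
    print(evaluate(predicate, {"y": 7}))  # holds, since 3 + 7 <= 10
    print(evaluate(predicate, {"y": 8}))  # fails at x = 3
```

The quantifier case extends the environment with one binding per domain element, which is why large or infinite domains have to be handed to an external solver instead.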
3. Methodologies for Proof Generation and Auditing
B-Method Double-Checking
The canonical VeriPB chain for B machines proceeds as:
- Model Publication: Author supplies a .mch file.
- Primary Run: ProB computes states, witnesses, or trajectories, exporting variable assignments and traces in text/JSON dumps.
- Secondary Verification: pyB re-parses the model, checks types, reconstructs state environments, and re-evaluates all predicates and substitutions.
- Comparison: Disagreement in predicate or invariant evaluation triggers a verification alarm.
A result is trusted for deployment only if both chains accept all states, properties, and solutions (Witulski et al., 2014).
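The secondary verification and comparison steps might look as follows for a JSON state dump. The dump layout (`states`, `id`, `vars`, `invariant_ok`) is hypothetical, and the invariant is passed in as a plain Python callable standing in for the re-evaluation of B predicates; neither reflects the actual ProB or pyB file formats.

```python
import json
from typing import Any, Callable, Dict, List


def recheck_dump(dump_text: str,
                 invariant: Callable[[Dict[str, Any]], bool]) -> List[int]:
    """Return the ids of states whose recomputed verdict differs from the dump."""
    dump = json.loads(dump_text)
    mismatches = []
    for state in dump["states"]:
        # Re-evaluate the invariant from the raw variable values only.
        secondary_verdict = invariant(state["vars"])
        if secondary_verdict != state["invariant_ok"]:
            mismatches.append(state["id"])
    return mismatches


if __name__ == "__main__":
    # Hypothetical dump: the primary chain claims the invariant holds in both states.
    dump_text = json.dumps({
        "states": [
            {"id": 0, "vars": {"x": 1, "limit": 3}, "invariant_ok": True},
            {"id": 1, "vars": {"x": 5, "limit": 3}, "invariant_ok": True},
        ]
    })
    alarms = recheck_dump(dump_text, lambda v: 0 <= v["x"] <= v["limit"])
    print(alarms)  # state 1 violates 0 <= x <= limit, so it is reported
```

A non-empty result corresponds to the verification alarm: the secondary chain could not confirm what the primary chain reported.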
Pseudo-Boolean Proof Logging
For PB problems—including advanced symmetry breaking and MO-MaxSAT—VeriPB’s workflow engages an explicit proof logger within the solver itself, emitting a .veripb proof script containing every significant transformation (axioms, cuts, solution loggings, etc.). The independent checker then validates each proof line, maintaining normalized constraint maps and order witness checks at every inference (Jabs et al., 29 Jan 2025, Anders et al., 20 Nov 2025).
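The core of such a replay engine can be illustrated with a small cutting-planes sketch over normalized constraints. The `Constraint` class, the `~x` notation for negated literals, and the tuple-based step format are assumptions chosen for the example; they do not reproduce the real proof syntax, and only addition and division are covered.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


def negate(lit: str) -> str:
    return lit[1:] if lit.startswith("~") else "~" + lit


@dataclass
class Constraint:
    """Normalized PB constraint: sum(coeff * literal) >= degree, all coeffs > 0."""
    terms: Dict[str, int]
    degree: int


def add(c1: Constraint, c2: Constraint) -> Constraint:
    terms = dict(c1.terms)
    degree = c1.degree + c2.degree
    for lit, coeff in c2.terms.items():
        terms[lit] = terms.get(lit, 0) + coeff
    # Restore normal form: cancel opposing literals using x + ~x = 1.
    for lit in list(terms):
        neg = negate(lit)
        if lit in terms and neg in terms:
            cancel = min(terms[lit], terms[neg])
            degree -= cancel
            for l in (lit, neg):
                terms[l] -= cancel
                if terms[l] == 0:
                    del terms[l]
    return Constraint(terms, degree)


def divide(c: Constraint, d: int) -> Constraint:
    if d <= 0:
        raise ValueError("divisor must be positive")
    ceil_div = lambda a: -(-a // d)  # integer ceiling division
    return Constraint({lit: ceil_div(a) for lit, a in c.terms.items()},
                      ceil_div(c.degree))


def replay(axioms: Sequence[Constraint], steps: Sequence[Tuple]) -> List[Constraint]:
    """Re-derive every step; indices refer to positions in the growing database."""
    database = list(axioms)
    for step in steps:
        if step[0] == "add":
            derived = add(database[step[1]], database[step[2]])
        elif step[0] == "divide":
            derived = divide(database[step[1]], step[2])
        else:
            raise ValueError(f"unsupported rule: {step[0]!r}")
        database.append(derived)
    return database


if __name__ == "__main__":
    axioms = [Constraint({"x": 1, "y": 1}, 1), Constraint({"x": 1, "~y": 1}, 1)]
    db = replay(axioms, [("add", 0, 1), ("divide", 2, 2)])
    print(db[-1])  # x >= 1, derived from (x + y >= 1) and (x + ~y >= 1)
```

Keeping every constraint in normalized form (positive coefficients over literals) is what makes rounding up in `divide` sound, which is one reason a checker would maintain such a representation throughout.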
4. Advanced Applications: Symmetry Breaking and Pareto-Optimality
Certified Symmetry Breaking
Certifying symmetry breaking in SAT and PB often requires justifying lexicographic (lex-leader) constraints of the form:
$x \preceq_{\mathrm{lex}} \sigma(x)$, where $\sigma$ is a symmetry of the problem and $x = (x_1, \dots, x_n)$ is the ordered sequence of variables.
Encoding these using big integers is prohibitively expensive for large $n$, because the coefficients grow exponentially with the number of variables. The auxiliary-variable method in VeriPB introduces new Boolean variables to encode the lexicographic order through reified constraints, yielding orders-of-magnitude improvements in proof size and checking speed. This approach avoids the blow-up of big-integer encodings and the corresponding bottlenecks in proof auditability (Anders et al., 20 Nov 2025).
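The difference between the two encodings of a lexicographic comparison can be seen in a small sketch. Both functions below decide whether one 0/1 vector is lexicographically no larger than another (most significant position first). The chain of "prefix equal" flags is a generic textbook-style illustration of the auxiliary-variable idea and is not claimed to be the exact encoding used in the cited work; all function names are hypothetical.

```python
from itertools import product
from typing import Sequence


def lex_leq_big_integer(x: Sequence[int], y: Sequence[int]) -> bool:
    """Single constraint with coefficients 2^(n-1), ..., 2, 1."""
    n = len(x)
    return sum(2 ** (n - 1 - i) * (y[i] - x[i]) for i in range(n)) >= 0


def lex_leq_auxiliary(x: Sequence[int], y: Sequence[int]) -> bool:
    """Chain of small constraints linked by one auxiliary flag per position."""
    prefix_equal = True  # auxiliary variable: x and y agree on all earlier positions
    for xi, yi in zip(x, y):
        # Reified constraint: prefix_equal implies xi <= yi (unit coefficients only).
        if prefix_equal and xi > yi:
            return False
        prefix_equal = prefix_equal and xi == yi
    return True


def largest_big_integer_coefficient(n: int) -> int:
    return 2 ** (n - 1)


if __name__ == "__main__":
    n = 4
    # Brute-force check that both encodings agree on every pair of assignments.
    agree = all(
        lex_leq_big_integer(x, y) == lex_leq_auxiliary(x, y)
        for x in product((0, 1), repeat=n)
        for y in product((0, 1), repeat=n)
    )
    print(agree)
    # Number of decimal digits of the largest coefficient for 1000 variables.
    print(len(str(largest_big_integer_coefficient(1000))))
```

The auxiliary version never needs a coefficient larger than a small constant, whereas the single-constraint version needs numbers with hundreds of digits once the symmetry touches on the order of a thousand variables.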
Pareto-Optimality in Multi-Objective MaxSAT
VeriPB supports proof logging for MO-MaxSAT despite not natively supporting multi-objective semantics, by leveraging custom “order” preorders. The solver (Scuttle) emits proof logs wherein every transformation and Pareto-dominance preserving inference is recorded. The checker validates, for each redundancy (RBC) rule and solution-logging step, that every minimal point in the non-dominated set is correctly represented and excluded only by valid decision steps—never inventing non-existent minimal points (Jabs et al., 29 Jan 2025).
An outline of a proof fragment (for bi-objective MaxSAT):
    load_order x1 x2 ... x5 u1 v1 u2 v2
    ...
    1 redundance C1 witness={...}
    2 divide 1 11
    ...
    5 redundance C3 witness={...}
    6 log_solution x1 ¬x2 x3 ¬x4 ¬x5
    ...
The checker enforces, via its order-preorder encodings, that all inferential steps respect the modeled Pareto relations.
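The Pareto relations that the preorder has to model can be stated compactly in code. The sketch below assumes minimization of every objective and represents each solution only by its cost vector; the function names and the reading of a PD-cut as "exclude everything weakly dominated by a logged solution" are simplifying assumptions for illustration, not the checker's implementation.

```python
from typing import List, Sequence, Tuple

Cost = Tuple[int, ...]


def weakly_dominates(a: Cost, b: Cost) -> bool:
    """a is at least as good as b in every objective (minimization)."""
    return all(ai <= bi for ai, bi in zip(a, b))


def dominates(a: Cost, b: Cost) -> bool:
    """a weakly dominates b and is strictly better in at least one objective."""
    return weakly_dominates(a, b) and any(ai < bi for ai, bi in zip(a, b))


def non_dominated(points: Sequence[Cost]) -> List[Cost]:
    """Pareto front: points not dominated by any other point."""
    unique = set(points)
    return sorted(p for p in unique if not any(dominates(q, p) for q in unique))


def excluded_by_cut(candidate: Cost, logged_solution: Cost) -> bool:
    """A cut derived from a logged solution removes what it weakly dominates."""
    return weakly_dominates(logged_solution, candidate)


if __name__ == "__main__":
    costs = [(1, 5), (2, 2), (3, 3), (5, 1), (2, 6)]
    print(non_dominated(costs))             # [(1, 5), (2, 2), (5, 1)]
    print(excluded_by_cut((3, 3), (2, 2)))  # True: (3, 3) is no better anywhere
    print(excluded_by_cut((1, 5), (2, 2)))  # False: (1, 5) is better in objective 1
```

A sound cut may remove a candidate only when a logged solution is at least as good in every objective, which is why the incomparable point `(1, 5)` survives the cut derived from `(2, 2)`.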
5. Integration, Tooling, and Performance
Command-Line Interfaces
Both the B-method and proof-logging flows are fully automatable via CLI scripts:
- probcli and pybcli for animation, model exploration, and dump verification in B.
- In-solver proof loggers for PB and MaxSAT; proof scripts fed directly into VeriPBChecker for replay.
Typical flags include model input, proof dump outputs, external solver specification, enumeration bounding, and execution timeouts (Witulski et al., 2014).
Empirical Performance
Industrial case studies show that pyB rechecks 32–42 B states per machine in ≈5 minutes; parsing and retyping take ≈2 s per 50 states, and predicate evaluation ranges from <10 ms on small sets to ≈7 s for power sets of size 20 (Witulski et al., 2014). In MO-MaxSAT, proof logging adds 14–29% overhead to solver runtime. Checking with VeriPBChecker takes 20–50× the solver runtime, and larger proof scripts (more PD-cuts) take longer to check. No proof-logging run required more than 1 h, whereas proof checking took up to 10 h on the largest benchmarks (Jabs et al., 29 Jan 2025).
| Aspect | B-Method (pyB) | PB/MaxSAT (VeriPBChecker) |
|---|---|---|
| Primary tool | ProB | Scuttle, satsuma |
| Secondary chain | pyB (Python) | VeriPBChecker (PB format) |
| Typical overhead | ≈2s per 50 states | 14–29% (logging); ×20–50 (check) |
| Scaling limits | Set enumeration, external | Proof size, PD-cuts, core boosts |
6. Limitations and Future Enhancements
Identified constraints include pyB’s performance on large set enumerations, lack of support for certain external functions, and brute-force quantifier handling on infinite domains. For PB reasoning, proof size may balloon when symmetry touches many variables or many PD-cuts are needed. Planned advancements encompass:
- Integration of SMT-based constraint-solvers (e.g. Z3) for quantifier discharge in pyB.
- Migration of pyB to RPython + PyPy JIT for improved performance.
- Full CLI integration of verification steps into interactive modeling environments.
- Automatic regression-test generation for rare-case evaluation paths.
- For PB/MaxSAT, by contrast, no proof-format extensions are planned: they are deliberately avoided, and all multi-objective adaptations are encoded using a single loaded preorder (Witulski et al., 2014, Jabs et al., 29 Jan 2025).
A plausible implication is that further improvements in symbolic reasoning and proof compression will be necessary to enable routine checking of extremely large combinatorial proofs without prohibitive memory or time costs.
7. Impact and Adoption
VeriPB provides a practical, formally motivated foundation for tool qualification at stringent integrity levels in industrial and research settings. Its double-chain principle, independence of implementations, and compatibility with standard PB proof logging and a variety of model classes make it suitable for both conventional model checking (through B-method) and cutting-edge optimizing SAT/MaxSAT workflows with certified Pareto-optimality and symmetry breaking (Witulski et al., 2014, Jabs et al., 29 Jan 2025, Anders et al., 20 Nov 2025). Early experiences confirm that pyB’s independent re-evaluation can detect and isolate faults missed by the primary chain, and continuous proof-generation and checking pipelines improve assurance of correctness across the combinatorial reasoning spectrum.