StageSAT: Staged FP Satisfiability Solver
- StageSAT is a solver for quantifier-free floating-point formulas that integrates numerical optimization with soundness guarantees, ensuring bit-exact IEEE-754 solutions.
- It uses a three-stage approach (projection-aided objectives, ULP-scale constraint encoding, and lattice refinement) to progressively align real-valued optimization with true FP semantics.
- Empirical evaluations indicate superior recall, precision, and speed over competing solvers, with no false SAT models across diverse benchmark suites.
StageSAT is a satisfiability solver for quantifier-free floating-point (FP) formulas that integrates numerical optimization with soundness guarantees. Unlike traditional SMT-based solvers, StageSAT conceptualizes floating-point satisfiability as a sequence of real-valued optimization problems with progressively finer alignment to the semantics of IEEE-754 arithmetic, culminating in bit-exact solution validation. The method is fundamentally "black-box": all FP operations are evaluated dynamically, eschewing explicit case analysis, bit-blasting, or SMT encodings. Empirical data demonstrates StageSAT's superior recall, precision, and speed over state-of-the-art numeric and bit-precise alternatives in extensive benchmark competitions (Zhang et al., 8 Jan 2026).
1. Problem Formulation and Three-Stage Optimization Process
StageSAT addresses input formulas in conjunctive normal form (CNF), partitioning constraints into linear equalities ($A x = b$ over the variable vector $x$) and a remainder covering non-linear and inequality literals. The search proceeds through three stages:
- Projection-Aided Squared Objective ($R_{\mathrm{proj}}$): Uses the Moore-Penrose pseudoinverse $A^{+}$ to realize the orthogonal projection onto the affine linear manifold $\mathcal{M} = \{x : A x = b\}$:
$$P(x) = x - A^{+}(A x - b).$$
The squared Euclidean distance to $\mathcal{M}$ is $\|x - P(x)\|^2$. Non-linear and inequality literals are encoded via squared residuals combined according to the clause structure; writing $N(x) \ge 0$ for that residual term and combining, the objective is
$$R_{\mathrm{proj}}(x) = \|x - P(x)\|^2 + N(x).$$
This structure leverages exact geometric distance to the solution manifold for the linear part and ensures predictable descent behavior.
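- Squared-ULP Objective ($R_{\mathrm{ulp}}$): Encodes constraints at the granularity of IEEE-754 unit-in-the-last-place (ULP) steps. For each literal $\ell$ and point $x$, a literal distance $d_\ell(x) \ge 0$ measures, in squared ULP steps, how far $\ell$ is from holding at $x$, with $d_\ell(x) = 0$ exactly when $\ell$ holds.
The clause distance is the product of these squared distances over the literals of the clause, and the total optimization objective sums the clause distances over the formula $F$:
$$R_{\mathrm{ulp}}(x) = \sum_{C \in F} \prod_{\ell \in C} d_\ell(x).$$
$R_{\mathrm{ulp}}(x) = 0$ if and only if $x$ is a model of $F$, making $R_{\mathrm{ulp}}$ a representing function for the formula.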
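- $n$-ULP Lattice Refinement ($R_{\mathrm{lat}}$): Starting from the best continuous candidate $x^{\ast}$ after Stage 2, the search is refined along the IEEE-754 lattice by introducing integer offsets $k$, one per coordinate:
$$R_{\mathrm{lat}}(k) = R_{\mathrm{ulp}}(x^{\ast} \oplus k),$$
where $R_{\mathrm{ulp}}$ is the Stage 2 squared-ULP objective and $x^{\ast} \oplus k$ denotes shifting $x^{\ast}$ by $k_i$ ULPs in coordinate $i$. The search is carried out within a bounded region, and $R_{\mathrm{lat}}(k) = 0$ guarantees a bit-exact model.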
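The Stage 1 objective can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not StageSAT's implementation: the helper names `make_projection` and `stage1_objective`, the residual encoding (one Python function per literal, returning 0 when the literal holds), and the product-within-clause, sum-across-clauses combination are all choices made here for the example.

```python
import numpy as np

def make_projection(A, b):
    """Return P(x) = x - A^+ (A x - b), the orthogonal projection onto {x : A x = b}."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    A_pinv = np.linalg.pinv(A)  # Moore-Penrose pseudoinverse, computed once

    def project(x):
        x = np.asarray(x, dtype=float)
        return x - A_pinv @ (A @ x - b)

    return project

def stage1_objective(x, project, clauses):
    """Squared distance to the linear manifold plus a clause-structured residual term.

    `clauses` is a list of clauses; each clause is a list of residual functions
    r(x) that return 0.0 when their literal holds (hypothetical encoding).
    """
    x = np.asarray(x, dtype=float)
    gap = x - project(x)
    linear_part = float(gap @ gap)
    # Assumed CNF combination: product inside a clause (one satisfied literal
    # zeroes the clause), sum across clauses (every clause must be zero).
    residual_part = sum(
        float(np.prod([r(x) ** 2 for r in clause])) for clause in clauses
    )
    return linear_part + residual_part

# Toy formula: (x0 + x1 == 2) AND (x0 * x1 == 1), with model x = (1, 1).
project = make_projection([[1.0, 1.0]], [2.0])
clauses = [[lambda x: x[0] * x[1] - 1.0]]
print(stage1_objective([3.0, 1.0], project, clauses))  # positive away from the model
print(stage1_objective([1.0, 1.0], project, clauses))  # zero at the model
```

At the point (3, 1) the linear part contributes a squared distance of about 2 (the projection is (2, 0)) and the non-linear literal contributes about 4; both vanish at (1, 1).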
2. Theoretical Properties
StageSAT admits several notable theoretical guarantees for both correctness and descent behavior:
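- Partial Monotone Descent: For points $x$ and $x'$ satisfying all non-linear and inequality literals (so that the residual part of the Stage 1 objective vanishes), any strict decrease of the Stage 1 objective from $x$ to $x'$ is equivalent to a strict reduction in Euclidean distance to the linear manifold $\{x : A x = b\}$. If the objective is moreover a descent function, the decrease carries over to the distance from the true model set.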
- Stall Avoidance via Geometric Projection: Direct use of the exact projection term eliminates optimizer stalling due to zero gradient in tangential directions; any step reducing the squared distance from $x$ to its projection onto the linear manifold moves toward the solution manifold.
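- Representing-Function Soundness: Both the Stage 2 squared-ULP objective and the Stage 3 lattice objective are representing functions: either objective is zero at a point if and only if that point is a model of the formula, which eliminates the possibility of false SATs.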
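The link between the projection term and true geometric distance follows from standard pseudoinverse identities. The short derivation below is a sketch under two assumptions: the linear system is consistent (some $x_0$ satisfies $A x_0 = b$), and $N(x) \ge 0$ is a placeholder name for the non-linear residual part of the Stage 1 objective $R$.

```latex
% Assumes A x_0 = b for some x_0; N(x) >= 0 is a placeholder for the residual term.
\begin{aligned}
P(x) &= x - A^{+}(A x - b), \qquad \mathcal{M} = \{\, y : A y = b \,\} \\
A\,P(x) &= A x - A A^{+} A (x - x_0) = A x - A (x - x_0) = b
  && \text{(uses } A A^{+} A = A\text{)} \\
x - P(x) &= A^{+} A (x - x_0) \in \operatorname{range}(A^{\top}) = \ker(A)^{\perp}
  && \Rightarrow\ \operatorname{dist}(x, \mathcal{M}) = \lVert x - P(x) \rVert \\
R(x) &= \lVert x - P(x) \rVert^{2} + N(x) \\
N(x) = N(x') = 0 &\ \Longrightarrow\
  \bigl( R(x') < R(x) \iff \operatorname{dist}(x', \mathcal{M}) < \operatorname{dist}(x, \mathcal{M}) \bigr)
\end{aligned}
```

The second line shows that $P(x)$ lies on the manifold, and the third that the displacement $x - P(x)$ is orthogonal to the manifold's direction space $\ker(A)$, so $P(x)$ is the nearest point. Once the residual term is zero at both points, comparing objective values is the same as comparing distances.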
3. Implementation Details
- Black-Box FP Oracle: Every FP operation (addition, multiplication, transcendental functions) appears as a runtime oracle call. No bit-level decomposition or case analysis is performed, and IEEE flags are not handled explicitly.
- Projection Calculation: The matrix pseudoinverse is computed once per formula, after which each projection is a matrix-vector product, at most quadratic in the variable count. For formulas without linear constraints, this stage is bypassed.
- ULP Distance and Lattice Search: ULP computations and $n$-ULP lattice navigation are performed by casting floats to integer bit patterns, incrementing or decrementing, and casting back: an $O(1)$ operation per dimension with exact IEEE-754 semantics.
- Optimization: All stages use established black-box optimizers, such as basinhopping and CRS2, leveraging real-valued objectives without reference to bit-level circuit representations.
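The bit-pattern technique for ULP distance and lattice navigation can be illustrated with the Python standard library alone. The sketch below assumes binary64 and finite inputs; the names `to_ordinal`, `from_ordinal`, `shift`, and `lattice_refine` are hypothetical, and the exhaustive offset enumeration stands in for whatever bounded search the solver actually performs.

```python
import itertools
import struct

_MASK = 0x7FFFFFFFFFFFFFFF  # all bits except the sign bit

def to_ordinal(x):
    """Map a finite double to an integer so that adjacent doubles differ by 1."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles are sign-magnitude; mirror them below zero.
    return bits if bits >= 0 else -(bits & _MASK)

def from_ordinal(n):
    """Inverse of to_ordinal (ordinal 0 maps to +0.0)."""
    bits = n if n >= 0 else (-n) | (1 << 63)
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x

def ulp_distance(a, b):
    """Number of representable doubles separating a and b."""
    return abs(to_ordinal(a) - to_ordinal(b))

def shift(x, k):
    """Move x by k ULP steps (k may be negative)."""
    return from_ordinal(to_ordinal(x) + k)

def lattice_refine(objective, candidate, radius):
    """Try every offset vector k with |k_i| <= radius; return the first exact zero."""
    offsets = range(-radius, radius + 1)
    for k in itertools.product(offsets, repeat=len(candidate)):
        point = [shift(x, ki) for x, ki in zip(candidate, k)]
        if objective(point) == 0:
            return point
    return None

# Toy literal: x + 0.1 == 0.3, evaluated in binary64.
def objective(p):
    return ulp_distance(p[0] + 0.1, 0.3) ** 2

start = [0.2]            # the real-valued solution, but not an FP model
print(objective(start))  # 1
model = lattice_refine(objective, start, radius=2)
print(model)             # [0.19999999999999998]
```

The example shows why a final lattice stage matters: the real-valued answer 0.2 misses the constraint by one ULP because `0.2 + 0.1` rounds to `0.30000000000000004`, while its lower neighbour on the IEEE-754 lattice satisfies the literal exactly.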
4. Comparative Empirical Evaluation
StageSAT has been evaluated on the SMT-COMP'25 QF_FP suites across small, middle, and large benchmark categories, and on previously identified challenging cases from tools such as XSat, Grater, and JFS.
Summary of Empirical Performance
| Metric | StageSAT | Best Prior Numeric Solver | Best Complete SMT Solver |
|---|---|---|---|
| SAT-Recall | 99.4% (345/347) | e.g., XSat: 12 SAT on large suite | 39/49 (MathSAT-Large) |
| False SAT Rate | 0% | Nonzero for prior tools | 0% |
| Median Speedup | 5–10× | n/a | n/a |
| Max Speedup | 20× | n/a | n/a |
| Coverage (large) | 45/49 | ≤60% (goSAT, Grater, JFS) | 39/49 |
StageSAT solved strictly more formulas under identical time budgets than any competing solver, with no spurious SAT models. It outperforms XSat (24 SAT vs. 12 on large files), and surpasses goSAT, Grater, and JFS in both recall (92.3% vs. ≤60%) and soundness (0% spurious SATs). Ablation confirms that all three stages are needed for the best performance and reliability.
5. Methodological Distinctions
Distinct from traditional SMT solvers, StageSAT does not construct bit-blasted or explicit Boolean encodings. Instead, it operates entirely through floating-point evaluations and geometric objectives. This permits direct handling of complex arithmetic expressions (including black-box transcendental functions) and mixed-precision tasks, without specialized abstractions or loss of theoretical soundness.
Furthermore, the staged optimization approach, which combines geometric manifold pursuit, ULP-scale minimization, and discrete lattice refinement, circumvents well-known pitfalls of non-monotonic objective landscapes in floating-point domains. The explicit use of representing functions ensures that each reported SAT model corresponds to a bit-exact IEEE-754 assignment; unsatisfiability is only guessed if exhaustive lattice refinement fails to close the gap around the best continuous candidate.
6. Benchmarks, Limitations, and Future Work
StageSAT's evaluation encompasses both standard competition benchmarks and challenging pathological instances that historically stymied numeric and SMT methods. Its design is empirically validated for large and nonlinear benchmarks as well as mixed-precision scenarios.
A plausible implication is that further extension of the staged paradigm, potentially adapting the lattice search to broader classes of discretized theories or integrating symbolic techniques for quantified or higher-order constraints, may extend StageSAT's applicability. Current limitations include reliance on black-box optimizers' ability to explore nonconvex spaces and handling only quantifier-free formulas with real-valued and floating-point atoms.
StageSAT demonstrates that staged optimization, grounded in geometric and IEEE-754-aligned objectives, can achieve both scalability and correctness previously unattainable for FP-satisfiability solvers (Zhang et al., 8 Jan 2026).