Condition-Based Partitioning

Updated 4 July 2026

Condition-Based Partitioning is a design pattern that decomposes a global object into regions using dynamic, problem-specific criteria rather than fixed intervals.
It supports diverse applications—from global optimization and SMT solving to runtime verification and distributed systems—by tailoring conditions to the inherent structure of the problem.
Its algorithmic realizations employ recursive splitting and adaptive triggers, backed by formal guarantees that balance computational efficiency with correctness.

Condition-based partitioning denotes a family of methods in which a search space, dynamical network, logical formula, runtime model, dataset, or physical phase space is decomposed according to explicit conditions rather than by a fixed uniform partition. In the cited literature, those conditions take the form of Lipschitz-gradient lower bounds in global optimization, mutually exclusive and covering formulas in SMT, policy-availability and parameter-sensitivity criteria in runtime verification, interaction-strength criteria in non-centralized control, winner-take-all loss assignments in modular learning, skew-triggered repartitioning in distributed data systems, and constrained-equilibrium relations in phase transformation modeling (Kvasov et al., 2013, Wilson et al., 2023, Dastranj et al., 2021, Riccardi et al., 28 Feb 2025, Tacke et al., 2024, Zvara et al., 2021, Amos et al., 2018).

1. Conceptual scope

Across the cited literatures, the object being partitioned varies, but the underlying abstraction is stable: a global object is divided into regions or components, and the admissibility of a division is determined by conditions that encode correctness, efficiency, or physical feasibility. In smooth global optimization, the partitioned objects are hyperintervals of a search domain, and the governing condition is the lower bound induced by a Lipschitz gradient model (Kvasov et al., 2013). In runtime verification of parametric Markov decision processes (pMDPs), the partitioned objects are independent components such as strongly connected components (SCCs), and the governing conditions are policy availability and the predicted effect of parameter changes on re-verification cost (Dastranj et al., 2021). In distributed SMT solving, the partitioned objects are subproblems of the form $F \wedge c_i$, where the conditions $c_i$ must satisfy mutual exclusivity and coverage (Wilson et al., 2023).

In control and systems papers, condition-based partitioning is tied to structural couplings. A network is first decomposed into fundamental system units and then aggregated into composite system units according to edge existence, coupling magnitude, and a global partition index that trades off intra- and inter-CSU interactions together with a granularity penalty (Riccardi et al., 28 Feb 2025). In streaming and analytics systems, the partitioning function is updated from runtime conditions such as key skew, recurrence, and UDF-extracted predicates, so the partition itself becomes a control variable of the execution engine (Zvara et al., 2021, Zou et al., 2020). In modular learning, the discovered partitions are subsets of samples won by competing predictors, and the condition is smallest per-sample loss under a winner-take-all assignment (Tacke et al., 2024).

A compact comparison is useful because the same label covers materially different mathematical objects.

Domain	Partitioned object	Governing condition
Global optimization	Hyperintervals $D_i$	Nondominance under $R_i(\tilde K)$
Runtime verification	Components $C(\pi)$	Minimum $Bal + 10 \cdot Var$
SMT solving	Subproblems $F \wedge c_i$	Mutual exclusivity and coverage
Non-centralized control	CSUs built from FSUs	Interaction weights and $p^{\mathrm{idx}}$
Streaming and analytics	Key partitions or stored datasets	Skew, recurrence, or UDF predicates
Modular learning	Sample subsets $S_k$	Minimum loss per sample
Phase transformation	Partitioning endpoints	CCE constraints

This heterogeneity means that condition-based partitioning is better understood as a design pattern than as a single algorithm. A plausible implication is that comparisons across domains are most informative when made at the level of partition criteria, guarantees, and update mechanisms rather than at the level of implementation detail.

2. Formal constructions

In the optimization formulation of Strongin, Sergeyev, and Kvasov, the problem is

$f^* = f(x^*) = \min_{x \in D} f(x), \quad D = [a,b] \subset \mathbb{R}^n,$

with gradient satisfying

$\|\nabla f(x') - \nabla f(x'')\| \le K \|x' - x''\|, \quad x', x'' \in D, \quad 0 < K < \infty.$

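For a hyperinterval $D_i = [a_i, b_i]$ with main diagonal $d_i = \|b_i - a_i\|$ and an estimate $\tilde K$ of the gradient's Lipschitz constant, the method constructs a quadratic minorant

$Q_i(x, \tilde K) = f(a_i) + \langle \nabla f(a_i), x - a_i \rangle - \tfrac{1}{2} \tilde K \|x - a_i\|^2, \quad x \in D_i,$

and the characteristic

$R_i(\tilde K) = m_i - \tfrac{1}{2} \tilde K d_i^2,$

where $m_i$ is the minimum of the linearization $f(a_i) + \langle \nabla f(a_i), x - a_i \rangle$ over $D_i$. Hyperintervals that are best for some $\tilde K > 0$ are precisely those on the lower-right convex hull of the points $(d_i^2, m_i)$ (Kvasov et al., 2013). The partitioning condition is therefore geometric: an interval is selected if it is nondominated under the family of quadratic lower models.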
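The nondominance rule can be illustrated with a small sketch. It assumes a characteristic of the form "linearization minimum minus half the estimate times the squared diagonal", and it replaces the continuum of estimates by a finite grid. The function names, the grid, and the sample data are hypothetical and are not taken from the cited paper.

```python
from typing import List, Tuple


def characteristic(m: float, d: float, k_est: float) -> float:
    """Assumed lower-bound score of one hyperinterval: m - 0.5 * k_est * d**2."""
    return m - 0.5 * k_est * d * d


def nondominated(boxes: List[Tuple[float, float]], k_grid: List[float]) -> List[int]:
    """Indices of boxes (d_i, m_i) that attain the smallest characteristic
    for at least one estimate in k_grid (a finite stand-in for all estimates)."""
    selected = set()
    for k_est in k_grid:
        scores = [characteristic(m, d, k_est) for d, m in boxes]
        selected.add(min(range(len(boxes)), key=scores.__getitem__))
    return sorted(selected)


# Each pair is (diagonal length, minimum of the linearization); illustrative values.
boxes = [(1.0, 0.0), (2.0, 0.5), (2.0, 3.0), (4.0, 5.0)]
k_grid = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0]
picked = nondominated(boxes, k_grid)
print(picked)
```

In this toy data the third box has the same diagonal as the second but a larger linearization minimum, so no estimate ever prefers it. A production implementation would compute the convex hull directly rather than scan a grid.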

In SMT, the formalism is logical rather than geometric. A condition-based partitioning strategy constructs conditions $c_1, \dots, c_n$ for a formula $F$ such that

$c_i \wedge c_j \models \bot \quad (i \neq j), \qquad \models c_1 \vee \cdots \vee c_n.$

The induced subproblems are $F \wedge c_i$, and satisfiability is preserved through

$F \text{ is satisfiable} \iff F \wedge c_i \text{ is satisfiable for some } i \in \{1, \dots, n\}.$

The paper studies both cube partitions, where each $c_i$ is a conjunction of literals, and dynamic disjoint non-cube scattering, where each condition strengthens a fresh cube with negations of all previously used cubes (Wilson et al., 2023).
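The two constructions can be checked on a toy propositional example. The sketch below treats atoms as plain Booleans (no theory reasoning) and verifies mutual exclusivity and coverage by brute force. The helper names are hypothetical, and the final "remainder" condition that negates every cube is an assumption added here so that coverage holds.

```python
from itertools import product
from typing import Callable, Dict, List

Assignment = Dict[str, bool]
Condition = Callable[[Assignment], bool]


def cube_conditions(atoms: List[str]) -> List[Condition]:
    """All 2**k cubes over k atoms: one conjunction of literals per sign pattern."""
    conds = []
    for signs in product([True, False], repeat=len(atoms)):
        conds.append(lambda a, s=signs: all(a[x] == v for x, v in zip(atoms, s)))
    return conds


def scatter_conditions(cubes: List[Condition]) -> List[Condition]:
    """Condition i is cube i strengthened with the negations of cubes 0..i-1.
    A last condition (assumed here) negates every cube to cover the remainder."""
    conds = []
    for i in range(len(cubes)):
        conds.append(lambda a, i=i: cubes[i](a) and not any(c(a) for c in cubes[:i]))
    conds.append(lambda a: not any(c(a) for c in cubes))
    return conds


def is_partition(conds: List[Condition], atoms: List[str]) -> bool:
    """Brute-force check that every assignment satisfies exactly one condition."""
    for values in product([True, False], repeat=len(atoms)):
        a = dict(zip(atoms, values))
        if sum(c(a) for c in conds) != 1:
            return False
    return True


atoms = ["p", "q", "r"]
cubes = cube_conditions(["p", "q"])                       # static enumeration
fresh = [lambda a: a["p"], lambda a: a["q"] and a["r"]]   # overlapping fresh cubes
scattered = scatter_conditions(fresh)                     # made disjoint by negation
print(is_partition(cubes, atoms), is_partition(scattered, atoms))
```

The fresh cubes overlap on assignments where all three atoms are true, which is why scattering must add the negations of earlier cubes before the conditions form a partition.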

In runtime verification of autonomous systems, the formal object is a parametric Markov decision process (pMDP) partitioned into independent components $C(\pi)$. Policy-conditioned pruning yields successive reduced state-transition structures, and the best available partitioning policy is selected by minimizing

$Bal + 10 \cdot Var.$

The Balancing metric measures component-size heterogeneity, while Variation measures how many components are affected by parameter changes in the worst-case scenario (Dastranj et al., 2021).
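Policy selection by the additive score can be sketched as follows. The source text does not give the exact definitions of Balancing and Variation, so both functions below are assumed stand-ins: Balancing is taken as the population standard deviation of component sizes, and Variation as the largest number of components touched by any single parameter. The policy names and data are hypothetical.

```python
from statistics import pstdev
from typing import Dict, List, Set, Tuple


def balancing(component_sizes: List[int]) -> float:
    """Assumed stand-in: spread of component sizes (0.0 when sizes are uniform)."""
    return pstdev(component_sizes)


def variation(component_params: List[Set[str]]) -> int:
    """Assumed stand-in: worst-case count of components affected by one parameter."""
    params = set().union(*component_params)
    return max(sum(p in comp for comp in component_params) for p in params)


def score(sizes: List[int], params: List[Set[str]]) -> float:
    """Additive criterion Bal + 10 * Var, using the stand-in metrics above."""
    return balancing(sizes) + 10 * variation(params)


# Each policy maps to (component sizes, parameters appearing in each component).
policies: Dict[str, Tuple[List[int], List[Set[str]]]] = {
    "coarse": ([40, 40, 40], [{"a"}, {"b"}, {"c"}]),
    "fine": ([10, 50, 20, 40], [{"a"}, {"a", "b"}, {"a"}, {"c"}]),
}
best = min(policies, key=lambda name: score(*policies[name]))
print(best, {name: round(score(*p), 2) for name, p in policies.items()})
```

Under these assumed metrics the policy with uniform components and parameter-local changes wins, which matches the qualitative behavior the text describes.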

In non-centralized control, the partitioned object is the equivalent graph of a dynamical system. The framework defines composite system units as subsystems whose inputs affect only their own states, and it scores a candidate partition by a partition index $p^{\mathrm{idx}}$ that trades off intra- and inter-CSU interactions together with a granularity penalty.

Two instantiations are given: a ratio-type metric inspired by modularity and an optimization-based quadratic metric admitting an IQP formulation (Riccardi et al., 28 Feb 2025).

A distinct but structurally related formalization appears in phase-field modeling of quenching and partitioning. There, the “condition-based” label refers to constrained-carbon equilibrium (CCE) rather than computational decomposition. The endpoint compositions are determined by equal carbon chemical potential, iron conservation across a stationary martensite-austenite interface, global carbon mass balance, and phase-fraction closure, rather than by unconstrained equilibrium tie-lines (Amos et al., 2018). The partitioning condition is therefore thermodynamic.

3. Selection criteria, scores, and triggers

The central technical distinction across these methods is the form of the condition that decides whether a region should be split, preserved, revisited, or reused.

In Lipschitz-gradient optimization, a nondominated hyperinterval must also satisfy the improvement condition

$R_i(\tilde K) \le f_{\min} - \xi,$

where $f_{\min}$ is the best function value found so far and $\xi \ge 0$ is a prescribed improvement margin (Kvasov et al., 2013). This criterion suppresses splits whose lower model cannot improve the incumbent by a meaningful margin. The same paper makes the condition multi-scale by considering all estimates $\tilde K > 0$ rather than a single estimate.

In pMDP verification, selection is policy-based and metric-based. Lemma 1 states that “The additive value of Balancing and Variation determines the best partitioning policy.” Balancing is minimized when component sizes are uniform, while Variation is scaled by $10$ because the two metrics operate on different numerical ranges (Dastranj et al., 2021). The runtime trigger is equally explicit: only partitions affected by updated parameter valuations are re-approximated and re-verified.

In control-oriented partitioning, grouping and separation are driven by interaction conditions on the equivalent graph. FSUs must be merged when a state is directly connected to multiple inputs, and candidate CSUs are favored when intra-CSU weights dominate and frontier interactions remain weak. The granularity parameter determines the size regime: large values yield individual FSUs, whereas small values yield full aggregation into one CSU (Riccardi et al., 28 Feb 2025). This makes the condition both structural and resource-sensitive.

In active partitioning for supervised learning, the selection criterion is the per-sample loss. For model $f_k$ and sample $(x_i, y_i)$ with loss function $\ell$, the winner is

$k^*(i) = \arg\min_k \ell(f_k(x_i), y_i),$

with hard assignments

$S_k = \{\, i : k^*(i) = k \,\}.$

The resulting modular objective is

$\sum_k \sum_{i \in S_k} \ell(f_k(x_i), y_i).$

The paper also gives a soft competition variant with a temperature parameter (Tacke et al., 2024). The condition is therefore endogenous: specialization changes the future partition.
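A minimal sketch of the assignment step follows, using squared error as the per-sample loss and two fixed one-dimensional predictors. The soft variant is written as a softmax over negative losses. That specific form, the function names, and the toy data are assumptions made for illustration and are not taken from the cited paper.

```python
import math
from typing import Callable, List, Tuple

Model = Callable[[float], float]
Sample = Tuple[float, float]


def squared_loss(model: Model, sample: Sample) -> float:
    x, y = sample
    return (model(x) - y) ** 2


def assign_winners(models: List[Model], data: List[Sample]) -> List[int]:
    """Hard winner-take-all: each sample goes to the model with the smallest loss."""
    return [min(range(len(models)), key=lambda k: squared_loss(models[k], s))
            for s in data]


def modular_objective(models: List[Model], data: List[Sample],
                      winners: List[int]) -> float:
    """Sum of each sample's loss under the model that won it."""
    return sum(squared_loss(models[k], s) for k, s in zip(winners, data))


def soft_weights(models: List[Model], sample: Sample, temperature: float) -> List[float]:
    """Assumed soft competition: softmax of negative losses at a given temperature."""
    losses = [squared_loss(m, sample) for m in models]
    lowest = min(losses)  # shift for numerical stability
    exps = [math.exp(-(loss - lowest) / temperature) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


models = [lambda x: 2.0 * x, lambda x: -x + 3.0]
data = [(0.0, 0.1), (0.5, 1.0), (2.0, 1.0), (3.0, 0.2)]
winners = assign_winners(models, data)
objective = modular_objective(models, data, winners)
print(winners, objective)
```

In a full training loop each model would then be updated only on the samples it won, and the assignment would be recomputed, which is what makes the partition depend on the models' own specialization.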

In streaming data systems, the trigger is load imbalance. The paper defines a load-imbalance measure over partitions and a bound that constrains acceptable partition load under KIP (Zvara et al., 2021). In persistent analytics, the trigger is prospective reuse: partitioners are selected from UDF-derived subcomputations and ranked by frequency, recency, distance, complexity, selectivity, key distribution, and co-partitioning opportunities (Zou et al., 2020).

A statistical analogue appears in partition-wise regression and classification, where change points and local models are selected by a two-part MDL criterion rather than by runtime triggers. The partition is chosen jointly with submodels, and the resulting estimator is strongly consistent for break locations under the stated assumptions; in regression, both the number of change points and their locations are strongly consistent when the relevant predictor set is known (Cheung et al., 2016). This suggests a broader view in which “condition” may refer either to an explicit runtime signal or to an information criterion governing offline partition recovery.

4. Algorithmic realizations

Although the triggering conditions differ, the algorithmic realizations show recurrent motifs: local evaluation, recursive splitting, reuse of prior computations, and selective refinement.

The one-point-based scheme for global optimization evaluates $f$ and $\nabla f$ at only one vertex of each hyperinterval $D_i = [a_i, b_i]$, typically $a_i$, and splits the selected hyperinterval along its longest edge. Two cuts that trisect that edge define three equal-volume subintervals. A vertex database stores all evaluated vertices because the scheme reuses vertices across adjacent hyperintervals, thereby reducing the number of function and gradient evaluations (Kvasov et al., 2013).
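The splitting step alone can be sketched as an axis-aligned trisection along the longest edge. This is a simplified illustration: it does not model which vertex is evaluated in each child or the vertex database described above, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def trisect_longest_edge(lower: List[float], upper: List[float]) -> List[Box]:
    """Split a box into three equal-volume boxes along its longest edge."""
    widths = [u - l for l, u in zip(lower, upper)]
    axis = max(range(len(widths)), key=widths.__getitem__)
    step = widths[axis] / 3.0
    boxes = []
    for j in range(3):
        lo, hi = list(lower), list(upper)
        lo[axis] = lower[axis] + j * step
        hi[axis] = lower[axis] + (j + 1) * step
        boxes.append((lo, hi))
    return boxes


def volume(box: Box) -> float:
    lo, hi = box
    result = 1.0
    for l, u in zip(lo, hi):
        result *= u - l
    return result


parts = trisect_longest_edge([0.0, 0.0], [3.0, 1.0])
print(parts, [volume(p) for p in parts])
```

Splitting only the longest edge keeps the children close to cubic, so their diagonals shrink as refinement proceeds.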

In SMT, the core realization is solver-internal generation of split conditions from CDCL(T) state. Candidate atoms may be drawn from the SAT activity heap, the decision trail, or theory conflict clauses. With a target of $n = 2^k$ partitions, cube partitioning uses $k$ atoms and emits all $2^k$ cubes, whereas scattering emits disjoint non-cube conditions iteratively and adds a blocking lemma negating the freshly emitted cube after each emission so that the partitioning solver does not revisit the explored region (Wilson et al., 2023). The distinction is between a static complete enumeration and a dynamic refinement process.

In the control framework, FSU construction is itself condition-based. Root FSUs are created from input-to-state edges, forward assignment attaches unassigned states according to strongest forward coupling from an FSU root, and backward assignment attaches any remaining states according to strongest backward coupling toward an existing FSU. CSU aggregation then proceeds either by a greedy algorithm that maximizes the immediate increase in the ratio-type partition index or by an IQP that minimizes the quadratic partition metric subject to non-overlapping assignments (Riccardi et al., 28 Feb 2025).

In runtime systems, the algorithmic emphasis is on low-overhead updates. The DR module maintains distributed top-$k$ histograms, merges them in the master, computes a new KIP mapping, and installs it at micro-batch or checkpoint boundaries. Heavy keys are first placed by minimal-change preference, then by their hash home, else by least-loaded partition; the residual load is balanced by weighted hashing through virtual hosts (Zvara et al., 2021). Lachesis applies an analogous logic to storage-time partitioning: it compiles UDF-centric workloads into analyzable IR DAGs, extracts two-terminal subgraphs that compute partition-relevant keys, and uses an A3C policy to select a persistent partitioner for future reuse (Zou et al., 2020).
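The heavy-key placement order can be sketched as a three-step preference. The sketch assumes a per-partition capacity bound as the acceptance test and uses CRC32 as the hash. Both choices, along with the function names and sample weights, are hypothetical, and the weighted hashing of residual keys is not modeled.

```python
import zlib
from typing import Dict


def hash_home(key: str, n_partitions: int) -> int:
    """Deterministic default partition of a key (CRC32 is an assumed choice)."""
    return zlib.crc32(key.encode("utf-8")) % n_partitions


def place_heavy_keys(heavy: Dict[str, float], previous: Dict[str, int],
                     n_partitions: int, capacity: float) -> Dict[str, int]:
    """Place heavy keys, heaviest first: keep the previous partition if it fits,
    else use the hash home if it fits, else fall back to the least-loaded one."""
    load = [0.0] * n_partitions
    mapping: Dict[str, int] = {}
    for key, weight in sorted(heavy.items(), key=lambda kv: kv[1], reverse=True):
        preferred = []
        if key in previous:
            preferred.append(previous[key])          # minimal-change preference
        preferred.append(hash_home(key, n_partitions))
        target = next((p for p in preferred if load[p] + weight <= capacity), None)
        if target is None:
            target = min(range(n_partitions), key=load.__getitem__)
        mapping[key] = target
        load[target] += weight
    return mapping


heavy = {"a": 0.5, "b": 0.4, "c": 0.3}      # relative key frequencies
previous = {"a": 0, "b": 0, "c": 1}          # mapping before repartitioning
mapping = place_heavy_keys(heavy, previous, n_partitions=3, capacity=0.6)
print(mapping)
```

Keeping a key on its previous partition whenever it fits is what limits state migration, at the cost of a possibly less balanced placement than a fresh assignment would give.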

In modular learning, the algorithm is an alternation between assignment and expert update. All models predict all samples, each sample is assigned to the current winner, each model trains only on its won set, and optional add/drop operations adjust the model pool according to high-loss regions or replaceability ratios (Tacke et al., 2024). In verification of neural contraction, the same alternation appears in another form: a region is verified if the dominant eigenvalue of its symmetric Metzler majorant is nonpositive; otherwise the region is partitioned and rechecked, and this per-cell spectral condition is incorporated into the training loss of the controller and contraction metric networks (Davydov, 1 Dec 2025).

5. Guarantees, correctness, and verification properties

The strongest commonality across the literature is that partitioning is not treated as a heuristic alone; it is tied to explicit guarantees.

For the Lipschitz-gradient method, the main convergence statement is everywhere dense sampling: as the number of iterations grows without bound, for any point $x \in D$ and any $\delta > 0$ there exists a generated trial point $x^k$ with $\|x^k - x\| < \delta$ (Kvasov et al., 2013). The proof sketch relies on repeated splitting of nondominated large hyperintervals into three equal-volume subintervals, forcing the maximum diagonal length to shrink to zero.

For SMT partitioning, the guarantees are semantic. Because the conditions satisfy mutual exclusivity and coverage, no model is duplicated across subproblems and completeness is preserved: $F$ is satisfiable if and only if $F \wedge c_i$ is satisfiable for some $i$. This is the basis for parallel speedups without loss of soundness, and it remains valid for both cube and scattering constructions (Wilson et al., 2023).

For partition-wise regression and classification, the guarantees are statistical. If the true number of change points is known, the estimated break locations converge almost surely to the truth for regression, logistic, and probit models. In partition-wise linear regression with Gaussian noise and known relevant predictors, the estimated number of change points and their locations are both strongly consistent (Cheung et al., 2016). In partitioning-based least squares series regression, the IMSE-optimal partition size is characterized explicitly, and robust bias correction yields valid pointwise and uniform inference at IMSE-optimal tuning (Cattaneo et al., 2019).

For policy-based runtime verification, the guarantee is more modest but still formalized: Lemma 1 states that minimizing the additive value of Balancing and Variation determines the best partitioning policy (Dastranj et al., 2021). In control verification, the guarantee is spectral. Over each region, interval analysis and interval bound propagation (IBP) construct a symmetric Metzler majorant; if its dominant eigenvalue is nonpositive, then the closed-loop contraction inequality holds on the whole region. Adaptive partitioning tightens the bounds until the condition is either certified or the refinement limit is reached (Davydov, 1 Dec 2025).
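The certify-or-refine loop can be sketched in one dimension. Here a scalar interval bound stands in for the dominant-eigenvalue test on the Metzler majorant: a cell is accepted when a conservative upper bound of a function over the cell is nonpositive, and is bisected otherwise. The test function, the bound, and the names are hypothetical.

```python
from typing import Callable, List, Tuple

Cell = Tuple[float, float]


def certify(lo: float, hi: float, upper_bound: Callable[[float, float], float],
            max_depth: int) -> Tuple[bool, List[Cell]]:
    """Return (certified, accepted cells). A cell is accepted when its conservative
    bound is <= 0; otherwise it is bisected until max_depth is reached."""
    stack = [(lo, hi, 0)]
    cells: List[Cell] = []
    while stack:
        a, b, depth = stack.pop()
        if upper_bound(a, b) <= 0.0:
            cells.append((a, b))
        elif depth >= max_depth:
            return False, cells          # refinement limit reached
        else:
            mid = 0.5 * (a + b)
            stack.append((a, mid, depth + 1))
            stack.append((mid, b, depth + 1))
    return True, cells


def naive_bound(a: float, b: float) -> float:
    """Interval upper bound of g(x) = x**2 - x on [a, b]: bounds x**2 and -x
    separately, which overestimates because the two terms are treated as independent."""
    return max(a * a, b * b) - a


ok, cells = certify(0.1, 0.9, naive_bound, max_depth=6)
failed, _ = certify(0.5, 1.5, naive_bound, max_depth=5)
print(ok, len(cells), failed)
```

The first call succeeds only after refinement, because the naive bound is too loose on the full interval. The second call fails at the depth limit because the property is genuinely violated near the right endpoint. A failure therefore means "not certified", not "disproved", when the bound is conservative.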

The phase-field literature uses “condition-based” in a physically different sense, but it also places the method on a constrained foundation. Under stationary interface, no substitutional diffusion, and suppressed carbide formation, the endpoint compositions are determined by the CCE constraints rather than by unconstrained equilibrium. This gives a well-defined partitioning endpoint for carbon redistribution in martensite-austenite microstructures (Amos et al., 2018).

6. Applications, empirical outcomes, and limitations

The empirical literature is heterogeneous, but several papers report large gains when the partitioning condition aligns closely with the governing structure of the problem.

In differentiable global optimization, the Lipschitz-gradient method was tested on GKLS differentiable classes comprising 800 functions across several dimensions. On criterion C1, the paper compares the maximum number of evaluations on the hard classes for the new method, DIRECTl, and DIRECT; in the first reported case DIRECT exceeded the evaluation limit and left problems unsolved, and in the second both DIRECTl and DIRECT did (Kvasov et al., 2013). On criterion C4, the paper reports win–loss counts for the new method against both DIRECT and DIRECTl on a hard class.

In runtime verification for energy harvesting systems, the best partitions were associated with fewer, larger, more uniform components, as illustrated by the policy examples listed in Table 1 of the paper (Dastranj et al., 2021). The paper states that lower additive $Bal + 10 \cdot Var$ correlates with more verification-efficient partitions, although quantitative runtime overhead and number of re-verified partitions at runtime are not reported.

In distributed SMT solving, graduated and hybrid portfolios containing condition-based partitioners outperformed pure portfolios. The best overall configuration was a hybrid multijob strategy, which improved PAR-2 relative to a single sequential run (Wilson et al., 2023). The strongest recommended graduated portfolio combined osmt-scatter and decision-cube.

In non-centralized control, the partition index was validated on linear and hybrid DMPC case studies. For a modular linear network, the partitions selected by the index kept parallel computation time on the order of seconds, whereas a “bad partition” maximizing inter-CSU coupling led to an estimated parallel time on the order of hours (Riccardi et al., 28 Feb 2025). In a random hybrid network, optimization-based partitions achieved near-CMPC stage cost with moderate parallel time, while fully distributed control was fastest but incurred a stage-cost increase.

In active partitioning, the modular experts reduced loss on porous-structure stress–strain data and improved results on the Energy Efficiency, Automobile, and Students’ Portuguese grades datasets (Tacke et al., 2024). The paper reports that gains increase with the number of patterns discovered and that more uniform partition proportions correlate with stronger modular gains.

In data systems, DR+KIP achieved speedups on real workloads and power-law distributions (Zvara et al., 2021). On the LFM stream, KIP improved load imbalance relative to Hash, Scan, and Readj, while incurring lower relative migration than Readj. In web crawling, DR shortened the seventh crawl round. Lachesis reported speedups for PageRank versus round-robin and on two-worker and ten-worker Reddit setups, as well as total TPC-H UDF latency in two environments that was lower than the heuristic and cost-model baselines reported in the same study (Zou et al., 2020).

The limitations are equally recurrent. SMT partitioning can be harmed by poor atom choices, especially HEAP-based ones, and TIME-based triggering is nondeterministic (Wilson et al., 2023). Runtime verification does not report full complexity, memory usage, or sensitivity to the $10\times$ Variation scaling (Dastranj et al., 2021). The control IQP is NP-hard, and algorithmic aggregation can produce partitions with good cost but poor runtime (Riccardi et al., 28 Feb 2025). Active partitioning can lock in early if initialization is poor or if regimes overlap strongly (Tacke et al., 2024). Dynamic repartitioning is less effective under near-uniform distributions or extreme single-heavy-key regimes (Zvara et al., 2021). The contraction-verification framework remains subject to the curse of dimensionality in uniform partitioning and to conservativeness from IBP bounds (Davydov, 1 Dec 2025). In the phase-field setting, CCE-based partitioning assumes stationary interfaces, no substitutional diffusion, no carbide precipitation, and fixed phase fields during partitioning (Amos et al., 2018).

Taken together, these results show that condition-based partitioning is most effective when the partition criterion is tightly matched to the dominant source of structure: curvature in smooth optimization, semantic disjointness in SMT, coupling topology in control, parameter locality in verification, regime specialization in learning, skew and recurrence in distributed systems, or constrained thermodynamics in phase transformation. A plausible implication is that the principal research challenge is not the act of splitting itself, but the design of a condition that is simultaneously informative, computable, and stable under refinement.