Heaviside Composite Optimization

Updated 24 January 2026

Heaviside Composite Optimization Problem is a framework that uses Heaviside step functions to enforce binary constraints and logical operations.
It reformulates discontinuous composite mappings as mixed-integer programs, leveraging Progressive Integer Programming for scalable, high-dimensional optimization.
HSCOPs apply to constrained learning, optimal control, and topology optimization while offering robust statistical and computational guarantees.

A Heaviside Composite Optimization Problem (HSCOP) is an explicit optimization framework in which discontinuous decision boundaries and logical operations are handled via composite mappings involving the Heaviside (step) function applied to continuous, piecewise-differentiable or affine argument functions. This framework enables rigorous treatment of problems governed by binary constraints, selection rules, or abrupt phase transitions, encompassing applications in constrained learning, optimal control, topology optimization, and robust combinatorial inference. The mathematical structure of HSCOP unifies an increasingly broad class of modern optimization models, combining elements of nonsmooth/combinatorial analysis with the scalability and statistical guarantees of continuous variational formulations.

1. Formal Definition and General Structure

The canonical HSCOP is given by the optimization: $\max_{x\in X_{\rm AHC}} \ \theta(x) + f_0(x)$

$\text{subject to} \quad f_i(x) \ge b_i, \quad i=1,\dots,m, \quad x\in P$

where:

$\theta:\mathbb R^n \to \mathbb R$ is a (locally) differentiable base function,
$f_i(x) = \sum_{j=1}^{J_i} \psi_{ij} H_{[0,\infty)}(\phi_{ij}(x))$ with each $\psi_{ij}\ge 0$ and $\phi_{ij}:P\to\mathbb R$ ,
$H_{[0, \infty)}(t)$ denotes the Heaviside step function (indicator of $[0,\infty)$ ),
$P$ is a polyhedral feasible set, and
$X_{\rm AHC}$ encodes the admissible locus determined by Heaviside constraints: $X_{\rm AHC} = \{ x \in P : f_i(x) \ge b_i,\ i=1,\dots,m \} \}$ This formulation captures a sum of weighted indicator or thresholding functions composed with generally piecewise-affine or nonlinear argument maps, thereby enforcing logical constraints or activating latent terms only on subsets specified by thresholds in the base variables (Fang et al., 2024).

2. Heaviside Composite Mappings and Modeling Principles

At the core of HSCOPs is the use of Heaviside composite mappings of the form $H(\phi(x))$ , which encode decisions contingent on the sign of a continuous function $\phi$ . Models frequently employ sums or products of such terms to impose “hard” regime-switching in the objective or constraints, as in:

Selection rules: $\{g(x) = j\}$ encoded via products of Heavisides over pairwise score differences in classification/treatment learning (Liu et al., 17 Jan 2026),
Clipping or truncation: $\mathbf{1}\{C_s(g) \le \tau\}$ for weight truncation in policy learning,
Resource activation: On/off conditions for element inclusion, feature selection, or control action initiation,
Phase separation: Material or topology status in density optimization.

The Heaviside composition creates discontinuities—and thus integer-valued, combinatorial effects—in otherwise continuous-variable problems. Modeling with HSCOP thus enables the explicit representation of domains or objectives determined by discrete switching events.

3. Reformulation and Mixed-Integer Programming Approaches

Every composite Heaviside function $H(\phi_{ij}(x))$ admits an equivalent discrete optimization reformulation by introducing binary variables $z_{ij} \in \{0,1\}$ , with logical constraints such as: $z_{ij} = H_{[0, \infty)}(\phi_{ij}(x)) \iff \phi_{ij}(x) \ge \underline B (1 - z_{ij}),$ for some suitable lower bound $\underline B$ . The full (mixed-integer) problem is then: $\max_{x, z} \quad \theta(x) + \sum_{j=1}^{J_0} \psi_{0j} z_{0j}$

$\text{subject to}\quad \sum_{j=1}^{J_i} \psi_{ij} z_{ij} \ge b_i, \forall\, i$

$\phi_{ij}(x) \ge \underline B (1 - z_{ij}),\quad z_{ij}\in\{0,1\},\ x\in P$

The discrete formulation is amenable to modern mixed-integer (linear, convex, or conic) programming methods. However, scalability often becomes an issue when the number of binaries (i.e., Heaviside activations) is large, as in high-dimensional learning or policy optimization (Fang et al., 2024).

To address this, the Progressive Integer Programming (PIP) method has been developed. PIP iteratively solves a sequence of reduced-size integer programs in which only uncertain (near-threshold) Heaviside terms are formulated as binaries, while the rest are fixed by the sign of their argument. This yields high-quality solutions to HSCOPs with thousands of composite terms, as empirically demonstrated in large-scale constrained learning experiments (Fang et al., 2024, Liu et al., 17 Jan 2026).

4. Theoretical Properties: Stationarity, Local and Global Optima

The discontinuous structure of HSCOP necessitates novel optimality theory:

Epi-stationarity: A point $\bar x$ is epi-stationary for HSCOP iff the pair $(\bar x, f_0(\bar x))$ is Bouligand-stationary for the lifted problem $\max_{x,t} \theta(x)+t$ subject to $x\in X_{\rm AHC}, t\le f_0(x)$ .
Local optimality: When $\theta$ is concave and all $\phi_{ij}$ piecewise-affine, epi-stationarity is both necessary and sufficient for local optimality (Han–Cui–Pang, 2023).
Fixed-point and reduced MIP property: For any feasible $\bar x$ , there exists a tolerance region (interval about the thresholds used in PIP) such that if the reduced MIP (fixing definite Heaviside signs) achieves global optimality with respect to $\bar x$ , then $\bar x$ is locally optimal for the full HSCOP (Fang et al., 2024). This structure underpins both the correctness of the PIP method and the tractability of large-scale composite optimization under discontinuity.

5. Algorithmic and Computational Aspects

The PIP algorithm is central for practical HSCOPs. It iteratively updates the set of free binaries based on the magnitude of the argument functions at the current solution, defines index sets of 'in-between', 'active', and 'inactive' composite terms, and adapts its focus area as the solution progresses. Each reduced problem is solved by a standard MIP solver, but with the key advantage that the number of binaries is kept manageable by explicitly leveraging the structure of the composite activation regions.

Empirical evaluations indicate that for problem sizes with 100–500 binaries, PIP achieves near-optimal objective values in tens to hundreds of seconds, often outperforming full MIP which can stagnate without feasible solutions in the allotted time limits. For even larger scale instances ( $>1000$ binaries), PIP remains tractable and quality of obtained solutions remains within 1–2% of the best-known feasible values, and in some cases, strictly better (Fang et al., 2024, Liu et al., 17 Jan 2026).

6. Key Applications and Representative Domains

HSCOPs have been deployed across several domains:

Application Area	Mechanism of Heaviside Use	Reference
Offline policy learning**	Policy selection, weight clipping, composite policy constraints	(Liu et al., 17 Jan 2026)
Multi-class treatment & rule learning	Rule-dependent constraints, logical activations	(Fang et al., 2024)
Density-based topology optimization	Phase separation, topology (solid/void) status	(Behrou et al., 2020, Kumar, 2021, Murea et al., 2019)
Time-optimal control	Terminal event as Heaviside on state variable	(Pfeiffer et al., 2023)
Structural compliance/buckling	Discrete element “removal” via density threshold	(Behrou et al., 2020)
Set-constrained regression and classification	0-1 logical status via direct Heaviside constraints	(Zhou et al., 2020)

In each, the Heaviside composite structure encodes selection, regime switching, or logical operations that would otherwise require complex or heuristic constraint management.

7. Statistical and Computational Guarantees

In offline policy optimization with weight clipping, explicit high-probability regret bounds for HSCOP-based learning demonstrate that properly-tuned Heaviside composite objective functions yield strictly improved minimax rates over classical estimators, particularly in the weak-overlap regime where propensity scores are small. The sample size scaling, polynomial in the covariate dimension and logarithmic in the problem size, is provably optimal up to constant factors, and can be directly attributed to the MSE-minimizing thresholding enabled by the composite framework (Liu et al., 17 Jan 2026).

A plausible implication is that HSCOPs provide an optimal computational-statistical interface: they combine tractable solution via progressive MIP with non-asymptotic statistical guarantees under minimal model assumptions, whenever the discontinuity or selection structure is expressible via composite Heaviside architectures.

8. Future Directions and Open Challenges

Active research targets expanding PIP and related methods to broader nonsmooth compositions (general indicator/combinatorial functions beyond Heaviside), further scaling up to millions of binaries via distributed and parallel solvers, and exploring direct smooth approximations or surrogate relaxations with theoretical performance comparable to the discrete composite approach.

The generality of the Heaviside composite framework—unifying combinatorial and continuous paradigms—suggests that advances in scalable exact or approximate methods, together with deeper optimality characterizations, could further extend its application reach to robust control, logic-constrained statistical estimation, and mixed logical-dynamical modeling in engineering and machine learning.

References:

(Behrou et al., 2020, Kumar, 2021, Zhou et al., 2020, Fang et al., 2024, Murea et al., 2019, Pfeiffer et al., 2023, Beznosikov et al., 2021, Liu et al., 17 Jan 2026)