Supervised Iterative Computation (SIC)
- Supervised Iterative Computation (SIC) is a framework that alternates between unconstrained regression and explicit convex projection to enforce constraints such as fairness or physical laws.
- It decouples model training from constraint enforcement, allowing any regression model to be used while ensuring convergence via a contraction mapping under mild assumptions.
- Empirical evaluations demonstrate SIC’s improved stability, predictive performance, and constraint satisfaction relative to standard regression and the Moving Targets method on benchmark datasets.
Supervised Iterative Computation (SIC) is an algorithmic framework for supervised learning under constraints, specifically targeted at regression tasks where the predicted outputs must satisfy arbitrary convex constraints, such as fairness, physical, or structural requirements. SIC formulates learning as an alternating sequence of unconstrained regression and explicit constraint enforcement via target adjustment and projection, and provides a convergence guarantee through a contraction-mapping argument under mild assumptions. This decoupled approach allows the use of any off-the-shelf regression model and enables general handling of convex constraint sets.
1. Formal Mathematical Structure
Let $X = \{x_1, \dots, x_n\}$ denote a set of input samples, $y \in \mathbb{R}^n$ the ideal target outputs, and $f_\theta$ a parametric regression model with parameters $\theta$. The SIC framework imposes a closed, convex feasible set $C \subseteq \mathbb{R}^n$ on the model outputs, encoding required constraints (e.g., fairness, structural properties), and is equipped with a loss function $L$, such as mean squared error (MSE) or mean absolute error (MAE).
Denote by $B \subseteq \mathbb{R}^n$ the set of outputs achievable by the model class. SIC operates via two fundamental operators:
- Constraint projection: $P_C(v) = \arg\min_{z \in C} L(z, v)$, projecting $v$ onto $C$.
- Model projection: $P_B(v) = \arg\min_{\hat{y} \in B} L(\hat{y}, v)$, unconstrained retraining to track a given target $v$.

An affine extension operator $T_\alpha : \mathbb{R}^n \to \mathbb{R}^n$, $T_\alpha(\hat{y}) = (1 - \alpha)\,y + \alpha\,\hat{y}$, $\alpha \in [0, 1)$, blends the ideal targets and the current model outputs.
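As a minimal concrete sketch of these operators, assume MSE loss, a box constraint set $C = \{z : \ell \le z \le u\}$ (whose Euclidean projection is elementwise clipping), and a linear model class $B$ (so the model projection reduces to an ordinary least-squares fit); none of these choices is required by SIC, which works with any convex $C$ and any regressor:

```python
import numpy as np

def project_constraints(v, lo, hi):
    """Constraint projection P_C for MSE loss when C is the box
    {z : lo <= z <= hi}: the Euclidean projection is elementwise clipping."""
    return np.clip(v, lo, hi)

def project_model(X, target):
    """Model projection P_B for a linear model class B = {X @ w}:
    unconstrained least-squares 'retraining' toward the given target."""
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return X @ w

def affine_extension(y, y_hat, alpha):
    """Affine extension T_alpha: blend ideal targets y with current
    model predictions y_hat."""
    return (1 - alpha) * y + alpha * y_hat
```

With MSE, `project_constraints` is exactly the Euclidean projection onto $C$; for other losses or constraint sets, a convex solver would replace these closed forms.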
The base SIC iteration is as follows, given the current model prediction $\hat{y}^{(i)}$:
- If $\hat{y}^{(i)} \notin C$ (infeasible):
  - Construct the affine-extended target $y_\alpha = (1 - \alpha)\,y + \alpha\,\hat{y}^{(i)}$.
  - Compute $z = \arg\min_{z \in C} L(z, y_\alpha)$.
- If $\hat{y}^{(i)} \in C$ (feasible):
  - Compute $z = \arg\min_{z \in C} L(z, y)$ subject to $L(z, \hat{y}^{(i)}) \le \beta$.

Then update by solving the unconstrained regression problem $\hat{y}^{(i+1)} = \arg\min_{\hat{y} \in B} L(\hat{y}, z)$.
The overall one-step update operator is $S = P_B \circ P_C \circ T_\alpha$, so that $\hat{y}^{(i+1)} = S(\hat{y}^{(i)})$.
Initialization is via an unconstrained fit: $\hat{y}^{(1)} = \arg\min_{\hat{y} \in B} L(\hat{y}, y)$ (C. et al., 2022).
2. Algorithm and Convergence
The SIC framework defines a strict contraction mapping provided the following conditions hold:
- $B$ and $C$ are closed convex subsets of $\mathbb{R}^n$.
- Each projection operator ($P_B$ or $P_C$) is Lipschitz continuous with constant $\rho$ under the considered norm.

For $u, v \in \mathbb{R}^n$,

$$\|S(u) - S(v)\| \le \alpha \rho^2 \|u - v\|.$$

Thus, if $\alpha \rho^2 < 1$, $S$ is a contraction mapping. Invoking the Banach fixed-point theorem, SIC has a unique fixed point $\hat{y}^*$, and the iterates converge linearly with

$$\|\hat{y}^{(i)} - \hat{y}^*\| \le (\alpha \rho^2)^{i-1} \|\hat{y}^{(1)} - \hat{y}^*\|.$$

For MSE loss, $\rho = 1$ (Euclidean projections onto closed convex sets are nonexpansive), so convergence holds for any $\alpha \in [0, 1)$; for MAE, $\rho$ may exceed $1$, and $\alpha \rho^2 < 1$ is required (C. et al., 2022).
This contraction property is central: iterates approach the unique fixed point regardless of initialization, provided the composite update operator is contractive in the chosen norm.
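The contraction behavior can be checked numerically. In the toy instance below (assumptions for illustration: MSE loss, a linear model class, a box constraint set, and $\alpha = 0.5$), both projections are nonexpansive and the affine extension is $\alpha$-Lipschitz in the predictions, so successive iterate gaps should shrink by at least the factor $\alpha$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, alpha = 50, 3, 0.5
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
lo, hi = -0.5, 0.5                       # box constraint set C

P = X @ np.linalg.pinv(X)                # orthogonal projector onto B = col(X)
y_hat = P @ y                            # initial unconstrained fit

gaps = []
for _ in range(20):
    y_alpha = (1 - alpha) * y + alpha * y_hat   # affine extension T_alpha
    z = np.clip(y_alpha, lo, hi)                # constraint projection (MSE, box)
    y_next = P @ z                              # model projection (retraining)
    gaps.append(np.linalg.norm(y_next - y_hat))
    y_hat = y_next

# ratios of successive iterate gaps, skipping numerically zero gaps
ratios = [b / a for a, b in zip(gaps, gaps[1:]) if a > 1e-12]
```

Each ratio stays at or below $\alpha = 0.5$, matching the linear rate predicted by the contraction argument.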
3. Practical Implementation
A typical iteration scheme is summarized in the following pseudocode, adapted from the formal exposition:
```
Input: X, y, C, α, β, max_iters

// 1) Initial unconstrained fit
ŷ[1] ← argmin_{ŷ∈B} L(ŷ, y)

for i in 1..max_iters-1:
    if ŷ[i] ∉ C:
        // infeasible adjustment
        yα ← (1−α)·y + α·ŷ[i]
        z ← argmin_{z∈C} L(z, yα)
    else:
        // feasible adjustment
        z ← argmin_{z∈C} L(z, y) subject to L(z, ŷ[i]) ≤ β
    end
    // 2) Unconstrained retraining
    ŷ[i+1] ← argmin_{ŷ∈B} L(ŷ, z)
end

return ŷ[max_iters]
```
The approach decouples the constraint-enforcement logic from the machine learning model, so any off-the-shelf regressor can be plugged into the iteration. Each iteration nonetheless requires solving a (convex) projection and retraining.
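The pseudocode translates into a short runnable sketch. The assumptions made here are illustrative, not part of SIC itself: a box constraint set, MSE loss, a scikit-learn decision tree as the off-the-shelf regressor, and a simplified blending heuristic for the feasible-branch β-ball step (the exact step is a constrained convex solve):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def sic(X, y, lo, hi, alpha=0.5, beta=0.1, max_iters=10,
        make_model=lambda: DecisionTreeRegressor(max_depth=3)):
    """Minimal SIC sketch: box constraint C = {z : lo <= z <= hi},
    MSE loss, any scikit-learn-style regressor as the model class B."""
    model = make_model().fit(X, y)                    # initial unconstrained fit
    y_hat = model.predict(X)
    for _ in range(max_iters - 1):
        if np.any(y_hat < lo) or np.any(y_hat > hi):  # infeasible branch
            y_alpha = (1 - alpha) * y + alpha * y_hat # affine-extended target
            z = np.clip(y_alpha, lo, hi)              # P_C (MSE => clipping)
        else:                                         # feasible branch
            z = np.clip(y, lo, hi)                    # move toward y inside C,
            step = z - y_hat                          # but stay near y_hat:
            r = np.mean(step ** 2)                    # simplified heuristic for
            if r > beta:                              # the beta-ball condition
                z = y_hat + step * np.sqrt(beta / r)
        model = make_model().fit(X, z)                # unconstrained retraining
        y_hat = model.predict(X)
    return model, y_hat
```

Swapping `make_model` for any other regressor (gradient boosting, a neural network wrapper, etc.) leaves the constraint-enforcement logic untouched, which is the decoupling the text describes.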
4. Empirical Evaluation and Benchmarking
SIC's practical impact has been demonstrated on regression tasks with fairness constraints. In these experiments, the constraint set was specified via a bound on the Disparate Impact Discrimination Index (DIDI). Benchmarks were conducted on three UCI-style datasets:
- Student (target: final grade; protected attribute: sex)
- Crime (target: crime rate; protected attribute: race)
- BlackFriday (target: purchase amount; protected attribute: gender)
Performance was compared to unconstrained regression and to the Moving Targets algorithm of Detassis et al. Key metrics were the $R^2$ score on held-out folds, the attained constraint satisfaction (DIDI value), and variability across cross-validation folds.
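For reference, one common formulation of the regression DIDI sums, over the groups of the protected attribute, the absolute gap between the group's mean prediction and the overall mean; the exact variant used in the SIC experiments may differ in normalization. Since this is a sum of absolute values of affine functions of the predictions, its sublevel sets are convex, so a bound on it yields a valid convex constraint set $C$:

```python
import numpy as np

def didi_regression(y_pred, protected):
    """Regression DIDI (one common formulation): sum over protected
    groups of |group mean prediction - overall mean prediction|."""
    y_pred = np.asarray(y_pred, dtype=float)
    protected = np.asarray(protected)
    overall = y_pred.mean()
    return float(sum(abs(y_pred[protected == g].mean() - overall)
                     for g in np.unique(protected)))
```

A value of 0 means all group means coincide; the fairness constraint set then takes the form $\{\hat{y} : \mathrm{DIDI}(\hat{y}) \le \epsilon\}$.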
Abridged results (mean with std in parentheses, over 5 folds, after 30 SIC iterations, MAE loss):
| Algorithm | Crime $R^2$ | Crime DIDI | Student $R^2$ | Student DIDI | BF $R^2$ | BF DIDI |
|---|---|---|---|---|---|---|
| SIC | 0.467 (0.019) | 0.239 (0.013) | 0.874 (0.019) | 0.333 (0.054) | 0.624 (0.002) | 0.478 (0.018) |
| MovingTargets | 0.342 (0.085) | 0.265 (0.005) | 0.883 (0.029) | 0.327 (0.026) | 0.590 (0.003) | 0.577 (0.028) |
SIC exhibited equal or better $R^2$, improved stability (lower variance), and faster progress toward constraint satisfaction at higher values of the trade-off parameter (C. et al., 2022).
5. Advantages, Limitations, and Discussion
Advantages
- Decoupling: By modifying targets instead of model parameters, SIC allows incorporation of any underlying regression architecture.
- Generalizability: Operates with arbitrary convex constraint sets $C$, not limited to specific types or structures.
- Convergence: Proven fixed-point existence and linear convergence rate under weak assumptions.
- Numerical Stability: Empirically demonstrates reduced variability across cross-validation folds, especially at higher values of the trade-off parameter.
Limitations
- Soft constraint satisfaction: Constraints are only approached asymptotically; strict hard satisfaction in finite iterations is not assured.
- Computational cost: Each iteration involves a potentially expensive convex projection and model retraining.
- Lipschitz continuity: The guarantee relies on the projection operators being Lipschitz; non-Lipschitz or nonconvex constraint sets remain outside the established theory.
- Limited guarantees beyond regression: Formal convergence guarantees are not currently established for classification or structured prediction tasks (C. et al., 2022).
A plausible implication is that further developments might address projection in nonconvex or non-Lipschitz contexts, and extend the scheme to multi-output or non-regression settings.
6. Context and Relation to Constrained Learning
SIC generalizes and formally unifies previous approaches such as Moving Targets (Detassis et al., ICML 2020) for mean-squared-error loss under a mild parameter correspondence. It is closely related to alternating projection methods (e.g., Dykstra's algorithm with Bregman projections [Bauschke & Combettes 1997]) and draws on contraction mappings and the Banach fixed-point theorem for its guarantee [Ciesielski 2007]. Its flexibility distinguishes it from methods that tie constraint enforcement directly to model parameter updates, facilitating deployment across model types so long as convex projection and unconstrained regression are available.
The separation of constraint projection from model learning, together with the convergence analysis, underpins SIC's utility for the practical, stable enforcement of structural or fairness constraints, with typical applications in supervised regression (C. et al., 2022).