
Supervised Iterative Computation (SIC)

Updated 4 December 2025
  • Supervised Iterative Computation (SIC) is a framework that alternates between unconstrained regression and explicit convex projection to enforce constraints such as fairness or physical laws.
  • It decouples model training from constraint enforcement, allowing any regression model to be used while ensuring convergence via a contraction mapping under mild assumptions.
  • Empirical evaluations demonstrate SIC’s improved stability, performance, and constraint satisfaction compared to standard regression and Moving Targets methods on benchmark datasets.

Supervised Iterative Computation (SIC) is an algorithmic framework for supervised learning under constraints, specifically targeted at regression tasks where the predicted outputs must satisfy arbitrary convex constraints, such as fairness, physics, or structural requirements. SIC formulates learning as an alternating sequence of unconstrained regression and explicit constraint enforcement by target adjustment and projection, and provides a convergence guarantee via contraction mapping provided certain mild assumptions hold. This decoupled approach allows the use of any off-the-shelf regression model and enables general handling of convex constraint sets.

1. Formal Mathematical Structure

Let $X\in\mathbb{R}^{n\times d}$ denote a set of $n$ input samples, $y\in\mathbb{R}^n$ the ideal target outputs, and $f(X,\theta)\in\mathbb{R}^n$ a parametric regression model with parameters $\theta\in\mathbb{R}^p$. The SIC framework imposes a closed, convex feasible set $C\subset\mathbb{R}^n$ on the model outputs, encoding required constraints (e.g., fairness, structural properties), and is equipped with a loss function $L:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}_{\ge 0}$, such as mean squared error (MSE) or mean absolute error (MAE).

Denote by $B = \{ f(X, \theta) \mid \theta\in\mathbb{R}^p \}$ the set of outputs achievable by the model. SIC operates via two fundamental operators:

  • Constraint projection: $P_{C,L}(u) = \arg\min_{z\in C} L(z, u)$, projecting onto $C$.
  • Model projection: $P_{B,L}(v) = \arg\min_{\hat{y}\in B} L(\hat{y}, v)$, unconstrained retraining to track a given target $v$.
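For squared-error loss, the constraint projection $P_{C,L}$ is simply the Euclidean projection onto $C$. As a minimal numpy sketch (the box constraint here is an illustrative choice, not one from the paper), projecting onto $C = \{z : \mathrm{lo} \le z_i \le \mathrm{hi}\}$ decomposes per coordinate into clipping:

```python
import numpy as np

def project_box(u, lo, hi):
    """Euclidean projection onto the box {z : lo <= z_i <= hi}.

    For MSE loss, P_{C,L}(u) = argmin_{z in C} ||z - u||^2, which for a
    box constraint decomposes coordinate-wise into clipping.
    """
    return np.clip(u, lo, hi)

print(project_box(np.array([-1.5, 0.2, 3.7]), 0.0, 1.0))  # clips to [0., 0.2, 1.]
```

For more general convex sets $C$ (e.g., fairness constraints), this step becomes a convex program solved with an off-the-shelf solver.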

An affine extension operator $h:\mathbb{R}^n\to\mathbb{R}^n$, $h(x) = (1-\alpha)y + \alpha x$ with $\alpha\in[0,1)$, blends the ideal and current model outputs.

The base SIC iteration is as follows, given the current model prediction $\hat{y}^i$:

  • If $\hat{y}^i\notin C$ (infeasible),
    • Construct the affine-extended target $y^\alpha = (1-\alpha)y + \alpha\hat{y}^i$.
    • Compute $z^i = P_{C,L}(y^\alpha)$.
  • If $\hat{y}^i\in C$ (feasible),
    • Compute $z^i = \arg\min_{z\in C} L(z, y)$ subject to $L(z, \hat{y}^i) \le \beta$.

Then update by solving the unconstrained regression problem $\hat{y}^{i+1} = P_{B,L}(z^i)$.

The overall one-step update operator is

$$\hat{y}^{i+1} = T(\hat{y}^i) = P_{B,L}(P_{C,L}(h(\hat{y}^i))).$$

Initialization is via an unconstrained fit: $\hat{y}^1 = \arg\min_{\hat{y}\in B} L(\hat{y}, y)$ (C. et al., 2022).

2. Algorithm and Convergence

The SIC framework defines a strict contraction mapping provided the following conditions hold:

  1. $B$ and $C$ are closed convex subsets of $\mathbb{R}^n$.
  2. Each projection operator $P_{A,L}$ ($A=B$ or $C$) is Lipschitz continuous with constant $K\ge 1$ under the considered norm.

For $u,v\in B$,

$$\|T(u) - T(v)\| \le K^2\alpha\,\|u - v\|.$$

Thus, if $K^2\alpha < 1$, $T$ is a contraction mapping. Invoking the Banach fixed-point theorem, SIC has a unique fixed point $\bar{y}$, and the iterates converge linearly with

$$\|\hat{y}^i - \bar{y}\| \le (K^2\alpha)^{i-1}\,\|\hat{y}^1 - \bar{y}\|.$$

For MSE loss, $K=1$, so convergence holds for any $\alpha\in[0,1)$; for MAE ($L^1$), $K=2$ and $\alpha<1/4$ is required (C. et al., 2022).
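The contraction bound can be observed numerically. In this toy illustration (not an experiment from the paper), $B=\mathbb{R}^n$ so the model projection is the identity, $C$ is a box, and $L$ is MSE (hence $K=1$); the gap between successive iterates then shrinks by at least a factor $\alpha$ per step:

```python
import numpy as np

# Toy setup: B = R^n (model projection = identity), C a box, L = MSE (K = 1).
alpha = 0.5
y = np.array([2.0, -3.0, 0.5])                 # ideal targets
project_C = lambda z: np.clip(z, -1.0, 1.0)    # P_{C,L} for MSE on a box
T = lambda u: project_C((1 - alpha) * y + alpha * u)

u, prev_gap = np.zeros(3), None
for _ in range(8):
    u_next = T(u)
    gap = np.linalg.norm(u_next - u)
    if prev_gap is not None:
        # contraction: successive gaps shrink geometrically at rate alpha
        assert gap <= alpha * prev_gap + 1e-12
    u, prev_gap = u_next, gap
print(u)  # approaches the fixed point [1.0, -1.0, 0.5]
```

Out-of-box components of $y$ settle on the box boundary, while the in-box component converges to a blend of the target and its projection.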

This contraction property is central: it ensures that the iterates approach a unique fixed point regardless of initialization.

3. Practical Implementation

A typical iteration scheme is summarized in the following pseudocode:

Input: X,y,C,α,β, max_iters

// 1) Initial unconstrained fit
ŷ[1] ← argmin_{ŷ∈B} L(ŷ, y)

for i in 1..max_iters-1:
   if ŷ[i] ∉ C:
     // infeasible adjustment
     yα ← (1−α)·y + α·ŷ[i]
     z ← argmin_{z∈C} L(z, yα)
   else:
     // feasible adjustment
     z ← argmin_{z∈C} L(z, y) subject to L(z, ŷ[i]) ≤ β
   end
   // unconstrained retraining
   ŷ[i+1] ← argmin_{ŷ∈B} L(ŷ, z)
end

return ŷ[max_iters]

The approach decouples the constraint-enforcement logic from the machine learning model, so any off-the-shelf regressor can be plugged into the iteration. Each iteration nonetheless requires solving a (convex) projection and retraining.
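As a minimal executable sketch of this loop (using MSE loss, a linear least-squares model standing in for $B$, and a box constraint standing in for $C$; these choices, and the omission of the feasible-branch trust-region step, are illustrative simplifications rather than the paper's setup):

```python
import numpy as np

def fit_predict(X, z):
    """Model projection P_{B,L}: unconstrained least-squares fit to targets z."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # add intercept column
    theta, *_ = np.linalg.lstsq(Xb, z, rcond=None)
    return Xb @ theta

def project_C(u, lo, hi):
    """Constraint projection P_{C,L} for MSE onto a box (illustrative C)."""
    return np.clip(u, lo, hi)

def sic(X, y, alpha=0.5, max_iters=30, lo=-1.0, hi=1.0):
    y_hat = fit_predict(X, y)                        # 1) initial unconstrained fit
    for _ in range(max_iters - 1):
        y_alpha = (1 - alpha) * y + alpha * y_hat    # affine-extended target
        z = project_C(y_alpha, lo, hi)               # constraint enforcement
        y_hat = fit_predict(X, z)                    # unconstrained retraining
    return y_hat

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
y_hat = sic(X, y)
```

Note that, consistent with the soft-constraint caveat discussed later, the retraining step can push predictions slightly outside $C$ again; feasibility is only approached over the iterations.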

4. Empirical Evaluation and Benchmarking

SIC's practical impact has been demonstrated on regression tasks with fairness constraints. In these experiments, the constraint set $C$ was specified by the Disparate Impact Discrimination Index (DIDI):

$$\mathrm{DIDI}(z) = \sum_{G\,\in\,\text{protected groups}} \left\lvert \frac{1}{n}\sum_{i=1}^n z_i - \frac{1}{|G|}\sum_{i\in G} z_i \right\rvert \le \epsilon.$$

Benchmarks were conducted on three UCI-style datasets:

  • Student ($n=649$, target = final grade, protected attribute = sex)
  • Crime ($n=2215$, target = crime rate, protected attribute = race)
  • BlackFriday ($n=50{,}000$, target = purchase amount, protected attribute = gender)

Performance was compared to unconstrained regression ($\alpha=0$) and the Moving Targets algorithm of Detassis et al. Key metrics were the $R^2$-score on held-out folds, constraint satisfaction $\mathcal{C} = \mathrm{DIDI}(\hat{y})/\mathrm{DIDI}(y)$, and variability across cross-validation folds.
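The DIDI quantity above can be computed directly. A small numpy sketch (assuming protected groups are encoded as index arrays, an illustrative choice):

```python
import numpy as np

def didi(z, groups):
    """Disparate Impact Discrimination Index: sum over protected groups of
    |overall mean of z - within-group mean of z|."""
    overall = z.mean()
    return sum(abs(overall - z[g].mean()) for g in groups)

z = np.array([1.0, 2.0, 3.0, 4.0])
groups = [np.array([0, 1]), np.array([2, 3])]
print(didi(z, groups))  # |2.5 - 1.5| + |2.5 - 3.5| = 2.0
```

The reported metric $\mathcal{C}$ is then the ratio of this value on the predictions to its value on the original targets.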

Abridged results (mean ± std over 5 folds, after 30 SIC iterations, for MAE loss, $\alpha=0.5$):

Algorithm      | Crime $R^2$   | Crime $\mathcal{C}$ | Student $R^2$ | Student $\mathcal{C}$ | BF $R^2$      | BF $\mathcal{C}$
SIC            | 0.467 (0.019) | 0.239 (0.013)       | 0.874 (0.019) | 0.333 (0.054)         | 0.624 (0.002) | 0.478 (0.018)
MovingTargets  | 0.342 (0.085) | 0.265 (0.005)       | 0.883 (0.029) | 0.327 (0.026)         | 0.590 (0.003) | 0.577 (0.028)

SIC exhibited equal or better $R^2$, improved stability (lower variance), and faster approach to constraint satisfaction at higher $\alpha$ values (C. et al., 2022).

5. Advantages, Limitations, and Discussion

Advantages

  • Decoupling: By modifying targets instead of model parameters, SIC allows incorporation of any underlying regression architecture.
  • Generalizability: Operates with arbitrary convex constraint sets $C$, not limited to specific types or structures.
  • Convergence: Proven fixed-point existence and linear convergence rate under weak assumptions.
  • Numerical Stability: Empirically demonstrates reduced variability in cross-validation, especially at higher trade-off values $\alpha$.

Limitations

  • Soft constraint satisfaction: Constraints are only approached asymptotically; strict hard satisfaction in finite iterations is not assured.
  • Computational cost: Each iteration involves a potentially expensive convex projection and model retraining.
  • Lipschitz continuity: The guarantee relies on the projection operators being Lipschitz; non-Lipschitz or nonconvex constraint sets remain outside the established theory.
  • Limited guarantees beyond regression: Formal convergence guarantees are not currently established for classification or structured prediction tasks (C. et al., 2022).

A plausible implication is that further developments might address projection in nonconvex or non-Lipschitz contexts, and extend the scheme to multi-output or non-regression settings.

6. Context and Relation to Constrained Learning

SIC generalizes and formally unifies previous approaches such as Moving Targets (Detassis et al., ICML 2020) for mean-squared error loss under mild parameter correspondence. It is fundamentally connected to alternating projection methods (e.g., Dykstra's algorithm with Bregman projections [Bauschke & Combettes 1997]) and draws on contraction mappings and the Banach fixed-point theorem for its guarantee [Ciesielski 2007]. Its technical flexibility sets it apart from methods that tie constraint enforcement directly to model parameter updates, facilitating ready deployment across model types so long as convex projections and unconstrained regression are available.

The separation of the constraint-projection and model-learning steps, together with the analytical convergence guarantee, underpins SIC's utility for the practical, stable enforcement of structural or fairness constraints in supervised regression (C. et al., 2022).
