Controllable Feature Whitening (CFW)
- Controllable Feature Whitening (CFW) is a linear preprocessing method that decorrelates specified feature groups to balance predictive accuracy with fairness and interpretability.
- It employs tunable covariance blending and partial whitening to control trade-offs between removing spurious correlations and preserving informative signals.
- Empirical studies in fairness-sensitive deep learning and neuroimaging show that CFW significantly improves group fairness and model interpretability without sacrificing accuracy.
Controllable Feature Whitening (CFW) is a family of linear preprocessing and transformation methods designed to decorrelate specific sets of features in high-dimensional machine learning pipelines, providing control over the extent of whitening—ranging from no decorrelation to full feature sphering. CFW enables practitioners to remove spurious linear dependencies that degrade predictive robustness or interpretability, while maintaining flexible trade-offs with utility, fairness, or interpretability objectives depending on the application context. The framework is instantiated in both fairness-sensitive deep learning (Cho et al., 27 Jul 2025) and neuroimaging model interpretability (Petiton et al., 22 Apr 2026).
1. Mathematical Formulation and General Principles
CFW operates by quantifying and manipulating the linear correlations present between two or more feature groups using the empirical covariance matrix. Consider two groups: “target” features $z_t$ representing the predictive signal, and “bias” features $z_b$ encoding nuisance or confounding signals. These are concatenated to form $z = [z_t; z_b] \in \mathbb{R}^d$. Given $N$ input instances, features are aggregated into a centered data matrix $Z \in \mathbb{R}^{N \times d}$, from which the empirical covariance is computed:
$$\Sigma = \frac{1}{N} Z^\top Z.$$
The core operation is a whitening transformation
$$\tilde{Z} = Z\,\Sigma^{-1/2},$$
where $\Sigma^{-1/2} = U \Lambda^{-1/2} U^\top$ is the symmetric inverse square root, obtained via the eigen-decomposition $\Sigma = U \Lambda U^\top$. Whitening enforces $\frac{1}{N}\tilde{Z}^\top \tilde{Z} = I$, ensuring all pairwise linear covariances, including those between target and bias features, are removed. In group-structured data, such as neuroimaging, this operation is applied to predefined feature groups using blockwise transformations (Petiton et al., 22 Apr 2026).
CFW generalizes this by introducing a tunable parameter that blends between no transformation (the identity) and full whitening.
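As a concrete illustration of the full-whitening end of this spectrum, the following is a minimal NumPy sketch of symmetric (ZCA-style) whitening via eigen-decomposition. The function name `zca_whiten`, the `1/N` covariance normalization, the small `eps` stabilizer, and the synthetic target/bias features are illustrative assumptions, not details taken from the cited papers.

```python
import numpy as np

def zca_whiten(Z, eps=1e-8):
    """Whiten the rows of Z (N x d) with the symmetric inverse square root
    of the empirical covariance. `eps` is an assumed numerical stabilizer."""
    Zc = Z - Z.mean(axis=0, keepdims=True)        # center each feature
    cov = Zc.T @ Zc / Zc.shape[0]                 # empirical covariance (1/N convention)
    evals, evecs = np.linalg.eigh(cov)            # eigen-decomposition of a symmetric matrix
    inv_sqrt = (evecs / np.sqrt(evals + eps)) @ evecs.T   # U diag(1/sqrt(lambda)) U^T
    return Zc @ inv_sqrt, inv_sqrt

# Synthetic stand-ins for encoder outputs: bias features correlated with target features.
rng = np.random.default_rng(0)
target = rng.normal(size=(2000, 3))
bias = 0.8 * target[:, :2] + 0.6 * rng.normal(size=(2000, 2))
Z = np.concatenate([target, bias], axis=1)

Z_white, W = zca_whiten(Z)
cov_white = Z_white.T @ Z_white / Z_white.shape[0]   # close to the identity matrix
```

After the transform, the cross-covariance block between the first three (target) and last two (bias) columns of `Z_white` is numerically zero, which is the property the section describes.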
2. Controllable Re-Weighted Covariance and Regularization
In scenarios where full whitening would destroy informative dependencies or induce excessive loss of signal (e.g., if targets and biases are themselves correlated in the population), a convex blending of the empirical covariance with an uncorrelated or regularized version is introduced. In bias mitigation for classification:
$$\Sigma_\alpha = (1-\alpha)\,\Sigma_b + \alpha\,\Sigma_u, \qquad \alpha \in [0, 1],$$
where $\Sigma_b$ is the covariance estimated under the observed label/bias distribution, and $\Sigma_u$ is an “unbiased” covariance estimated by enforcing the joint label/bias distribution $p(y, b)$ to be uniform across classes and bias groups (Cho et al., 27 Jul 2025).
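One way to realize such a blend is sketched below. It assumes integer-coded labels `y` and bias attributes `b`, estimates the "unbiased" covariance by re-weighting samples so that every (label, bias) cell carries equal total weight, and adopts the convention that `alpha = 0` returns the biased covariance and `alpha = 1` the unbiased one. The helper names and the re-weighting scheme are assumptions made for illustration and may differ from the estimator in Cho et al.

```python
import numpy as np

def weighted_cov(Z, w):
    """Covariance of the rows of Z under normalized sample weights w."""
    w = w / w.sum()
    mu = w @ Z                                   # weighted mean
    Zc = Z - mu
    return (Zc * w[:, None]).T @ Zc

def blended_cov(Z, y, b, alpha):
    """(1 - alpha) * biased covariance + alpha * re-weighted ("unbiased") covariance."""
    n = Z.shape[0]
    cov_biased = weighted_cov(Z, np.ones(n))     # observed label/bias distribution
    cell = y * (b.max() + 1) + b                 # one integer id per (label, bias) cell
    _, inverse, counts = np.unique(cell, return_inverse=True, return_counts=True)
    w_uniform = 1.0 / counts[inverse]            # each observed cell gets equal total weight
    cov_unbiased = weighted_cov(Z, w_uniform)
    return (1.0 - alpha) * cov_biased + alpha * cov_unbiased

# Synthetic data in which the bias attribute agrees with the label 90% of the time.
rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, size=n)
b = np.where(rng.random(n) < 0.1, 1 - y, y)
Z = rng.normal(size=(n, 4)) + y[:, None] * 1.0 + b[:, None] * np.array([0.0, 0.0, 1.0, 1.0])

S_half = blended_cov(Z, y, b, alpha=0.5)
```

Because both ingredients are positive semi-definite, any convex combination remains a valid covariance and can be passed to the same inverse-square-root routine used for full whitening.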
In neuroimaging, a regularization parameter allows for partial whitening, interpolating between the identity map (no whitening) at one extreme and exact whitening at the other. Closed-form solutions exist for all interpolants due to the quadratic structure of the objective (Petiton et al., 22 Apr 2026).
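The exact interpolation formula of Petiton et al. is not reproduced here. As a hedged illustration of what a partial whitener can look like, the sketch below uses one simple closed-form family, $W_\lambda = \big((1-\lambda) I + \lambda \Sigma\big)^{-1/2}$, which is an assumed parameterization chosen only because it equals the identity at $\lambda = 0$ and the exact whitener at $\lambda = 1$. The name `partial_whitener` and the symbol `lam` are hypothetical.

```python
import numpy as np

def partial_whitener(cov, lam):
    """Assumed interpolation: inverse square root of (1 - lam) * I + lam * cov.
    lam = 0 gives the identity map, lam = 1 gives exact whitening."""
    evals, evecs = np.linalg.eigh(cov)
    shrunk = (1.0 - lam) + lam * evals           # eigenvalues of the shrunk covariance
    return (evecs / np.sqrt(shrunk)) @ evecs.T

def distance_to_identity(cov, lam):
    """Frobenius distance between the transformed covariance and I."""
    W = partial_whitener(cov, lam)
    return np.linalg.norm(W @ cov @ W - np.eye(cov.shape[0]))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
cov = A @ A.T + np.eye(4)                        # well-conditioned synthetic covariance

dists = [distance_to_identity(cov, lam) for lam in (0.0, 0.5, 1.0)]
```

Under this assumed family the residual correlation structure shrinks monotonically as `lam` grows, which mirrors the "degree of whitening" control described above.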
3. Enforcement of Statistical Fairness and Interpretability
CFW’s parameterized whitening enables targeted enforcement of statistical independence objectives. Whitening with respect to the biased covariance $\Sigma_b$ removes all linear correlation between the output and bias, enforcing demographic parity, $\hat{Y} \perp B$. Whitening with respect to the unbiased covariance $\Sigma_u$ instead conditions on the outcome variable, aligning with equalized odds, $\hat{Y} \perp B \mid Y$. By varying the blending weight $\alpha$ between the two covariances, practitioners can interpolate between these fairness criteria while trading off classification utility and strictness of independence.
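To make the two criteria concrete, the sketch below measures how far a set of binary predictions is from demographic parity and from equalized odds. The gap definitions (largest difference in positive-prediction rate across bias groups, overall or within each true class) are a common convention assumed here for illustration, and the function names are hypothetical.

```python
import numpy as np

def demographic_parity_gap(y_pred, b):
    """Largest difference in positive-prediction rate between bias groups."""
    rates = [y_pred[b == g].mean() for g in np.unique(b)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_pred, y_true, b):
    """Largest between-group rate difference, computed within each true class.
    Assumes every (class, group) cell contains at least one sample."""
    gaps = []
    for c in np.unique(y_true):
        in_class = y_true == c
        rates = [y_pred[in_class & (b == g)].mean() for g in np.unique(b)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
b      = np.array([0, 0, 0, 0, 1, 1, 1, 1])

pred_copies_bias = b.copy()       # a predictor that only reads the bias attribute
pred_perfect = y_true.copy()      # a predictor that ignores the bias attribute

dp_biased = demographic_parity_gap(pred_copies_bias, b)
eo_biased = equalized_odds_gap(pred_copies_bias, y_true, b)
dp_fair = demographic_parity_gap(pred_perfect, b)
eo_fair = equalized_odds_gap(pred_perfect, y_true, b)
```

In this balanced toy example the bias-copying predictor has a gap of 1 under both criteria, while the perfect predictor has a gap of 0 under both.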
In neuroimaging, the analogous regularization improves anatomical interpretability. Increasing the degree of whitening leads to greater alignment between learned model weights and biologically meaningful regions, while predictive accuracy remains stable across the range of the regularization parameter (Petiton et al., 22 Apr 2026). A plausible implication is that CFW can systematically control the emergence of domain-relevant feature weightings under linear models.
4. Core Algorithmic Workflow and Computational Complexity
A typical single-layer CFW algorithm (in bias mitigation context) consists of:
- Feature extraction from pre-trained encoders for target and bias attributes.
- Computation of mean and covariance (biased and unbiased).
- Blending of covariance matrices via the trade-off parameter.
- Stable calculation of matrix roots (using, e.g., Newton–Schulz iteration).
- Transformation (whitening) and splitting of features, followed by classification.
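The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the feature arrays are synthetic stand-ins for encoder outputs, the covariance passed in could equally be a blended one, and the coupled Newton–Schulz iteration shown is the standard textbook form with Frobenius-norm pre-scaling rather than the exact routine from either paper. The names `inv_sqrt_newton_schulz` and `cfw_forward` are hypothetical.

```python
import numpy as np

def inv_sqrt_newton_schulz(A, n_iter=30):
    """Approximate A^{-1/2} for a symmetric positive-definite A using the
    coupled Newton-Schulz iteration (matrix products only, no eigen-decomposition)."""
    d = A.shape[0]
    norm = np.linalg.norm(A)                 # Frobenius norm, used to scale eigenvalues into (0, 1]
    Y = A / norm                             # converges to (A / norm)^{1/2}
    Z = np.eye(d)                            # converges to (A / norm)^{-1/2}
    I = np.eye(d)
    for _ in range(n_iter):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    return Z / np.sqrt(norm)                 # undo the pre-scaling

def cfw_forward(z_target, z_bias, mean, cov):
    """Concatenate, whiten with the given statistics, and split back into groups."""
    Z = np.concatenate([z_target, z_bias], axis=1)
    W = inv_sqrt_newton_schulz(cov)
    Zw = (Z - mean) @ W
    d_t = z_target.shape[1]
    return Zw[:, :d_t], Zw[:, d_t:]

# Synthetic stand-ins for target and bias encoder features.
rng = np.random.default_rng(0)
n = 1000
z_target = rng.normal(size=(n, 4))
z_bias = z_target[:, :2] + 0.5 * rng.normal(size=(n, 2))
Z = np.concatenate([z_target, z_bias], axis=1)
mean = Z.mean(axis=0)
cov = (Z - mean).T @ (Z - mean) / n

t_w, b_w = cfw_forward(z_target, z_bias, mean, cov)
cross = t_w.T @ b_w / n                      # target/bias cross-covariance after whitening
```

The whitened target block `t_w` would then be fed to the downstream classifier. The iteration count of 30 is an assumption that is ample for a well-conditioned covariance like this one; poorly conditioned matrices need more iterations or added regularization.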
Pseudocode from (Cho et al., 27 Jul 2025) and (Petiton et al., 22 Apr 2026) covers all steps, including optional covariate regression and cross-validation for parameter selection in non-deep learning settings. For $N$ samples and feature dimension $d$, the dominant computational cost is $O(Nd^2)$ for covariance construction and $O(d^3)$ for matrix inversion, but the modest feature dimensions in practical applications render this overhead minimal relative to overall model training.
5. Empirical Performance and Use-case Studies
Benchmark results demonstrate the domain-generic effectiveness of CFW.
| Benchmark Dataset | Setting | Notable CFW Gains |
|---|---|---|
| Corrupted CIFAR-10 | Image; spurious bias | CFW+Vanilla yields 3–8 pt. gains in unbiased accuracy |
| Biased FFHQ | Face age/gender | Bias-conflicting accuracy rises from 56.2% (Vanilla) to 79.8% (CFW) |
| WaterBirds | Bird species/background | Worst-group accuracy improves from 74.9% to 93.5% |
| Celeb-A | Attribute pairs | Group-gap reduced >15 pts., worst-group acc. of 84.0%/48.2% |
In all these experiments, a single fixed setting of the trade-off parameter provided state-of-the-art or near-optimal results with no dataset-specific tuning. This suggests strong robustness of the blended covariance approach in bias mitigation pipelines (Cho et al., 27 Jul 2025).
In neuroimaging, application to psychiatric classification (bipolar vs. healthy control; schizophrenia vs. healthy control) confirmed that:
- Predictive ROC-AUC and balanced accuracy are invariant to the CFW parameter.
- Interpretability, quantified by overlap with meta-analytic region rankings, increases with the degree of whitening. Thus CFW identifies biologically plausible feature importances without sacrificing discriminative performance (Petiton et al., 22 Apr 2026).
6. Practical Integration, Application Scenarios, and Extensions
CFW is implemented as an independent preprocessing or intermediate pipeline stage. In deep networks, whitening transformations are applied to extracted representations prior to linear classifier layers, with the feature-encoder parameters optionally frozen. In linear-model applications (e.g., neuroimaging), CFW is deployed as a fixed map estimated from training data and applied to both training and evaluation sets, decoupling unsupervised feature transformation from supervised weight estimation.
The framework admits natural integration with cross-validation procedures, ridge or Ledoit–Wolf covariance shrinkage for regularization, and optional structural smoothness constraints (e.g., via graph-Laplacian penalties) in high-dimensional feature spaces (Petiton et al., 22 Apr 2026).
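The fit-on-train, apply-everywhere pattern with ridge regularization can be sketched as below. The class name `FixedWhitener`, its `ridge` default, and the synthetic data are illustrative assumptions. The ridge term keeps the covariance invertible even when there are more features than samples, as is typical in neuroimaging.

```python
import numpy as np

class FixedWhitener:
    """Whitening map estimated once on training data and then held fixed
    (hypothetical helper, not an API from the cited papers)."""

    def __init__(self, ridge=1e-3):
        self.ridge = ridge                       # assumed ridge strength added to the diagonal

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        Xc = X - self.mean_
        cov = Xc.T @ Xc / X.shape[0]
        cov = cov + self.ridge * np.eye(cov.shape[0])    # ridge shrinkage toward the identity
        evals, evecs = np.linalg.eigh(cov)
        self.W_ = (evecs / np.sqrt(evals)) @ evecs.T     # symmetric inverse square root
        return self

    def transform(self, X):
        return (X - self.mean_) @ self.W_        # uses training statistics only

# More features (200) than training samples (60).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 200))
X_train, X_test = X[:60], X[60:]

whitener = FixedWhitener(ridge=1e-3).fit(X_train)
X_train_w = whitener.transform(X_train)
X_test_w = whitener.transform(X_test)            # no test-set statistics leak into the map
```

Inside a cross-validation loop the whitener would be refit on each training fold. A data-driven shrinkage estimate, such as the one produced by scikit-learn's `LedoitWolf` estimator, could be substituted for the fixed ridge term.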
A plausible implication is that CFW provides a general-purpose tool for decorrelating features in any domain where groupwise covariance encodes spurious dependencies, offering interpretable trade-offs between invariance and information preservation.
7. Theoretical and Empirical Limitations
CFW removes only linear correlations. While this is sufficient to enforce fairness or interpretability under linear classification or regression, nonlinear dependencies may persist. However, explicitly modeling higher-order interactions is intractable for high-dimensional features, so CFW offers a computationally favorable compromise (Cho et al., 27 Jul 2025). The regularization parameters (the covariance blending weight and the partial-whitening strength) provide interpretable control over the spectrum of decorrelation with little or no dataset-specific tuning, but care must be taken in domains where critical signal is embedded in covariance structure.
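The restriction to linear correlations noted above can be seen in a small synthetic example, constructed here purely for illustration: a variable that is fully determined by another can still have zero linear correlation with it, so whitening would leave the dependence untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x**2 - 1.0                  # fully determined by x, yet linearly uncorrelated with it

linear_corr = np.corrcoef(x, y)[0, 1]        # near zero: nothing for whitening to remove
nonlinear_corr = np.corrcoef(x**2, y)[0, 1]  # near one: the dependence is still present
```

A nonlinear classifier applied after whitening could therefore still recover such a bias signal.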
A second limitation is the reliance on meaningful feature grouping or splitting (target vs. bias, or anatomical regions). Performance and interpretability depend on sensible encoder or group definitions. Nevertheless, the reported results indicate that CFW is robust to choices of the controlling parameter and generalizes across dataset domains (Cho et al., 27 Jul 2025, Petiton et al., 22 Apr 2026).
References:
- "Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation" (Cho et al., 27 Jul 2025)
- "Improving clinical interpretability of linear neuroimaging models through feature whitening" (Petiton et al., 22 Apr 2026)