Causal Manifold Fairness (CMF)
- Causal Manifold Fairness (CMF) is a framework that defines fairness as invariant manifold geometry in latent space using causal interventions.
- It employs autoencoder models with metric tensor and curvature constraints to align representations across sensitive attribute interventions.
- The approach optimizes task performance while penalizing geometric discrepancies, clearly quantifying the fairness-utility trade-off.
Causal Manifold Fairness (CMF) is a framework for representation learning in which fairness is defined and enforced at the level of manifold geometry in latent space, taking explicit account of the causal effects of sensitive attributes on the data-generating process. Rather than treating group membership as a simple shift or perturbation of data distributions, CMF posits and operationalizes the causal warping of the data manifold itself. By constraining the local Riemannian geometry—quantified via metric tensors and curvature—of autoencoder representations to remain invariant across counterfactual interventions on sensitive attributes, CMF enables geometric invariance that translates into downstream counterfactual fairness, while also explicitly quantifying the fairness-utility trade-off via geometric metrics (Rathore, 6 Jan 2026).
1. Latent Manifolds and Riemannian Geometry in Representation Learning
CMF is fundamentally built upon autoencoder-style models, with an encoder $f_\theta$ mapping input data $x$ to a latent variable $z = f_\theta(x)$ and a decoder (generator) $g_\theta$ reconstructing the input as $\hat{x} = g_\theta(z)$. Given the decoder $g: \mathcal{Z} \to \mathcal{X}$, which is assumed to be a smooth map from latent space $\mathcal{Z}$ to data space $\mathcal{X}$, its image $g(\mathcal{Z})$ forms a differentiable manifold $\mathcal{M}$.
The geometry of this manifold is determined by how $g$ transforms local neighborhoods in $\mathcal{Z}$: the metric tensor at $z$ is defined as the pullback of the Euclidean metric from $\mathcal{X}$:

$$G(z) = J_g(z)^\top J_g(z),$$

where $J_g(z) = \partial g(z)/\partial z$ is the decoder Jacobian. The squared length in the data space of an infinitesimal tangent vector $dz$ in latent space is then approximated by $dz^\top G(z)\, dz$.
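The pullback metric can be computed directly with automatic differentiation. The sketch below uses a hypothetical two-to-three-dimensional decoder (not the paper's architecture) purely to make the construction concrete:

```python
import torch

def decoder(z):
    # Hypothetical smooth decoder g: R^2 -> R^3 (illustration only).
    return torch.stack([z[0] * torch.cos(z[1]),
                        z[0] * torch.sin(z[1]),
                        z[1]])

def pullback_metric(g, z):
    # G(z) = J_g(z)^T J_g(z): the pullback of the Euclidean metric on data space.
    J = torch.autograd.functional.jacobian(g, z)  # shape (3, 2)
    return J.T @ J                                # shape (2, 2)

z = torch.tensor([2.0, 0.5])
G = pullback_metric(decoder, z)

# Squared data-space length of an infinitesimal latent step dz: dz^T G(z) dz.
dz = torch.tensor([0.01, 0.0])
sq_len = dz @ G @ dz
```

For this decoder the metric works out analytically to $\mathrm{diag}(1, z_0^2 + 1)$, so the autodiff result can be checked by hand.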
Curvature information, encoding second-order geometric structure, is given by the output-wise Hessians $H_k(z) = \nabla^2_z\, g_k(z)$ for $k = 1, \dots, D$. This decomposition provides a means to capture "bending" and "twisting" of the manifold under variations in $z$, and is essential for the geometric invariances targeted by CMF.
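Second-order structure can be extracted the same way, with one Hessian per output coordinate. Again the decoder below is a hypothetical stand-in used only for illustration:

```python
import torch

def decoder(z):
    # Hypothetical smooth decoder g: R^2 -> R^3 (illustration only).
    return torch.stack([z[0] * torch.cos(z[1]),
                        z[0] * torch.sin(z[1]),
                        z[1]])

def output_hessians(g, z):
    # One Hessian H_k(z) = Hess_z g_k(z) per output coordinate k,
    # capturing the second-order "bending" of the manifold at z.
    D = g(z).shape[0]
    return [torch.autograd.functional.hessian(lambda v, k=k: g(v)[k], z)
            for k in range(D)]

z = torch.tensor([2.0, 0.5])
H = output_hessians(decoder, z)  # list of three 2x2 symmetric matrices
```

The third output is linear in $z$, so its Hessian vanishes; the curved outputs have nonzero mixed derivatives.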
2. Causal Modeling and Counterfactual Structure
CMF introduces a structural causal model (SCM) with the tuple $(U, A, X, Y)$, where $U$ denotes latent intrinsic variables, $A$ is the sensitive attribute (e.g., gender), $X$ denotes observed features, and $Y$ denotes the target. The essential postulate is that the sensitive attribute causally "warps" the generative process mapping $(U, A)$ to $X$, thereby affecting the geometry of the observed manifold.
Counterfactual interventions, $do(A = a')$, correspond to replacing $A$'s value in the generative process and obtaining a counterfactual sample $x^{\mathrm{cf}}$. Passing $x^{\mathrm{cf}}$ through the encoder yields counterfactual latent variables $z^{\mathrm{cf}} = f_\theta(x^{\mathrm{cf}})$. The local geometry at $z^{\mathrm{cf}}$, as captured by the metric $G(z^{\mathrm{cf}})$ and Hessians $H_k(z^{\mathrm{cf}})$, is required to match the geometry at $z$ under the original attribute value:

$$G(z) = G(z^{\mathrm{cf}}), \qquad H_k(z) = H_k(z^{\mathrm{cf}})$$

for $k = 1, \dots, D$. This enforces invariance of geometric structure under counterfactual manipulations of $A$.
3. Objective Functions and Geometric Regularization
CMF integrates geometric fairness directly into the training objective by imposing penalties on both metric and curvature discrepancies induced by $A$. The total objective is

$$\mathcal{L} = \mathcal{L}_{\mathrm{task}} + \lambda_J \mathcal{L}_J + \lambda_H \mathcal{L}_H,$$

where:
- $\mathcal{L}_{\mathrm{task}}$ comprises the utility-driven losses: reconstruction loss $\|x - \hat{x}\|^2$ and prediction loss (cross-entropy or regression on $Y$ from $z$).
- $\mathcal{L}_J$ is the Jacobian (metric) penalty, $\mathbb{E}\left[\|J_g(z) - J_g(z^{\mathrm{cf}})\|_F^2\right]$, aligning first-order geometry.
- $\mathcal{L}_H$ is the Hessian (curvature) penalty, $\mathbb{E}\left[\sum_k \|H_k(z) - H_k(z^{\mathrm{cf}})\|_F^2\right]$, enforcing invariance of second-order structure.
The hyperparameters $\lambda_J$ and $\lambda_H$ determine the trade-off: increasing them reduces geometric bias (fairness violation) at the potential cost of utility (higher reconstruction/prediction loss). The fairness-utility trade-off is quantifiable via geometric errors and task metrics (Rathore, 6 Jan 2026).
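The two penalties follow directly from these definitions. The sketch below evaluates them for a single factual/counterfactual latent pair; the decoder is a hypothetical toy map, and batch averaging is omitted for clarity:

```python
import torch

def decoder(z):
    # Hypothetical decoder g: R^2 -> R^3, used only to make the penalties concrete.
    return torch.stack([z[0] * torch.cos(z[1]),
                        z[0] * torch.sin(z[1]),
                        z[1]])

def metric_penalty(g, z, z_cf):
    # L_J term: squared Frobenius gap between factual and counterfactual Jacobians.
    J = torch.autograd.functional.jacobian(g, z)
    J_cf = torch.autograd.functional.jacobian(g, z_cf)
    return ((J - J_cf) ** 2).sum()

def curvature_penalty(g, z, z_cf):
    # L_H term: summed squared Frobenius gaps of the per-output Hessians.
    D = g(z).shape[0]
    total = torch.tensor(0.0)
    for k in range(D):
        gk = lambda v, k=k: g(v)[k]
        Hk = torch.autograd.functional.hessian(gk, z)
        Hk_cf = torch.autograd.functional.hessian(gk, z_cf)
        total = total + ((Hk - Hk_cf) ** 2).sum()
    return total

z = torch.tensor([2.0, 0.5])
L_J = metric_penalty(decoder, z, z)      # identical points: penalty is exactly zero
L_H = curvature_penalty(decoder, z, z)
L_J_gap = metric_penalty(decoder, z, torch.tensor([1.0, 0.2]))  # mismatched geometry
```

When the factual and counterfactual latents coincide both penalties vanish, and any geometric mismatch makes them strictly positive, which is what the regularizer exploits during training.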
4. Theoretical Guarantees and Interpretations
The central theoretical proposition of CMF is a geometric isometry guarantee under perfect alignment: if, for all $z$ and $k$,

$$J_g(z) = J_g(z^{\mathrm{cf}}), \qquad H_k(z) = H_k(z^{\mathrm{cf}}),$$

then the decoder is locally an isometry between the manifolds parameterized by the intervention on $A$. Consequently, data points that differ only in $A$ are mapped to regions of latent space exhibiting identical local metric and curvature, enabling any predictor on $z$ to inherit counterfactual fairness.
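The first-order half of this guarantee follows immediately from the pullback definition of the metric:

```latex
J_g(z) = J_g(z^{\mathrm{cf}})
\;\Longrightarrow\;
G(z) = J_g(z)^{\top} J_g(z)
     = J_g(z^{\mathrm{cf}})^{\top} J_g(z^{\mathrm{cf}})
     = G(z^{\mathrm{cf}}),
```

so every tangent vector $dz$ has the same squared length $dz^\top G(z)\, dz$ at corresponding factual and counterfactual points, which is precisely the local-isometry condition; matching Hessians extends the agreement to second order.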
A Taylor-expansion argument further bounds the disparity in predicted outcomes under $do(A = a)$ versus $do(A = a')$ by the residual task loss and higher-order terms in the derivatives of $g$, conditional on the fairness penalties being minimized. In practice, the framework yields a continuous fairness-utility trade-off curve as geometric regularization is increased.
5. Empirical Evaluation and Results
The CMF approach is validated on a synthetic SCM comprising a "warped Swiss roll," in which the latent intrinsic variable $U$ parameterizes position along the roll and the sensitive attribute $A$ modulates its winding. This construction yields a data manifold whose tightness or twist varies with $A$, exemplifying geometric warping due to the sensitive attribute.
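The paper's exact structural equations are not reproduced above, so the generator below is an assumed stand-in: the sensitive attribute simply scales the winding rate of the roll, and the function name, constants, and target definition are all illustrative:

```python
import numpy as np

def warped_swiss_roll(n, a, rng, twist=(1.0, 1.5)):
    # Hypothetical SCM sketch: the sensitive attribute a in {0, 1} scales
    # the winding rate, so the manifold's twist depends on group membership.
    u = rng.uniform(1.0, 3.0, size=n)            # latent intrinsic variable U
    t = twist[a] * np.pi * u                     # A warps the roll angle
    x = np.stack([t * np.cos(t),
                  t * np.sin(t),
                  rng.normal(0.0, 0.1, n)], axis=1)
    y = (u > 2.0).astype(int)                    # simple downstream target Y
    return x, y

rng = np.random.default_rng(0)
x0, y0 = warped_swiss_roll(500, 0, rng)   # looser roll for group 0
x1, y1 = warped_swiss_roll(500, 1, rng)   # tighter roll for group 1
```

Because the same $U$ range maps to different angular extents per group, the two groups occupy differently curved copies of the same intrinsic one-dimensional structure.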
Autoencoder architectures comprise 3-layer MLP encoders and decoders with ELU activations, implemented with smoothness sufficient for metric and curvature computations. Jacobians and Hessians are obtained via PyTorch autograd. The following metrics are used:
- Utility: classification accuracy on $Y$, reconstruction MSE;
- Fairness: MetricErr, the mean Frobenius discrepancy between factual and counterfactual Jacobians, and CurvErr, the corresponding discrepancy between the per-output Hessians.
Representative results are:
| Model | Acc (↑) | MSE (↓) | MetricErr (↓) | CurvErr (↓) |
|---|---|---|---|---|
| Baseline AE | 1.000 | 0.070 | 16.39 | 4.32 |
| CMF (ours) | 0.995 | 0.754 | 0.018 | 0.046 |
The baseline achieves perfect accuracy and low reconstruction error but at the expense of high geometric error, effectively learning separate manifolds for each group. CMF, by contrast, retains nearly perfect task performance while dramatically reducing metric and curvature error, signifying near-perfect geometric invariance. As the regularization coefficients $\lambda_J$ and $\lambda_H$ increase, geometric errors tend toward zero while MSE increases, quantifying the fairness-utility trade-off. Ablation experiments confirm that setting $\lambda_H = 0$ enforces only first-order fairness (small MetricErr but large CurvErr), while increasing $\lambda_H$ reduces both errors at the cost of greater reconstruction error.
6. Illustrative Example and Algorithmic Workflow
A canonical toy example involves a scalar latent $z$ and a scalar output $x$, with group-specific decoders $g_0$ and $g_1$ whose first and second derivatives differ, yielding nonzero metric and curvature errors. CMF optimizes for a common decoder $g$ whose derivatives are chosen to jointly minimize the geometric penalties, balancing the two groups according to the regularization parameters. In higher dimensions, this optimization is performed via gradient descent.
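The paper's specific group decoders are not reproduced above, so the worked example below assumes hypothetical quadratics $g_0(z) = 2z + z^2$ and $g_1(z) = z + 3z^2$. For quadratic decoders with equal penalty weights, the common decoder minimizing the summed squared Jacobian and Hessian gaps is the coefficient-wise average:

```python
import numpy as np

# Hypothetical group-specific decoders (assumed, not from the paper):
# g0(z) = 2 z + 1 z^2  and  g1(z) = 1 z + 3 z^2.
coef0 = np.array([2.0, 1.0])   # (linear, quadratic) coefficients of g0
coef1 = np.array([1.0, 3.0])   # (linear, quadratic) coefficients of g1

# For g(z) = a z + b z^2, the Jacobian is a + 2 b z and the Hessian is 2 b.
# Both are linear in (a, b), so with equal weight on the two groups the
# summed squared gaps are minimized by the coefficient-wise average.
coef_common = (coef0 + coef1) / 2

def hessian_of(coef):
    return 2 * coef[1]

gap_before = abs(hessian_of(coef0) - hessian_of(coef1))       # curvature gap 4.0
gap_after = max(abs(hessian_of(coef_common) - hessian_of(c))
                for c in (coef0, coef1))                      # reduced to 2.0
```

The common decoder halves the worst-case curvature mismatch, at the cost of reconstructing neither group exactly, which is the fairness-utility trade-off in miniature.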
A representative training loop is as follows:
```
for each minibatch {x_i, A_i, Y_i}:
    z_i      = f_theta(x_i)                 # encode
    xhat_i   = g_theta(z_i)                 # reconstruct
    Yhat_i   = predict(z_i)                 # task head on the latent
    draw s' != A_i                          # counterfactual attribute value
    x_cf_i   = g_star(U_i, s')              # counterfactual sample from the SCM
    z_cf_i   = f_theta(x_cf_i)
    J_i      = Jac(g, z_i)
    J_cf_i   = Jac(g, z_cf_i)
    H_i_k    = Hess(g_k, z_i)               # for each output k
    H_cf_i_k = Hess(g_k, z_cf_i)            # for each output k
    L_task   = BCE(Y_i, Yhat_i) + norm(x_i - xhat_i)**2
    L_J      = frobenius(J_i - J_cf_i)**2
    L_H      = sum_k frobenius(H_i_k - H_cf_i_k)**2
    L_geo    = lambda_J * L_J + lambda_H * L_H
    L        = L_task + L_geo
    update theta by gradient descent on L
```
This workflow realizes the end-to-end enforcement of geometric invariance under causal interventions, establishing the local isometry required for counterfactual fairness in learned representations (Rathore, 6 Jan 2026).
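One such step can be made executable in PyTorch. The sketch below assumes toy dimensions (3-D data, 2-D latent), omits the task head for brevity, and takes the counterfactual sample as given rather than generating it from an SCM; all architecture and hyperparameter choices are illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

enc = nn.Sequential(nn.Linear(3, 8), nn.ELU(), nn.Linear(8, 2))   # f_theta
dec = nn.Sequential(nn.Linear(2, 8), nn.ELU(), nn.Linear(8, 3))   # g_theta
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
lambda_J, lambda_H = 1.0, 0.1   # illustrative regularization weights

def geo_penalties(z, z_cf):
    # Jacobian (metric) and Hessian (curvature) gaps for one latent pair.
    g = lambda v: dec(v)
    J = torch.autograd.functional.jacobian(g, z, create_graph=True)
    J_cf = torch.autograd.functional.jacobian(g, z_cf, create_graph=True)
    L_J = ((J - J_cf) ** 2).sum()
    L_H = torch.tensor(0.0)
    for k in range(3):
        gk = lambda v, k=k: dec(v)[k]
        Hk = torch.autograd.functional.hessian(gk, z, create_graph=True)
        Hk_cf = torch.autograd.functional.hessian(gk, z_cf, create_graph=True)
        L_H = L_H + ((Hk - Hk_cf) ** 2).sum()
    return L_J, L_H

# One training step on a single (factual, counterfactual) pair of toy points.
x = torch.randn(3)       # observed sample
x_cf = torch.randn(3)    # its counterfactual under do(A = a'), assumed given
z, z_cf = enc(x), enc(x_cf)
L_task = ((dec(z) - x) ** 2).sum()   # reconstruction only, for brevity
# Latents are detached here so the geometric gradients flow to the decoder
# only; this is a simplification of the full end-to-end objective.
L_J, L_H = geo_penalties(z.detach(), z_cf.detach())
loss = L_task + lambda_J * L_J + lambda_H * L_H
opt.zero_grad()
loss.backward()
opt.step()
```

`create_graph=True` keeps the Jacobian and Hessian computations differentiable, so the geometric penalties themselves can be minimized by gradient descent on the decoder parameters.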