Concurvity Regularization in Additive Models
- Concurvity regularization is a differentiable constraint that mitigates non-linear dependencies between feature transformations in additive models.
- It penalizes pairwise Pearson correlations among transformed features to enforce near-orthogonality and improve interpretability.
- Integrated into models like GAMs and NAMs, the method has been empirically shown to reduce attribution ambiguity and enhance feature stability.
Concurvity regularization refers to a class of differentiable constraints that act to reduce non-linear dependencies—termed "concurvity"—between the transformed feature representations in generalized additive models (GAMs) and, by extension, in any differentiable additive model. Concurvity is the non-linear generalization of multicollinearity, occurring when a nontrivial non-linear combination of features or their shape functions yields zero, thereby undermining the interpretability of the model and the stability of feature attributions. The principal approach to concurvity regularization is to penalize pairwise correlations among these non-linear feature mappings, thus promoting near-orthogonality in the transformed feature space and yielding interpretable, stable model decompositions that do not suffer from ambiguity or self-cancellation in feature contributions (Siems et al., 2023).
1. Definition and Manifestation of Concurvity in Additive Models
Formally, a standard generalized additive model (GAM) predicts via the expression

$$g\big(\mathbb{E}[y \mid x]\big) = \beta_0 + \sum_{i=1}^{p} f_i(x_i),$$

where $x = (x_1, \dots, x_p)$ denotes the features, $f_i$ are the univariate "shape" functions, and $g$ is a link function. In this setting, concurvity is present if there exist functions $\tilde{f}_1, \dots, \tilde{f}_p$ in the admissible function class and a constant $c$ such that

$$\sum_{i=1}^{p} \tilde{f}_i(x_i) = c \quad \text{almost surely},$$

with not all $\tilde{f}_i$ being constant functions.
Concurvity, unlike linear multicollinearity, encompasses all (possibly non-linear) dependencies that allow transformed features to “cancel out” in the aggregate model sum. In practical applications, strong concurvity can manifest as ambiguities in the attribution of importance to features, high variance in shape function estimation, and diminished interpretability due to self-canceling contributions.
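As a minimal illustration (a constructed example, not taken from the paper): suppose the dataset contains a duplicated feature, $x_2 = x_1$. Then the choice

$$\tilde{f}_1(x_1) = x_1, \qquad \tilde{f}_2(x_2) = -x_2 \quad\Longrightarrow\quad \tilde{f}_1(x_1) + \tilde{f}_2(x_2) = 0 \ \text{almost surely}$$

witnesses concurvity: any multiple of this pair can be added to the fitted shape functions without changing the model's predictions, so the additive decomposition and the resulting feature attributions are not identifiable.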
2. Mathematical Framework for Concurvity Regularization
Concurvity regularization is grounded in enforcing empirical orthogonality (zero Pearson correlation) among all pairs of transformed feature vectors $f_i(X_i)$, $f_j(X_j)$ with $i \neq j$, where $X_i \in \mathbb{R}^n$ collects the $n$ observed values of feature $i$. The empirical Pearson correlation of two vectors $u, v \in \mathbb{R}^n$ is defined as

$$\operatorname{Corr}(u, v) = \frac{\sum_{k=1}^{n} (u_k - \bar{u})(v_k - \bar{v})}{\sqrt{\sum_{k=1}^{n} (u_k - \bar{u})^2}\,\sqrt{\sum_{k=1}^{n} (v_k - \bar{v})^2}},$$

where $\bar{u}$ is the mean of vector $u$. The regularizer itself is the average of the absolute pairwise correlations:

$$\mathcal{R}(f_1, \dots, f_p) = \frac{2}{p(p-1)} \sum_{i=1}^{p} \sum_{j=i+1}^{p} \left| \operatorname{Corr}\big(f_i(X_i), f_j(X_j)\big) \right|.$$

By construction, $\mathcal{R} = 0$ if and only if all pairs of transformed features are empirically uncorrelated. Enforcing this property prohibits nontrivial zero-sum non-linear combinations, conferring robustness against concurvity except for the trivial solution where all $f_i$ are constants (Siems et al., 2023).
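A minimal PyTorch sketch of this measure (our illustration, not the authors' reference code), assuming the transformed features arrive as a `(batch, p)` tensor whose $i$-th column holds $f_i$ evaluated on the minibatch:

```python
import torch

def concurvity_penalty(transformed: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Average absolute pairwise Pearson correlation of the columns of `transformed` (p >= 2)."""
    centered = transformed - transformed.mean(dim=0, keepdim=True)
    norms = centered.norm(dim=0, keepdim=True) + eps   # per-column L2 norm; eps guards constant columns
    normalized = centered / norms                      # centered columns scaled to unit norm
    corr = normalized.T @ normalized                   # (p, p) Pearson correlation matrix
    p = corr.shape[0]
    off_diag = ~torch.eye(p, dtype=torch.bool, device=corr.device)
    return corr[off_diag].abs().mean()                 # mean |corr| over all pairs i != j
```

Normalizing the centered columns to unit norm makes the Gram matrix coincide with the Pearson correlation matrix, so a single matrix product yields all pairwise correlations; the `eps` term guards against division by zero for (near-)constant columns.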
3. Integration into Differentiable Additive Modeling Workflows
The concurvity regularizer is integrated additively into the standard loss function for differentiable additive models, yielding the following learning objective:
$$\min_{\theta} \; \frac{1}{n} \sum_{k=1}^{n} \ell\big(y_k, \hat{y}_k\big) \;+\; \lambda \, \mathcal{R}(f_1, \dots, f_p),$$

where $\hat{y}_k$ denotes the model's prediction for the $k$-th example and $\lambda \geq 0$ controls the trade-off between empirical risk minimization and orthogonality enforcement. In practice, the shape functions $f_i$ may be parametrized as differentiable neural blocks (e.g., MLPs in Neural Additive Models, NAMs, or Fourier-series blocks in NeuralProphet). The regularizer $\mathcal{R}$ is evaluated on each minibatch, allowing for stochastic gradient-based optimization. The loss is propagated and optimized using standard frameworks (e.g., PyTorch, JAX) with explicit care for numerical stability in the correlation computation.
Pseudocode Sketch:
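A minimal sketch of one training step (our construction, reusing `concurvity_penalty` from Section 2; the NAM-style `model`, the squared-error task loss, and `lam` for $\lambda$ are illustrative assumptions, not the authors' reference implementation):

```python
import torch
import torch.nn.functional as F

def training_step(model, x, y, lam, optimizer):
    contributions = model(x)                  # (batch, p): column i holds f_i(x_i)
    y_hat = contributions.sum(dim=1)          # additive prediction (identity link)
    loss = F.mse_loss(y_hat, y) + lam * concurvity_penalty(contributions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()
```

The only structural requirement is that the model expose its per-feature contributions rather than just the aggregate prediction, so the penalty can be computed on the transformed features.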
This approach is agnostic to whether data are tabular or temporal, since seasonality and trend blocks in time series models are also treated as univariate shape functions.
4. Theoretical Properties and Empirical Validation
The regularizer is theoretically justified: enforcing pairwise orthogonality between non-linear feature transforms guarantees—other than the degenerate case of all-constant transforms—the absence of nontrivial concurvity. In empirical studies, several benchmark tasks illustrate both the pathology of concurvity and the rectification via regularization:
- In settings where two features are perfectly correlated (e.g., a duplicated feature $x_2 = x_1$), unregularized NAMs split the influence arbitrarily between them; regularization drives one feature to capture the effect, rendering feature attribution unique (see the sketch after this list).
- In the presence of a deterministic non-linear dependency (e.g., $x_2 = h(x_1)$ for some non-linear $h$), concurvity is eliminated, and feature attributions become decorrelated without loss of target fit for moderate regularization strength.
- In time series decomposition (e.g., NeuralProphet with daily/weekly components), concurvity regularization yields interpretable, non-overlapping shape functions corresponding to true underlying periodicities, unlike overparametrized or unconstrained models with self-canceling or ambiguous seasonal patterns.
- Across multiple real tabular datasets, a moderate value of $\lambda$ reduces the measured concurvity $\mathcal{R}$ by more than an order of magnitude, with predictive performance (RMSE or AUC) degrading by only a few percent.
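The perfectly-correlated setting from the first bullet can be reproduced in a few lines (a constructed toy check, again reusing `concurvity_penalty` from Section 2):

```python
import torch

torch.manual_seed(0)
x1 = torch.randn(1000)
f1, f2 = x1, -x1.clone()                 # shape functions on a duplicated feature that cancel exactly
feats = torch.stack([f1, f2], dim=1)     # (1000, 2) matrix of transformed features
print(concurvity_penalty(feats).item())  # ~1.0: maximal pairwise |correlation|
print((f1 + f2).abs().max().item())      # 0.0: the contributions self-cancel in the sum
```

A large penalty on a pair whose contributions sum to a constant is exactly the signal the regularizer exploits to break such ties during training.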
Table: Empirical Effects of Concurvity Regularization (Siems et al., 2023)
| Setting | Without Regularization | With Concurvity Regularization |
|---|---|---|
| Perfectly correlated features | Ambiguous splits; high correlation | Unique feature attribution; low $\mathcal{R}$ |
| Non-linear dependencies | Self-canceling shape functions | Decorrelation; interpretable sum |
| Real data (California Housing) | High variance in shape estimation | Stable shape plots; low variance |
The stability of feature importance, as measured by variance across random initializations, is markedly improved once regularization is applied.
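One way to quantify this stability (our sketch; the paper's exact protocol may differ) is to retrain under several random seeds, collect each run's per-feature contributions, and compare the across-seed variance of a simple importance score:

```python
import numpy as np

def importance_variance(contributions_per_seed):
    # contributions_per_seed: list over seeds of (n_samples, p) arrays holding f_i(x_i)
    stacked = np.stack(contributions_per_seed)   # (n_seeds, n_samples, p)
    importance = np.abs(stacked).mean(axis=1)    # mean |contribution| per seed and feature
    return importance.var(axis=0)                # across-seed variance, one value per feature
```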
5. Applications and Context within Explainable Machine Learning
Concurvity regularization directly addresses one of the central interpretability challenges in additive modeling. By decorrelating the contributions of non-linear transformations, it provides more faithful feature importance estimates and eliminates self-cancellation phenomena. The technique generalizes across a variety of model classes—including classical GAMs, NAMs, and NeuralProphet—and is particularly impactful in domains (such as structured tabular or time series modeling) where feature dependencies are prevalent and model interpretability is paramount.
A plausible implication is that, as additive models continue to be deployed in domains demanding transparency (e.g., scientific and high-stakes decision-making), concurvity regularization will be indispensable for robust, interpretable inference in the presence of feature dependencies.
6. Summary and Limitations
Concurvity regularization is a conceptually simple, differentiable penalty on pairwise transformed-feature correlations. It can be integrated into any differentiable additive model estimator. The method significantly reduces non-linear dependencies among feature contributions and yields stable, interpretable decompositions with a mild trade-off in predictive accuracy when appropriately tuned. While enforcing exact orthogonality is too restrictive, the soft penalty provides a practical and effective compromise for modern gradient-based optimization frameworks. No substantial computational overhead is introduced, as correlations are efficiently estimated on minibatches (Siems et al., 2023).