Concurvity Regularization in Additive Models

Updated 22 April 2026
  • Concurvity regularization is a differentiable constraint that mitigates non-linear dependencies between feature transformations in additive models.
  • It penalizes pairwise Pearson correlations among transformed features to enforce near-orthogonality and improve interpretability.
  • Integrated into models like GAMs and NAMs, the method has been empirically shown to reduce attribution ambiguity and enhance feature stability.

Concurvity regularization refers to a class of differentiable constraints that act to reduce non-linear dependencies—termed "concurvity"—between the transformed feature representations in generalized additive models (GAMs) and, by extension, in any differentiable additive model. Concurvity is the non-linear generalization of multicollinearity, occurring when a nontrivial non-linear combination of features or their shape functions yields zero, thereby undermining the interpretability of the model and the stability of feature attributions. The principal approach to concurvity regularization is to penalize pairwise correlations among these non-linear feature mappings, thus promoting near-orthogonality in the transformed feature space and yielding interpretable, stable model decompositions that do not suffer from ambiguity or self-cancellation in feature contributions (Siems et al., 2023).

1. Definition and Manifestation of Concurvity in Additive Models

Formally, a standard generalized additive model (GAM) predicts via the expression

g\bigl(\mathbb{E}[Y \mid X]\bigr) = \beta + \sum_{i=1}^{p} f_i(X_i),

where X_i \in \mathbb{R}^N denotes the vector of observations of the i-th feature, the f_i are univariate "shape" functions, and g is a link function. In this setting, concurvity is present if there exist functions (g_1, \dots, g_p) in the admissible function class \mathcal{H} and a constant c_0 such that

c_0 + \sum_{i=1}^{p} g_i(X_i) = 0 \quad \text{(in } \mathbb{R}^N\text{)},

with not all g_i(X_i) being constant.

Concurvity, unlike linear multicollinearity, encompasses all (possibly non-linear) dependencies that allow transformed features to “cancel out” in the aggregate model sum. In practical applications, strong concurvity can manifest as ambiguities in the attribution of importance to features, high variance in shape function estimation, and diminished interpretability due to self-canceling contributions.
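To make this cancellation concrete, consider a toy sketch (my own illustration with hypothetical helper names, not code from the paper): two different parameterizations of an additive model produce identical predictions on duplicated features, so the attribution between the two features is arbitrary.

```python
# Two copies of the same feature: X1 == X2 (perfect concurvity).
X = [0.5, 1.0, 1.5, 2.0]

def predict(a, b, x1, x2):
    # Additive model with linear shape functions f1(x) = a*x, f2(x) = b*x.
    return [a * u + b * v for u, v in zip(x1, x2)]

# Any split with a + b = 1 yields the same predictions on X1 == X2:
p1 = predict(1.0, 0.0, X, X)   # all effect attributed to feature 1
p2 = predict(5.0, -4.0, X, X)  # wildly different attributions, same output
print(p1 == p2)  # True: the nontrivial combination 4*x1 - 4*x2 sums to zero
```

The second parameterization hides a self-canceling component, which is exactly the ambiguity concurvity regularization is designed to remove.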

2. Mathematical Framework for Concurvity Regularization

Concurvity regularization is grounded in enforcing empirical orthogonality (zero Pearson correlation) among all pairs of transformed feature vectors f_i(X_i), f_j(X_j) with i \neq j. The empirical Pearson correlation of two vectors u, v \in \mathbb{R}^N is defined as

\mathrm{corr}(u, v) = \frac{\sum_{k=1}^{N} (u_k - \bar{u})(v_k - \bar{v})}{\sqrt{\sum_{k=1}^{N} (u_k - \bar{u})^2}\,\sqrt{\sum_{k=1}^{N} (v_k - \bar{v})^2}},

where \bar{u} is the mean of the entries of u. The regularizer itself is the average of the absolute pairwise correlations:

R_\perp\bigl(f_1(X_1), \dots, f_p(X_p)\bigr) = \binom{p}{2}^{-1} \sum_{1 \le i < j \le p} \bigl|\mathrm{corr}\bigl(f_i(X_i),\, f_j(X_j)\bigr)\bigr|.

By construction, R_\perp = 0 if and only if all pairs of transformed features are empirically uncorrelated. Enforcing this property prohibits nontrivial zero-sum non-linear combinations, conferring robustness against concurvity except for the trivial solution where all f_i(X_i) are constant (Siems et al., 2023).
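Under the definitions above, the regularizer can be computed directly. The following is a minimal self-contained sketch (function names are illustrative, not from the paper):

```python
import math

def pearson_corr(u, v):
    # Empirical Pearson correlation of two equal-length vectors.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv + 1e-12)  # small epsilon for numerical stability

def concurvity_penalty(transformed):
    # Average absolute pairwise correlation over all (p choose 2) pairs.
    p = len(transformed)
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    return sum(abs(pearson_corr(transformed[i], transformed[j]))
               for i, j in pairs) / len(pairs)

# Identical transforms -> maximal penalty; orthogonal transforms -> zero.
print(round(concurvity_penalty([[1, 2, 3], [1, 2, 3]]), 6))           # ≈ 1.0
print(round(concurvity_penalty([[1, -1, 1, -1], [1, 1, -1, -1]]), 6))  # ≈ 0.0
```

In a differentiable framework, the same arithmetic would be written with tensor operations so that gradients flow through the penalty.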

3. Integration into Differentiable Additive Modeling Workflows

The concurvity regularizer is integrated additively into the standard loss function for differentiable additive models, yielding the following learning objective:

\min_{\beta,\, f_1, \dots, f_p} \; \frac{1}{N} \sum_{n=1}^{N} \ell\bigl(y_n, \hat{y}_n\bigr) + \lambda\, R_\perp\bigl(f_1(X_1), \dots, f_p(X_p)\bigr),

where \lambda \ge 0 controls the trade-off between empirical risk minimization and orthogonality enforcement. In practice, the f_i may be parametrized as differentiable neural blocks (e.g., MLPs in Neural Additive Models, NAMs, or Fourier-series blocks in NeuralProphet). The regularizer R_\perp is evaluated on each minibatch, allowing stochastic gradient-based optimization. The loss is propagated and optimized using standard frameworks (e.g., PyTorch, JAX) with explicit care for numerical stability in the correlation computation.

Pseudocode Sketch (Siems et al., 2023; variable names here are illustrative):

    # contribs[i] = f_i(X[:, i]) for the current minibatch
    # y_hat = beta + sum(contribs)
    R = 0
    for i in range(p):
        for j in range(i + 1, p):
            R += abs(pearson_corr(contribs[i], contribs[j]))
    R /= p * (p - 1) / 2                    # average over all pairs
    loss = empirical_risk(y, y_hat) + lam * R

This approach is agnostic to whether data are tabular or temporal, since seasonality and trend blocks in time series models are also treated as univariate shape functions.
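As a toy illustration of the temporal case (my own construction, assuming a linear trend block and a sinusoidal seasonal block), the same pairwise statistic applies unchanged to time-series components:

```python
import math

T = [t / 10 for t in range(50)]                  # time axis, 5 full periods
trend = [0.3 * t for t in T]                     # linear trend block
season = [math.sin(2 * math.pi * t) for t in T]  # seasonal (Fourier) block

# The Pearson statistic used for tabular features applies to these blocks too.
n = len(T)
mu_a, mu_b = sum(trend) / n, sum(season) / n
cov = sum((a - mu_a) * (b - mu_b) for a, b in zip(trend, season))
corr = cov / (math.sqrt(sum((a - mu_a) ** 2 for a in trend))
              * math.sqrt(sum((b - mu_b) ** 2 for b in season)))
print(abs(corr) < 0.2)  # near-orthogonal blocks incur little penalty
```

A misspecified seasonal block that drifted with the trend would instead show a large absolute correlation and be penalized accordingly.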

4. Theoretical Properties and Empirical Validation

The regularizer is theoretically justified: enforcing pairwise orthogonality between non-linear feature transforms guarantees—other than the degenerate case of all-constant transforms—the absence of nontrivial concurvity. In empirical studies, several benchmark tasks illustrate both the pathology of concurvity and the rectification via regularization:

  • In settings where two features are identical or perfectly correlated, unregularized NAMs split the influence arbitrarily; regularization drives one feature to capture the effect, rendering feature attribution unique.
  • In the presence of a deterministic non-linear dependency between features, the regularizer eliminates concurvity, and feature attributions become decorrelated without loss of target fit for moderate regularization strength.
  • In time series decomposition (e.g., NeuralProphet with daily/weekly components), concurvity regularization yields interpretable, non-overlapping shape functions corresponding to true underlying periodicities, unlike overparametrized or unconstrained models with self-canceling or ambiguous seasonal patterns.
  • Across multiple real tabular datasets, a moderate value of \lambda reduces the concurvity measure R_\perp by more than an order of magnitude, with only a minor (few-percent) increase in RMSE or AUC.
| Setting | Without Regularization | With Concurvity Regularization |
|---|---|---|
| Perfectly correlated features | Ambiguous splits; high correlation | Unique feature attribution; low R_\perp |
| Non-linear dependencies | Self-canceling shape functions | Decorrelation; interpretable sum |
| Real data (California Housing) | High variance in shape estimation | Stable shape plots; low variance |
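The first row of the table can be reproduced with a toy computation (a hedged illustration with made-up helper names): two parameterizations fit duplicated features equally well, but the concurvity penalty breaks the tie in favor of the unique attribution.

```python
import math

def abs_corr(u, v):
    # Absolute empirical Pearson correlation (0 if either vector is constant).
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return abs(cov) / (su * sv) if su > 0 and sv > 0 else 0.0

X = [1.0, 2.0, 3.0, 4.0]  # X1 == X2: perfectly correlated features
y = list(X)               # target equals the shared feature

def objective(a, b, lam=0.1):
    # Regularized loss for linear shape functions f1(x) = a*x, f2(x) = b*x.
    f1 = [a * x for x in X]
    f2 = [b * x for x in X]
    yhat = [u + v for u, v in zip(f1, f2)]
    mse = sum((t - p) ** 2 for t, p in zip(y, yhat)) / len(y)
    return mse + lam * abs_corr(f1, f2)

print(objective(0.5, 0.5))  # perfect fit, but penalty ≈ 1 -> objective ≈ 0.1
print(objective(1.0, 0.0))  # perfect fit, zero penalty -> objective 0.0
```

Both candidates achieve zero empirical risk, so only the penalty term distinguishes them, selecting the solution in which a single feature carries the effect.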

The stability of feature importance, as measured by variance across random initializations, is markedly improved once regularization is applied.

5. Applications and Context within Explainable Machine Learning

Concurvity regularization directly addresses one of the central interpretability challenges in additive modeling. By decorrelating the contributions of non-linear transformations, it provides more faithful feature importance estimates and eliminates self-cancellation phenomena. The technique generalizes across a variety of model classes—including classical GAMs, NAMs, and NeuralProphet—and is particularly impactful in domains (such as structured tabular or time series modeling) where feature dependencies are prevalent and model interpretability is paramount.

A plausible implication is that, as additive models continue to be deployed in domains demanding transparency (e.g., scientific and high-stakes decision-making), concurvity regularization will be indispensable for robust, interpretable inference in the presence of feature dependencies.

6. Summary and Limitations

Concurvity regularization is a conceptually simple, differentiable penalty on pairwise transformed-feature correlations. It can be integrated into any differentiable additive model estimator. The method significantly reduces non-linear dependencies among feature contributions and yields stable, interpretable decompositions with a mild trade-off in predictive accuracy when appropriately tuned. While enforcing exact orthogonality is too restrictive, the soft penalty provides a practical and effective compromise for modern gradient-based optimization frameworks. No substantial computational overhead is introduced, as correlations are efficiently estimated on minibatches (Siems et al., 2023).

References

Siems, J., et al. (2023). "Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models." Advances in Neural Information Processing Systems (NeurIPS 2023).