Average Marginal Component Effects (AMCEs)
- AMCEs are formal measures that quantify the expected change in outcomes when a feature is altered, averaging over other variables.
- They are computed by contrasting predictions at different feature levels using causal models, ensuring isolated marginal effects even in complex architectures.
- AMCEs are pivotal in applications such as conjoint analysis, policy evaluation, and transparent feature attribution in advanced predictive models.
Average Marginal Component Effects (AMCEs) are formal quantities for expressing the expected causal influence of feature manipulations in multivariate predictive and causal models. Originally developed in the context of conjoint analysis and widely adopted across causal inference, AMCEs operationalize the expected change in an outcome when a single feature is perturbed—from one value or "level" to another—averaged over the empirical or structural distribution of all other features. In the structural framework, AMCEs provide a rigorous, population-level measure of direct and indirect effects, foundational for both interpretability and causal analysis, and are central to evaluating feature contributions in complex models, including deep neural networks (Thielmann et al., 11 Apr 2025).
1. Formal Definition and Context
Let $f: \mathcal{X} \to \mathbb{R}$ denote a model or data-generating function, where $x = (x_j, x_{-j})$, with $x_j$ indicating the feature of interest and $x_{-j}$ the remaining components. The marginal feature effect of variable $x_j$ at value $v$ is defined as

$$m_j(v) = \mathbb{E}_{x_{-j} \mid x_j = v}\left[\, f(v, x_{-j}) \,\right],$$

which captures the predictive expectation for setting $x_j$ to $v$, marginalizing over the conditional distribution of the context features.
Within conjoint analysis and causal inference, the AMCE from level $a$ to level $b$ is formalized as

$$\mathrm{AMCE}_j(a, b) = \mathbb{E}_{x_{-j}}\left[\, Y(b, x_{-j}) - Y(a, x_{-j}) \,\right],$$

where $Y$ is a potential outcome or structural model, and the expectation is taken with respect to the empirical or counterfactual distribution of the context $x_{-j}$ (Thielmann et al., 11 Apr 2025). This quantity is the expected shift in predicted (or observed) outcomes from changing $x_j$ from $a$ to $b$ with all other features sampled as observed.
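The definition above can be sketched directly as a counterfactual contrast averaged over empirical context draws. The model `f`, its coefficients, and the data below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Hedged sketch: estimate the AMCE of moving x_j from level a to level b
# under a hypothetical model f, averaging over empirical draws of the
# remaining features x_{-j}.
rng = np.random.default_rng(0)

def f(xj, ctx):
    # toy outcome model with an interaction, so the effect of x_j
    # depends on the context features
    return 2.0 * xj + 0.5 * ctx + 0.3 * xj * ctx

ctx = rng.normal(size=10_000)            # empirical context draws x_{-j}
a, b = 0.0, 1.0                          # levels being contrasted
amce = np.mean(f(b, ctx) - f(a, ctx))    # E[ f(b, x_{-j}) - f(a, x_{-j}) ]
```

Because the context draws have mean approximately zero, the interaction term averages out and the estimate lands near the main-effect coefficient 2.0.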
2. AMCEs in Additive and Deep Models
Classical generalized additive models (GAMs) with link function $g$ are expressed as

$$g\!\left(\mathbb{E}[Y \mid x]\right) = \beta_0 + \sum_{j=1}^{p} f_j(x_j),$$

which yields interpretable, additive effects for each feature. In such models, with centered basis functions $f_j$ (i.e., $\mathbb{E}[f_j(x_j)] = 0$), the isolated marginal effect for feature $j$ at value $v$ is $f_j(v)$, and the AMCE between levels $a$ and $b$ is $f_j(b) - f_j(a)$. This direct interpretability is typically lost in complex, high-capacity models such as deep neural networks, where feature interactions and non-additivity obscure marginal effects (Thielmann et al., 11 Apr 2025).
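The collapse of the AMCE to a shape-function contrast in the additive regime can be verified numerically. The toy shape functions below are assumptions for illustration:

```python
import numpy as np

# Sketch (toy shape functions, not from the paper): in a purely additive
# model, the AMCE between two levels of a feature reduces to a difference
# of that feature's shape function, independent of the other features.
rng = np.random.default_rng(1)

def f1(x):                               # shape function for feature 1
    return np.sin(x)

def f2(x):                               # shape function for feature 2
    return 0.25 * x**2

x2 = rng.normal(size=5_000)              # empirical draws of feature 2
a, b = 0.0, np.pi / 2

# AMCE via counterfactual averaging over x2 ...
amce = np.mean((f1(b) + f2(x2)) - (f1(a) + f2(x2)))
# ... equals the direct shape-function contrast, since f2(x2) cancels.
direct = f1(b) - f1(a)
```

The context contribution `f2(x2)` cancels term by term, so no averaging is actually needed; this is exactly the interpretability that interactions destroy.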
Recent adaptations, such as the NAMformer—a tabular transformer network with explicit additive paths—restore the identifiability of marginal effects. Each feature $x_j$ is encoded via an uncontextualized embedding $e_j = E_j(x_j)$, passed through a shallow MLP (shape function) $s_j$, with the model structure:

$$\hat{y} = h(x) + \sum_{j=1}^{p} s_j(e_j).$$

Here, $h(x)$ incorporates context via contextualized embeddings, but $s_j(e_j)$ depends solely on $x_j$ via its embedding (Thielmann et al., 11 Apr 2025). After mean-centering, $s_j(e_j(v))$ provides the marginal effect for $x_j = v$.
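A minimal numpy sketch of this additive architecture follows. All shapes, weights, and the context head are illustrative stand-ins, not the NAMformer implementation:

```python
import numpy as np

# Assumed toy dimensions: p features, d-dim embeddings, MLP width hdim.
rng = np.random.default_rng(2)
p, d, hdim = 3, 4, 8

E = rng.normal(size=(p, d))              # per-feature embedding directions
W1 = rng.normal(size=(p, d, hdim))       # shape-function hidden weights
W2 = rng.normal(size=(p, hdim))          # shape-function output weights

def embed(j, xj):
    return xj * E[j]                     # uncontextualized embedding e_j

def shape(j, xj):
    # shallow MLP s_j applied to the uncontextualized embedding
    return np.tanh(embed(j, xj) @ W1[j]) @ W2[j]

def context_head(x):
    # stand-in for the contextual branch h(x); sees all features
    return 0.1 * np.prod(np.tanh(x))

def predict(x):
    # y_hat = h(x) + sum_j s_j(e_j)
    return context_head(x) + sum(shape(j, x[j]) for j in range(p))
```

The additive path exposes each $s_j$ separately, so it can be queried in isolation after training, which is what enables the direct extraction described in the next section.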
3. Extraction and Computation of AMCEs
After model training, computation of the marginal effect curve for feature $x_j$ proceeds as follows:
- Select a grid $\{v_1, \dots, v_K\}$ spanning the support of $x_j$.
- For each $v_k$, compute the uncontextualized embedding $e_j(v_k)$, then the shape output $s_j(e_j(v_k))$.
- The function $\hat{m}_j(v) = s_j(e_j(v))$ describes the estimated marginal effect curve.
The empirical AMCE is calculated as

$$\widehat{\mathrm{AMCE}}_j(a, b) = s_j(e_j(b)) - s_j(e_j(a)).$$

This mechanism yields AMCEs directly as differences in shape-function outputs, requiring no post-hoc intervention or bespoke estimation procedures (Thielmann et al., 11 Apr 2025). If $x_j$ is continuous, finite differences of the shape function approximate the instantaneous marginal effect.
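The extraction recipe above can be sketched as follows, assuming a trained, mean-centered shape function for feature $j$ is available; `shape_j` below is a placeholder standing in for $s_j(e_j(\cdot))$:

```python
import numpy as np

def shape_j(v):
    return np.sin(v)                     # placeholder trained shape function

grid = np.linspace(-2.0, 2.0, 101)       # grid spanning the support of x_j
curve = shape_j(grid)                    # estimated marginal effect curve

a, b = -1.0, 1.0
amce_hat = shape_j(b) - shape_j(a)       # empirical AMCE as a shape contrast

# For continuous x_j, finite differences approximate the instantaneous effect:
step = grid[1] - grid[0]
slope = np.diff(curve) / step            # ~ d m_j / d v along the grid
```

No backward passes, interventions, or refits are involved: the curve and the AMCE are read off from forward evaluations of the shape head alone.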
4. Theoretical Guarantees and Identifiability
The identifiability of AMCEs in the NAMformer is ensured algorithmically by employing independent dropout across the per-feature shape networks during training. With dropout masks $\delta = (\delta_0, \delta_1, \dots, \delta_p)$ randomly masking each $s_j$ and the context head $h$, the risk decomposes over the mask patterns as:

$$\mathcal{R} = \mathbb{E}_{\delta}\, \mathbb{E}_{x, y}\!\left[\,\ell\!\left(y,\; \delta_0\, h(x) + \sum_{j=1}^{p} \delta_j\, s_j(e_j)\right)\right].$$

For a convex loss $\ell$, the following bound guarantees recovery:

$$\mathcal{R} \;\ge\; \sum_{j=1}^{p} \pi_j\, \mathbb{E}_{x, y}\!\left[\,\ell\!\left(y,\, s_j(e_j)\right)\right],$$

where $\pi_j$ is the probability that only $s_j$ is active. Because $\pi_j > 0$ during training, each $s_j$ converges in population risk to the true conditional mean $\mathbb{E}[y \mid x_j]$, thereby restoring isolated marginal effects (Thielmann et al., 11 Apr 2025).
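The decomposition over mask patterns can be checked numerically on a toy setup. The model, candidate shape heads, and keep-rate below are illustrative assumptions:

```python
import numpy as np
from itertools import product

# Toy check: averaging the squared loss over independent Bernoulli dropout
# masks equals the probability-weighted sum of per-pattern losses.
rng = np.random.default_rng(3)
x = rng.normal(size=(2_000, 2))
y = 2.0 * x[:, 0] + np.sin(x[:, 1])      # assumed additive ground truth

def s1(v): return 1.8 * v                # candidate shape head for x_1
def s2(v): return np.sin(v)              # candidate shape head for x_2

keep = 0.5                               # per-head keep probability
risk = 0.0
for d1, d2 in product([0, 1], repeat=2):
    prob = (keep if d1 else 1 - keep) * (keep if d2 else 1 - keep)
    pred = d1 * s1(x[:, 0]) + d2 * s2(x[:, 1])
    risk += prob * np.mean((y - pred) ** 2)

# The pattern (d1, d2) = (1, 0), with weight pi_1 = keep * (1 - keep),
# contributes pi_1 * E[(y - s_1(x_1))^2]; that term is minimized exactly at
# s_1(x_1) = E[y | x_1], which is the mechanism behind the bound above.
```

Because the pattern with only $s_1$ active carries strictly positive weight, training cannot dump all of the $x_1$ signal into the context head without paying a risk penalty.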
5. Applications and Methodological Impact
AMCEs furnish an interpretable and theoretically sound measure for evaluating the marginal importance of features in both classical and contemporary high-capacity models. In tabular transformer architectures such as the NAMformer, AMCE computation is intrinsic to the model structure, supporting transparent analysis and causal attribution. This approach circumvents the opacity of general black-box models while preserving competitive predictive performance, effectively bridging two major paradigms in statistical learning: interpretability and accuracy (Thielmann et al., 11 Apr 2025).
AMCEs are particularly pivotal in conjoint analysis, policy evaluation, and any domain requiring principled assessment of intervention effects at the population level. The ability to efficiently extract marginal effect estimates within transformer-based frameworks represents a significant advance for interpretable AI in tabular settings.
6. Connections to Other Marginal Effect Formalisms
AMCEs closely relate to other measures of marginal effects, such as average treatment effects (ATE) in causal inference, partial dependence plots (PDPs) in machine learning, and marginal effect curves in GAMs. In the additive regime, these concepts coincide; in interaction-heavy models, AMCEs provide an average over observed distributions, maintaining the precise interpretability needed for rigorous analysis. The NAMformer directly recovers AMCEs in a manner analogous to classical GAM-based or causal estimators but embedded within a high-performance, context-aware architecture (Thielmann et al., 11 Apr 2025).
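The coincidence with partial dependence can be made concrete: for any model $f$, the AMCE between two levels equals the difference of the partial-dependence function evaluated at those levels under the empirical (marginal) context distribution. The toy model below is an assumed example, deliberately non-additive:

```python
import numpy as np

rng = np.random.default_rng(4)
ctx = rng.normal(size=5_000)             # empirical context draws

def f(v, ctx):
    # non-additive toy model: includes a v * ctx interaction
    return np.exp(0.3 * v) + v * ctx + np.cos(ctx)

def pdp(v):
    return np.mean(f(v, ctx))            # partial dependence at x_j = v

a, b = 0.0, 1.0
amce = np.mean(f(b, ctx) - f(a, ctx))    # average marginal component effect
# amce coincides with pdp(b) - pdp(a) by linearity of the expectation
```

Even with interactions present, the two agree as population averages; what varies is only the instance-level effect, which still depends on the context.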