Sparse Deep Additive Model with Interactions

Updated 30 September 2025
  • The SDAMI framework combines sparse additive modeling and deep neural networks to isolate main effects and interactions in high-dimensional regression.
  • Its two-stage process leverages marginal screening and structured regularization to achieve both interpretability and high predictive accuracy.
  • Applications in neuroscience and medical diagnostics demonstrate SDAMI's effectiveness in identifying key variables and complex interactions.

The Sparse Deep Additive Model with Interactions (SDAMI) is a statistical learning framework that combines the interpretability and sparsity of additive models with the representational flexibility of deep neural networks, while explicitly disentangling main effects from interaction effects in high-dimensional regression. SDAMI operates under the principle that relevant interactions leave detectable marginal footprints and deploys a two-stage strategy that uses sparsity-driven variable screening, structured regularization, and modular neural subnetworks to achieve both high predictive accuracy and interpretability across settings with limited samples and large feature sets (Hung et al., 27 Sep 2025).

1. Structural Decomposition and Motivation

SDAMI seeks to address the challenges posed by "small $n$, large $p$" data, where complex nonlinear dependencies must be modeled in a form that remains transparent and sparse. The regression function is decomposed as

$$Y_i = \sum_{j \in \mathcal{M}} f_j(X_{ij}) + f(\mathbf{X}_{i,\mathcal{I}}) + \varepsilon_i,$$

where $\mathcal{M}$ indexes main effects, $\mathcal{I}$ the set of variables appearing primarily in interactions, and $f_j$, $f$ are nonlinear component functions. Each selected main effect $j$ is assigned a dedicated subnetwork approximating $f_j(\cdot)$, while interaction subnetworks are constructed only for those groups $\mathcal{I}$ where the data justify such complexity. This contrasts with conventional deep models, whose functional entanglement precludes clear attribution of variable influence.

The architecture is modular: all subnetworks are learned jointly but operate on disjoint low-dimensional projections of the input, supporting both scalability and interpretability. The model thus balances expressivity (by allowing nonlinear subnetworks) and parsimony (via sparsity constraints and effect disentanglement).
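
As an illustration of this modularity, the following is a minimal sketch assuming a PyTorch-style implementation; layer widths, activations, and the example index sets are illustrative assumptions, not the paper's specification:

```python
import torch
import torch.nn as nn

class SubNet(nn.Module):
    """Small MLP approximating a single component function f_j (or f_I)."""
    def __init__(self, in_dim, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

class SDAMINet(nn.Module):
    """Additive predictor: one univariate subnetwork per selected main effect,
    plus one multivariate subnetwork over the selected interaction variables."""
    def __init__(self, main_idx, inter_idx):
        super().__init__()
        self.main_idx = list(main_idx)    # estimated main effects (hat-M)
        self.inter_idx = list(inter_idx)  # interaction/footprint variables (hat-I)
        self.main_nets = nn.ModuleList([SubNet(1) for _ in self.main_idx])
        self.inter_net = SubNet(len(self.inter_idx)) if self.inter_idx else None

    def forward(self, X):
        # Each subnetwork sees only its own low-dimensional slice of the input.
        out = sum(net(X[:, [j]]) for net, j in zip(self.main_nets, self.main_idx))
        if self.inter_net is not None:
            out = out + self.inter_net(X[:, self.inter_idx])
        return out.squeeze(-1)

# Hypothetical usage: main effects on X_0 and X_3, interactions among X_5 and X_7.
model = SDAMINet(main_idx=[0, 3], inter_idx=[5, 7])
```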

2. Effect Footprint and Marginal Screening Principle

Central to SDAMI is the effect footprint concept. Even when a variable enters only through interactions, it typically leaves a marginal projection:

$$m_k(x) = \mathbb{E}\left[f(\mathbf{X}_{\mathcal{I}}) \mid X_k = x\right].$$

If $m_k(x)$ is nonconstant, a marginal effect manifests; this property is exploited to detect both main effects and interaction-only variables. In SDAMI's first stage, a sparse additive screening (e.g., via SpAM) identifies all variables with a nonzero marginal signal, producing an active set $\widehat{\mathcal{S}}$ that contains both genuine main effects and variables active solely through interactions.

This leverages a key property: higher-order interactions typically project residual signal onto univariate marginals, providing a theoretically justified screen for later refinement.
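
A simplified illustration of the footprint screen follows; it is not the SpAM backfitting algorithm itself, but a crude stand-in in which each variable's marginal signal is approximated by a low-degree polynomial fit. The polynomial degree and retention threshold are illustrative assumptions:

```python
import numpy as np

def footprint_screen(X, y, degree=3, threshold=0.05):
    """Return indices of variables whose marginal fit explains a
    non-negligible share of the response variance (the 'footprint')."""
    n, p = X.shape
    active = []
    for j in range(p):
        # Low-degree polynomial as a stand-in for a spline-based marginal smoother.
        coefs = np.polyfit(X[:, j], y, deg=degree)
        fitted = np.polyval(coefs, X[:, j])
        # Share of response variance captured by the marginal fit of X_j.
        r2 = 1.0 - np.var(y - fitted) / np.var(y)
        if r2 > threshold:
            active.append(j)
    return active
```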

3. Two-Stage Estimation and Variable Partitioning

SDAMI estimation comprises:

  1. Effect Footprint Screening: Fit a sparse additive model to all univariate components. Variables with nontrivial fitted functions are included in the active set $\widehat{\mathcal{S}}$.
  2. Partitioning and Regularization: Within $\widehat{\mathcal{S}}$, variables are partitioned into estimated main effects $\widehat{\mathcal{M}}$ and footprint variables $\widehat{\mathcal{I}}$ (variables that manifest only through interactions). A structured regularization, typically a group lasso with basis expansion, is then applied:

$$\min_\theta \frac{1}{n} \sum_{i=1}^n \left[ Y_i - \sum_j \mathrm{NN}^{(j)}(X_{ij}; \theta_j) - \mathrm{NN}^{(\mathcal{I})}(\mathbf{X}_{i,\mathcal{I}}; \theta_{\mathcal{I}}) \right]^2,$$

subject to layerwise penalties:

$$\|W^{(1)}_{m,j}\|_\infty \leq \kappa_m \|f_j\|, \qquad \|W^{(1)}_{\mathcal{I},j}\|_\infty \leq \kappa_{\mathcal{I}} \|f_{\mathcal{I}}\|.$$

Here, $\mathrm{NN}^{(j)}(\cdot)$ denotes the subnetwork for $f_j$, and $\mathrm{NN}^{(\mathcal{I})}(\cdot)$ the multivariate subnetwork for the selected interaction variables. Penalty parameters $\lambda_1, \lambda_2$ controlling main-effect and interaction sparsity are optimized, for example via Mallows' $C_p$ or cross-validation.

Group-lasso-like norm constraints act hierarchically: vanishing $L_2$ (or functional) norms prune irrelevant subnetworks, yielding both sparsity and interpretability.
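
To make the penalized objective concrete, here is a hedged sketch that reuses the SDAMINet sketch above; it replaces the paper's norm constraints with an additive group-lasso-style penalty on each subnetwork's first-layer weights, and the penalty form and optimizer are assumptions, with λ1 and λ2 playing the roles of the main-effect and interaction tuning parameters:

```python
import torch

def group_penalty(model, lam_main, lam_inter):
    """Group-lasso-style penalty on the first-layer weights of each subnetwork;
    driving a group norm to zero effectively prunes that subnetwork."""
    pen = torch.zeros(())
    for net in model.main_nets:
        pen = pen + lam_main * net.net[0].weight.norm(p=2)
    if model.inter_net is not None:
        pen = pen + lam_inter * model.inter_net.net[0].weight.norm(p=2)
    return pen

def fit(model, X, y, lam_main=1e-3, lam_inter=1e-2, epochs=500, lr=1e-2):
    """Minimize mean squared error plus the structured penalty."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.mean((y - model(X)) ** 2) + group_penalty(model, lam_main, lam_inter)
        loss.backward()
        opt.step()
    return model
```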

4. Subnetwork Construction and Modular Regularization

Each main effect and each identified interaction group, if justified by the data, is assigned a neural subnetwork. For variable $j$:

  • If $j \in \widehat{\mathcal{M}}$, construct a univariate subnetwork approximating $f_j(\cdot)$.
  • If $j \in \widehat{\mathcal{I}}$, i.e., the variable is involved only in non-additive effects, include it in the multivariate interaction network.

This additive-modular approach enables each $f_j$ and $f_{\mathcal{I}}$ to be visualized or inspected directly, conferring interpretability unattainable with generic DNNs. Pruning, regularization, and the two-step estimation ensure that only functionally relevant subnetworks remain active.
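
Because each fitted $f_j$ lives in its own subnetwork, it can be plotted directly over a grid of input values, which is one concrete way the modularity yields interpretability. Below is a sketch assuming the SDAMINet above and matplotlib; the grid bounds are illustrative:

```python
import torch
import matplotlib.pyplot as plt

def plot_main_effect(model, pos, x_min=-3.0, x_max=3.0, n_grid=200):
    """Plot the fitted component function of the pos-th selected main effect."""
    grid = torch.linspace(x_min, x_max, n_grid).unsqueeze(1)  # shape (n_grid, 1)
    with torch.no_grad():
        f_hat = model.main_nets[pos](grid).squeeze(-1)
    plt.plot(grid.squeeze(-1).numpy(), f_hat.numpy())
    plt.xlabel(f"X_{model.main_idx[pos]}")
    plt.ylabel("estimated f_j")
    plt.show()
```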

5. Adaptive Regularization and Penalty Selection

Structured regularization is deployed using norm-based constraints and/or group-lasso-like penalties. For the first-layer weights $W^{(1)}_{m,j}$ of the $j$th main-effect subnetwork, the constraint takes the form

$$\| W_{m,j}^{(1)} \|_\infty \leq \kappa_m \| f_j \|,$$

with $\kappa_m$ tuned to balance shrinkage and approximation power. Main-effect and interaction subnetworks are regularized with independent hyperparameters, often optimized by cross-validation or Mallows' $C_p$ criterion:

  • $\lambda_1$: penalizes the complexity (or number) of main-effect subnetworks.
  • $\lambda_2$: penalizes higher-order interactions.

This structure ensures that only variables or interactions justified by the data are selected.
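
One plausible way to select the two penalty levels is a grid search with K-fold cross-validation. The sketch below reuses the SDAMINet and fit helpers sketched earlier; the grid values and fold count are illustrative rather than prescribed by the paper:

```python
import numpy as np
import torch

def select_penalties(X, y, main_idx, inter_idx,
                     grid=(1e-4, 1e-3, 1e-2, 1e-1), k=5):
    """Pick (lambda_1, lambda_2) minimizing K-fold validation MSE."""
    n = X.shape[0]
    folds = np.array_split(np.random.permutation(n), k)
    best, best_err = None, float("inf")
    for lam1 in grid:
        for lam2 in grid:
            errs = []
            for val_idx in folds:
                train_idx = np.setdiff1d(np.arange(n), val_idx)
                model = SDAMINet(main_idx, inter_idx)
                fit(model, X[train_idx], y[train_idx],
                    lam_main=lam1, lam_inter=lam2)
                with torch.no_grad():
                    err = torch.mean((y[val_idx] - model(X[val_idx])) ** 2)
                errs.append(err.item())
            if np.mean(errs) < best_err:
                best, best_err = (lam1, lam2), float(np.mean(errs))
    return best
```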

6. Simulation Studies and Real-World Applications

Extensive simulation studies evaluate scenarios including strong main effects, combined main and interaction effects, varying sparsity levels, and different sample sizes. SDAMI consistently recovers the true signal structure with high true positive rates and low false positive rates across regimes.

In neuroscience, SDAMI was tested on fMRI data from the visual cortex, where tens of thousands of Gabor-filtered features were present. The model identified spatial and orientation variables relevant for main effects, and further revealed meaningful variable combinations in interaction subnetworks, with visualized effect functions providing interpretable neuroscientific insight. In medical diagnostics (e.g., diabetes progression), SDAMI achieved superior prediction performance (MSE, $R^2$) compared to classical DNNs and other sparse alternatives, while correctly identifying minimal variable subsets and interactions.

7. Theoretical Underpinnings and Implications

SDAMI methodology is theoretically grounded in the effect footprint property: marginal projections of interaction terms are sufficient for initial variable screening, which is then refined via sparsity-inducing regularization. The approach leverages insights from minimax detection boundaries (Gayraud et al., 2010), adaptive group lasso theory, and hierarchical regularization frameworks, ensuring robustness even in regimes with high ambient dimensionality and negligible main effects.

A plausible implication is that SDAMI architectures provide a principled route to interpretable deep prediction in scientific domains where understanding specific variable effects and their conditional dependencies is as critical as achieving high predictive performance. By enforcing structured sparsity and modularization, SDAMI enables deep models to operate under statistical guarantees typically associated with classical additive models and variable selection approaches, while offering superior function approximation capabilities.

Summary Table: Key Components of SDAMI

| Component | Purpose | Methodological Detail |
|---|---|---|
| Effect footprint | Marginal screen for all impactful variables | $m_k(x) = \mathbb{E}[f(\mathbf{X}_{\mathcal{I}}) \mid X_k = x]$ |
| Two-stage procedure | Identify and partition the active set | Footprint screening $\rightarrow$ group-lasso refinement |
| Modular subnetworks | Isolate and regularize each component | Dedicated NNs per main effect and interaction group |
| Structured regularization | Induce and maintain sparsity | Norm/lasso constraints on first-layer weights |
| Interpretability | Direct effect visualization | Subnetwork estimates of $f_j$ and $f_{\mathcal{I}}$ can be plotted |

In conclusion, Sparse Deep Additive Models with Interactions represent a convergence of deep learning, high-dimensional sparse estimation, and interpretable statistical modeling, offering a scalable and theoretically motivated solution for complex regression problems where both accuracy and transparent variable importance are paramount (Hung et al., 27 Sep 2025).
