Multivariate Adaptive Regression Splines (MARS)
- Multivariate Adaptive Regression Splines (MARS) is a nonparametric method that builds regression models using adaptive, piecewise-linear basis functions to capture nonlinearities and interactions.
- It employs a forward–backward stepwise algorithm to add and prune hinge functions, balancing model fit and complexity via a generalized cross-validation (GCV) criterion.
- Extensions of MARS include Bayesian variants, fairness-aware algorithms, and hybrid models like SMART that handle discontinuities and high-dimensional challenges.
Multivariate Adaptive Regression Splines (MARS) is a nonparametric regression methodology that adaptively approximates unknown multivariate functions using piecewise-linear (hinge) basis functions. Introduced by Friedman (1991), MARS is constructed via a forward–backward stepwise procedure, generating models of the form

$$\hat{f}(\mathbf{x}) = \beta_0 + \sum_{m=1}^{M} \beta_m B_m(\mathbf{x}),$$

where each basis function $B_m$ is either a univariate hinge or a product of such hinges, allowing explicit modeling of nonlinearities and interactions. MARS underlies numerous extensions, including scalable Bayesian variants, fairness-aware algorithms, and recent hybrid models that address challenges of discontinuities and high-dimensional interactions (Pattie et al., 8 Oct 2024, Haghighat et al., 23 Feb 2024, Liu et al., 2023, Rumsey et al., 2023).
1. Core Model Structure and Basis Expansion
The MARS model represents a scalar response as a sum of adaptively chosen basis functions:

$$f(\mathbf{x}) = \beta_0 + \sum_{m=1}^{M} \beta_m B_m(\mathbf{x}),$$

where $\beta_0$ is the global intercept and each $B_m$ is either a univariate hinge or a low-degree interaction of such hinges (Pattie et al., 8 Oct 2024, Haghighat et al., 23 Feb 2024, Rumsey et al., 2023). Hinge functions are defined as the reflected pair

$$(x - t)_+ = \max(0,\, x - t), \qquad (t - x)_+ = \max(0,\, t - x),$$

with knot location $t$.
For interaction modeling, MARS permits basis functions of the form

$$B_m(\mathbf{x}) = \prod_{k \in \mathcal{K}_m} \big[s_k\,(x_{v(k)} - t_k)\big]_+,$$

where $\mathcal{K}_m$ is an index set with cardinality equal to the interaction degree, $v(k)$ selects a predictor, $t_k$ is a knot, and $s_k \in \{-1, +1\}$ is a sign (Pattie et al., 8 Oct 2024, Liu et al., 2023).
By multiplying hinge functions, MARS constructs piecewise-linear surfaces capable of capturing abrupt changes in slope and localized nonlinear effects.
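To make the basis expansion concrete, the following minimal sketch (plain NumPy; the intercept, coefficients, knots, and term structure are illustrative, not fitted) evaluates a MARS-style model with one main effect and one degree-two interaction term. Each term is a (coefficient, factor-list) pair, so the interaction degree is simply the length of the factor list.

```python
import numpy as np

def hinge(x, t, sign=+1):
    """Reflected hinge pair: sign=+1 gives (x - t)_+, sign=-1 gives (t - x)_+."""
    return np.maximum(0.0, sign * (x - t))

def mars_predict(X, intercept, terms):
    """Evaluate f(x) = beta_0 + sum_m beta_m * prod_k hinge(x[v_k], t_k, s_k).

    `terms` is a list of (beta, [(var, knot, sign), ...]) pairs, where the
    inner list has length equal to the interaction degree of the term.
    """
    y = np.full(X.shape[0], float(intercept))
    for beta, factors in terms:
        b = np.ones(X.shape[0])
        for var, knot, sign in factors:
            b *= hinge(X[:, var], knot, sign)
        y += beta * b
    return y

# Illustrative model: one main effect in x0 and one x0-x1 interaction.
X = np.random.default_rng(0).uniform(0, 1, size=(5, 2))
terms = [
    (2.0, [(0, 0.3, +1)]),                 # 2.0 * (x0 - 0.3)_+
    (-1.5, [(0, 0.3, +1), (1, 0.6, -1)]),  # -1.5 * (x0 - 0.3)_+ * (0.6 - x1)_+
]
print(mars_predict(X, intercept=0.5, terms=terms))
```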
2. Forward–Backward Adaptive Algorithm
MARS fitting employs a greedy, two-stage procedure (Pattie et al., 8 Oct 2024, Haghighat et al., 23 Feb 2024, Liu et al., 2023, Rumsey et al., 2023):
- Forward Pass: Begin with the intercept. At each step, search over all combinations of existing basis functions, variables, and candidate knots, adding the pair of reflected hinge functions (or its product with an existing basis function, up to the permitted interaction degree) that most reduces the residual sum of squares (RSS). This process continues until a maximal number of terms is reached or further improvement is deemed negligible.
- Backward Pruning: The overfitted forward model is regularized via backward elimination. At each iteration, basis functions are deleted one at a time, with the term whose removal minimizes a generalized cross-validation (GCV) score being purged. The GCV criterion is typically

$$\mathrm{GCV}(M) = \frac{\frac{1}{n} \sum_{i=1}^{n} \big(y_i - \hat{f}_M(\mathbf{x}_i)\big)^2}{\big(1 - C(M)/n\big)^2},$$

where $C(M)$ is the effective degrees of freedom, generally $C(M) = M + d\,K$ with $K$ the number of knots and a penalty of $d \approx 2$–$3$ per knot (Haghighat et al., 23 Feb 2024, Rumsey et al., 2023).
This two-stage construction ensures adaptivity, parsimony, and a balance between fit and complexity.
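The following self-contained sketch implements both stages for the additive (degree-one) case, with candidate knots taken at observed data values; the term budget and penalty constant are illustrative, and the brute-force candidate search stands in for the fast least-squares updates of production implementations.

```python
import numpy as np

def hinge_pair(x, t):
    """Columns for the reflected pair (x - t)_+ and (t - x)_+."""
    return np.maximum(0.0, x - t), np.maximum(0.0, t - x)

def rss(y, B):
    """Residual sum of squares of the least-squares fit of y on columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return float(np.sum((y - B @ coef) ** 2))

def gcv(y, B, d=3.0):
    """GCV with effective dof C(M) = M + d * K, K ~ (M - 1)/2 knots (illustrative)."""
    n, M = B.shape
    C = M + d * (M - 1) / 2.0
    return rss(y, B) / n / (1.0 - C / n) ** 2

def fit_mars_additive(X, y, max_terms=11):
    n, p = X.shape
    B = np.ones((n, 1))                     # start from the intercept
    # Forward pass: greedily add the reflected hinge pair that best cuts RSS.
    while B.shape[1] + 2 <= max_terms:
        best = None
        for j in range(p):
            for t in X[:, j]:               # candidate knots at data values
                h1, h2 = hinge_pair(X[:, j], t)
                cand = np.column_stack([B, h1, h2])
                r = rss(y, cand)
                if best is None or r < best[0]:
                    best = (r, cand)
        B = best[1]
    # Backward pruning: drop single terms while GCV keeps improving.
    while B.shape[1] > 1:
        drops = [gcv(y, np.delete(B, k, axis=1)) for k in range(1, B.shape[1])]
        k = int(np.argmin(drops)) + 1       # never drop the intercept
        if drops[k - 1] >= gcv(y, B):
            break
        B = np.delete(B, k, axis=1)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B, coef

rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 3))
y = 2.0 * np.maximum(0.0, X[:, 0] - 0.5) + 0.05 * rng.standard_normal(100)
B, coef = fit_mars_additive(X, y)
print(B.shape[1], "basis columns retained")
```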
3. Extensions: High-Dimensional Settings and Sufficient Dimension Reduction
With increased predictor dimensionality and higher-order interactions, the combinatorial explosion of candidate hinge products presents computational and statistical challenges (Liu et al., 2023). For a maximal interaction order $d$, the space of possible basis functions scales as $O\big((pK)^{d}\big)$, where $p$ is the number of predictors and $K$ is the number of candidate knots per variable, yielding low efficiency for large $p$ or $d$.
Dimension reduction strategies, such as drMARS, exploit the calculation of function gradients via the MARS expansion to estimate effective low-dimensional subspaces through the average gradient outer-product matrix

$$\hat{\Lambda} = \frac{1}{n} \sum_{i=1}^{n} \nabla \hat{f}(\mathbf{x}_i)\, \nabla \hat{f}(\mathbf{x}_i)^{\top},$$

where the leading eigenvectors of $\hat{\Lambda}$ specify a subspace containing the relevant variation for $f$. Projecting to this subspace and refitting MARS in the reduced space yields the optimal minimax rate when the function depends on only $d_0 \ll p$ directions, mitigating the curse of dimensionality and improving generalization (Liu et al., 2023).
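Because hinge derivatives are piecewise-constant, the gradient of a fitted MARS expansion is available in closed form. The sketch below (term representation as in the Section 1 sketch; the fitted surface is illustrative and varies only along the first coordinate) forms $\hat{\Lambda}$ and extracts its leading eigenvectors.

```python
import numpy as np

def mars_gradient(X, terms, p):
    """Gradient of f(x) = sum_m beta_m * prod_k [s_k (x_{v(k)} - t_k)]_+ .

    Each hinge has derivative s_k * 1{s_k (x - t_k) > 0}; the product rule
    handles interaction terms. `terms` uses the (beta, [(var, knot, sign),
    ...]) representation from the Section 1 sketch.
    """
    n = X.shape[0]
    G = np.zeros((n, p))
    for beta, factors in terms:
        vals = [np.maximum(0.0, s * (X[:, v] - t)) for v, t, s in factors]
        for k, (v, t, s) in enumerate(factors):
            deriv = s * (s * (X[:, v] - t) > 0)
            others = (np.prod([vals[j] for j in range(len(vals)) if j != k], axis=0)
                      if len(vals) > 1 else np.ones(n))
            G[:, v] += beta * deriv * others
    return G

def gradient_subspace(X, terms, p, dim):
    """Leading eigenvectors of the average gradient outer-product matrix."""
    G = mars_gradient(X, terms, p)
    Lam = G.T @ G / X.shape[0]
    _, eigvecs = np.linalg.eigh(Lam)          # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]          # top-`dim` directions

# Illustrative fit that varies only along the first coordinate: the single
# estimated direction should align with the first standard basis vector.
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(500, 4))
terms = [(1.0, [(0, 0.3, +1)]), (0.5, [(0, -0.2, -1)])]
print(gradient_subspace(X, terms, p=4, dim=1).ravel())
```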
4. Bayesian and Generalized Bayesian MARS
The Bayesian MARS (BMARS) framework recasts MARS model selection in a hierarchical Bayesian context. Priors are placed on the number and structure of basis functions, coefficients, and error variance; a typical specification is

$$M \sim \mathrm{Poisson}(\lambda), \qquad \boldsymbol{\beta} \mid M, \sigma^2 \sim \mathcal{N}\big(\mathbf{0},\, \sigma^2 \tau^{-1} \mathbf{I}\big), \qquad \sigma^2 \sim \mathrm{Inv\text{-}Gamma}(a, b),$$

with reversible-jump MCMC moves for basis-function birth, death, and mutation (Rumsey et al., 2023).
Generalized Bayesian MARS (GBMARS) extends BMARS to encompass general error distributions by modeling residuals as Normal–variance mixtures, yielding generalized hyperbolic (GH) error models (including Gaussian, Student-$t$, asymmetric Laplace, and Normal–Inverse-Gaussian as special cases). The GBMARS likelihood is thus highly flexible and supports robust, quantile, and heteroscedastic regression; in the mixture representation,

$$\epsilon_i \mid w_i \sim \mathcal{N}\big(\gamma w_i,\, \sigma^2 w_i\big), \qquad w_i \sim \mathrm{GIG}(\,\cdot\,),$$

so that the marginal error law is generalized hyperbolic. Evaluations show that GBMARS provides calibrated uncertainty quantification and strong predictive performance in settings with outliers, skew, or heavy-tailed errors (Rumsey et al., 2023).
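The mixture construction is easy to simulate: draw the mixing variable from a generalized inverse Gaussian (GIG) law, then condition a Gaussian on it. The sketch below assumes SciPy's geninvgauss parameterization; the GIG and skewness parameters are illustrative.

```python
import numpy as np
from scipy import stats

def gh_errors(n, p=-0.5, b=1.0, gamma=0.5, sigma=1.0, seed=None):
    """Generalized hyperbolic draws via a normal mean-variance mixture:
    eps = gamma * w + sigma * sqrt(w) * z, with w ~ GIG(p, b), z ~ N(0, 1).

    With p = -1/2 the mixing law is inverse Gaussian, giving NIG errors;
    other parameter regimes recover Gaussian, Student-t, or asymmetric
    Laplace behavior as (limiting) special cases.
    """
    rng = np.random.default_rng(seed)
    w = stats.geninvgauss(p, b).rvs(size=n, random_state=rng)
    z = rng.standard_normal(n)
    return gamma * w + sigma * np.sqrt(w) * z

eps = gh_errors(10_000, gamma=0.8, seed=0)
print(f"skewed, heavy-tailed residuals: mean={eps.mean():.2f}, sd={eps.std():.2f}")
```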
5. Fairness, Transparency, and Interpretability
MARS is inherently interpretable since each basis function corresponds to an explicit piecewise-linear rule, making variable selection, interaction structure, and threshold/knot placements transparent (Haghighat et al., 23 Feb 2024). Recent work integrates fairness constraints directly into basis selection, notably in fairMARS, where the standard RSS-based gain criterion in the forward stage is augmented with a penalty for subgroup disparities, scoring candidates by a criterion of the form

$$\mathrm{score} = \mathrm{RSS} + \lambda\, D(\text{model}),$$

where $D$ quantifies predictive disparity across protected subgroups and $\lambda$ sets the tradeoff. This enforces accuracy-fairness tradeoffs during model construction, making MARS suitable for responsible, group-regularized modeling (Haghighat et al., 23 Feb 2024).
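A generic sketch of such a criterion follows; the disparity measure (spread of group-wise mean squared residuals) and the form of the penalty are assumptions for illustration, not the exact fairMARS objective.

```python
import numpy as np

def penalized_score(resid, groups, lam=1.0):
    """RSS plus lam * n * (max - min) of group-wise mean squared residuals.

    The disparity term is a hypothetical stand-in for D(model); the exact
    fairMARS penalty may differ, but candidate basis pairs in the forward
    stage would be ranked by a score of this shape.
    """
    rss = float(np.sum(resid ** 2))
    mse = [float(np.mean(resid[groups == g] ** 2)) for g in np.unique(groups)]
    return rss + lam * len(resid) * (max(mse) - min(mse))

# Two candidate fits: one accurate on average but unequal across groups,
# one slightly worse overall but balanced; a large lam prefers the latter.
rng = np.random.default_rng(3)
groups = np.repeat([0, 1], 50)
resid_unequal = rng.normal(0.0, np.where(groups == 0, 0.5, 1.5))
resid_balanced = rng.normal(0.0, 1.0, size=100)
print(penalized_score(resid_unequal, groups, lam=2.0),
      penalized_score(resid_balanced, groups, lam=2.0))
```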
Interpretability is further enhanced by the explicit, linear form of the rules, facilitating auditability and stakeholder communication. Knot placements align with critical points in the predictor space, and the interaction structure remains easily traceable in the additive basis expansion.
6. Hybrid Models: Discontinuities and SMART
A known limitation of classical MARS is its tendency to smooth over abrupt discontinuities, as hinge bases cannot naturally capture jump behavior (Pattie et al., 8 Oct 2024). The Spline-based Multivariate Adaptive Regression Trees (SMART) framework addresses this by integrating a decision-tree partitioning mechanism that recursively segments the feature space at discontinuities, then fits MARS models within each region:

$$\hat{f}(\mathbf{x}) = \sum_{r=1}^{R} \mathbb{1}\{\mathbf{x} \in R_r\}\, \hat{f}_r(\mathbf{x}),$$

where the $R_r$ are data-driven regions and each $\hat{f}_r$ is a MARS fit on that region. Empirically, SMART recovers exact discontinuity structure and maintains spline adaptivity on continuous subregions, improving performance over both standalone MARS and tree-based models on relevant benchmarks (Pattie et al., 8 Oct 2024).
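The composition can be sketched with standard tooling: a shallow tree locates the jump, and per-leaf models handle the smooth parts. Plain linear fits stand in for the per-region MARS fits here to keep the sketch short, and the data-generating function is illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

# Response with a jump at x0 = 0 plus a smooth linear trend in x1.
rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(400, 2))
y = np.where(X[:, 0] > 0, 2.0, -1.0) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(400)

tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=40).fit(X, y)
leaf = tree.apply(X)                                  # region id per sample
models = {r: LinearRegression().fit(X[leaf == r], y[leaf == r])
          for r in np.unique(leaf)}

def smart_predict(Xnew):
    """f(x) = sum_r 1{x in R_r} f_r(x): route to a leaf, apply its model."""
    leaves = tree.apply(Xnew)
    out = np.empty(len(Xnew))
    for r, m in models.items():
        mask = leaves == r
        if mask.any():
            out[mask] = m.predict(Xnew[mask])
    return out

print("mean absolute error:", np.abs(smart_predict(X) - y).mean())
```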
7. Empirical Properties and Practical Aspects
Empirical investigations across synthetic and real-world data confirm that MARS and its variants offer a highly flexible and interpretable alternative to classic nonparametric and tree-based regression techniques. drMARS provides substantial gains in high-dimensional or subspace-structured problems; GBMARS delivers calibrated inference under non-Gaussian error distributions; SMART enables accurate modeling in the presence of discontinuities; and fairMARS ensures fidelity to equity constraints (Pattie et al., 8 Oct 2024, Haghighat et al., 23 Feb 2024, Liu et al., 2023, Rumsey et al., 2023).
Key practical recommendations include:
- Restrict the maximum number of basis functions and the interaction order (commonly capped at two or three) to preserve numerical stability, as per established heuristics.
- Select model complexity and regularization parameters via cross-validation or grid search (see the sketch after this list).
- Use gradient-based dimension reduction (drMARS) when interaction effects lie on low-dimensional subspaces.
- Apply fairness-augmented knot selection if equitable prediction is required.
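As an example of the cross-validation recommendation, the sketch below grid-searches the interaction order and GCV penalty by k-fold cross-validation. It assumes the third-party py-earth package (import name pyearth), whose Earth estimator exposes max_degree and penalty parameters; any MARS implementation with the same knobs could be substituted.

```python
import numpy as np
from sklearn.model_selection import KFold
# Third-party MARS implementation, assumed installed as py-earth:
from pyearth import Earth

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 5))
y = (np.maximum(0.0, X[:, 0] - 0.2) * np.maximum(0.0, 0.5 - X[:, 1])
     + 0.1 * rng.standard_normal(200))

best = None
for max_degree in (1, 2):            # maximal interaction order
    for penalty in (2.0, 3.0):       # per-knot GCV penalty d
        errs = []
        for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
            model = Earth(max_degree=max_degree, penalty=penalty).fit(X[tr], y[tr])
            errs.append(np.mean((model.predict(X[te]) - y[te]) ** 2))
        score = float(np.mean(errs))
        if best is None or score < best[0]:
            best = (score, dict(max_degree=max_degree, penalty=penalty))
print(best)
```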
MARS remains a foundational tool for adaptive, interpretable, and extensible regression in multivariate settings.