Local Calibration via Regression Trees

Updated 29 August 2025
  • Research demonstrates that regression trees partition feature spaces effectively, enabling local calibration adjustments that reduce systematic miscalibration errors.
  • Methods such as probability calibration trees and distributional regression trees offer practical solutions for aligning predictive outputs with observed outcomes across regions.
  • Extensions with linear aggregation, random effects, and probabilistic trees enhance scalability and precision in local calibration for diverse applications.

Local calibration via regression trees refers to methods that partition the covariate space and fit region-specific models, achieving more precise, locally adapted calibration of probabilistic or deterministic predictions. This approach contrasts with global calibration techniques that fit a single transformation or model to the entire dataset. As a result, local calibration can correct systematic regional miscalibration and improve reliability, especially in heterogeneous or complex feature spaces. Methods based on regression trees and their generalizations have been adapted for local calibration in classification, regression, uncertainty quantification, post-hoc recalibration, and simulation-based inference.

1. Fundamental Concepts and Motivation

Local calibration using regression trees exploits the ability of trees to induce data-driven partitions of the feature space. Each partition (or leaf) corresponds to a sub-region in which the relationship between model predictions (e.g., scores, probabilities, or residuals) and the true outcomes can be assumed more stable or homogeneous relative to the full distribution. This enables the use of separate calibration models or parameterizations in different regions, reducing calibration error and addressing heterogeneity not captured by a global method. The approach is motivated by the observation that model miscalibration often exhibits local structure; for example, the mapping from predicted to true probability or residual variance varies with input covariates or in specific subpopulations. Local calibration via regression trees formalizes these intuitions by learning a partition of the input space and fitting local calibrators or predictors within each region (Leathart et al., 2018, Cabezas et al., 12 Feb 2024, Bettinger et al., 30 Jan 2025, Quentin et al., 7 Feb 2025, Cabezas et al., 23 Aug 2025).

2. Probabilistic Calibration Trees and Local Classifier Calibration

Probability calibration trees extend global post-processing methods—such as Platt scaling (sigmoid calibration) and isotonic regression—by embedding logistic calibration models within the nodes of a tree induced by the covariates (Leathart et al., 2018). The main steps are:

  • Tree Induction: A tree is induced on the original features (e.g., using C4.5), splitting the space into regions where local calibration is needed.
  • Local Logistic Calibration: For each leaf node, a logistic regression model (possibly using LogitBoost and warm starts from parent nodes) is fit to adjust the base model scores into calibrated probabilities. For multiclass problems, the additive logit model $P(y = j \mid x) = \frac{\exp(F_j(x))}{\sum_{i=1}^m \exp(F_i(x))}$ is applied, optionally with a log-odds transformation of probabilities.
  • Pruning: Instead of pruning by misclassification error, splits are pruned by the root mean squared error (RMSE) of the calibrated probabilities.
  • Inference: A test instance is routed down the tree by its features to the appropriate leaf, where the local logistic calibrator is applied.

Probability calibration trees are shown to achieve lower RMSE than global Platt scaling and isotonic regression when calibrating the output of a variety of base learners, especially on heterogeneous data or with poorly calibrated base models (e.g., naive Bayes) (Leathart et al., 2018).
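
The following is a minimal Python sketch of this route-then-calibrate pattern. It uses scikit-learn's DecisionTreeClassifier and LogisticRegression as stand-ins for the C4.5 induction and LogitBoost calibrators described in the paper, assumes a binary problem with base-model scores already in hand, and uses illustrative function names rather than the authors' implementation.

```python
# Sketch: per-leaf logistic calibration of a base model's scores (binary case).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

def fit_calibration_tree(X, base_scores, y, max_leaves=8):
    """Partition the feature space with a tree, then calibrate base scores within each leaf."""
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaves).fit(X, y)
    leaves = tree.apply(X)                         # leaf index for each training point
    calibrators = {}
    for leaf in np.unique(leaves):
        idx = leaves == leaf
        p = np.clip(base_scores[idx], 1e-6, 1 - 1e-6)
        logit = np.log(p / (1 - p))                # log-odds transform of base scores
        if len(np.unique(y[idx])) < 2:
            calibrators[leaf] = None               # degenerate leaf: keep scores as-is
        else:
            calibrators[leaf] = LogisticRegression().fit(logit.reshape(-1, 1), y[idx])
    return tree, calibrators

def predict_calibrated(tree, calibrators, X, base_scores):
    """Route each instance to its leaf and apply that leaf's logistic calibrator."""
    leaves = tree.apply(X)
    out = np.array(base_scores, dtype=float)
    for i, leaf in enumerate(leaves):
        cal = calibrators.get(leaf)
        if cal is not None:
            s = np.clip(base_scores[i], 1e-6, 1 - 1e-6)
            # assumes labels are {0, 1}, so column 1 is the positive-class probability
            out[i] = cal.predict_proba([[np.log(s / (1 - s))]])[0, 1]
    return out
```

Pruning by RMSE of the calibrated probabilities, as in the original method, would be layered on top of this basic routing scheme.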

3. Local Distributional Calibration in Regression

Local calibration for regression seeks not just to align predicted quantiles with empirical quantiles, but to ensure that the predicted conditional distributions accurately reflect local uncertainty. This can be achieved either directly using trees for local partitioning or by post-hoc adjustments:

  • Distribution Calibration: The regression model outputs a predictive distribution (e.g., Gaussian with mean and variance). Calibration transforms—such as a Beta link function parametrized by a, b, c—are learned post-hoc to warp the predicted CDF into better agreement with observed targets (Song et al., 2019). The parameters of the calibrator may be fitted globally, or using a local, input-dependent mapping (e.g., a multioutput Gaussian Process on predicted mean and variance) to enable instance- or region-specific calibration.
  • Distributional Regression Trees: Recent work introduces trees that directly minimize distributional scoring rules such as the weighted interval score (WIS) or the continuous ranked probability score (CRPS) (Quentin et al., 7 Feb 2025). Here, each leaf returns a nonparametric distribution—either as a set of quantiles or as an empirical CDF—and splits are selected to optimize calibration-oriented objectives, using efficient data structures to enable scalability. Calibration can then be achieved per tree leaf, providing strong local guarantees on both quantile and distribution level calibration.

In both settings, regression trees facilitate local adaptation—either by creating the regions in which to calibrate, or by serving as the base model to be adjusted.
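
To make the split objectives concrete, the sketch below computes the sample-based CRPS of an empirical predictive distribution, the kind of leaf-level score a distributional regression tree evaluates when comparing candidate splits. The quadratic pairwise term is the naive computation that specialized data structures accelerate, and the in-sample scoring convention in `leaf_crps` is an illustrative assumption, not the cited paper's exact objective.

```python
# Sketch: CRPS of a leaf's empirical predictive distribution.
import numpy as np

def crps_empirical(sample, y):
    """CRPS of the empirical CDF of `sample` evaluated at a single observation y,
    via the energy form: E|X - y| - 0.5 * E|X - X'|."""
    sample = np.asarray(sample, dtype=float)
    term1 = np.mean(np.abs(sample - y))
    term2 = 0.5 * np.mean(np.abs(sample[:, None] - sample[None, :]))  # O(m^2) pairwise term
    return term1 - term2

def leaf_crps(leaf_targets):
    """Average CRPS when each observation in a leaf is scored against the empirical
    distribution formed by the leaf's targets (a hypothetical split criterion)."""
    return float(np.mean([crps_empirical(leaf_targets, y) for y in leaf_targets]))
```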

4. Local Calibration for Prediction Intervals and Uncertainty Quantification

Prediction intervals or credible sets require not only accurate center prediction but also precise quantification of uncertainty—ideally, intervals that have the desired (e.g., 90%) coverage locally in the covariate space. Several tree-based local calibration strategies have emerged:

  • Conformal Prediction with Trees: Conformal prediction methods can be combined with regression trees, partitioning the covariate space and assigning calibration scores in each leaf (Cabezas et al., 12 Feb 2024, Cabezas et al., 23 Aug 2025). For example, conformity scores (often $|Y - \hat{\mu}(X)|$, or a function thereof) are used to estimate local quantiles for intervals, ensuring $P[Y \in C(x) \mid X \in A] \ge 1 - \alpha$ within region $A$. Random forests or tree ensembles can further smooth the estimated cutoffs for robust coverage.
  • Empirical Constrained Optimization and Calibration: Regression trees (as a VC-subgraph class) can be used to learn both lower and upper bounds (L(x), U(x)) for the prediction interval, with the tree structure providing the locality. A margin parameter is tuned to ensure that, with high probability, the true coverage matches the target, adjusting for the empirical overfitting of the coverage constraint (Chen et al., 2021).
  • Shape Regularity and Theoretical Guarantees: The geometric shape of tree partition elements (cells) is critical. If cell diameters and volumes are well-controlled (γ-shape-regularity), then the bias/variance trade-off is optimal, and concentration bounds ensure that pointwise prediction errors and thus local calibration are sharp (Bettinger et al., 30 Jan 2025).

This class of methods provides local, finite-sample guarantees, and supports nuanced calibration in the presence of heteroscedasticity or regional complexity.
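
A minimal sketch of the first strategy (tree-partitioned conformal intervals) is given below: a regression tree is fit to conformity scores on a calibration set, a finite-sample quantile is computed within each leaf, and symmetric intervals are returned. The partitioning target, quantile handling, and function names are simplifying assumptions rather than the exact procedure of the cited papers.

```python
# Sketch: locally adaptive split-conformal intervals via a tree-based partition.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_local_conformal(mu_hat, X_cal, y_cal, alpha=0.1, max_leaves=10):
    """mu_hat: fitted point predictor (callable). Returns a partition and per-leaf cutoffs."""
    scores = np.abs(y_cal - mu_hat(X_cal))                    # conformity scores
    part = DecisionTreeRegressor(max_leaf_nodes=max_leaves).fit(X_cal, scores)
    leaves = part.apply(X_cal)
    cutoffs = {}
    for leaf in np.unique(leaves):
        s = np.sort(scores[leaves == leaf])
        n = len(s)
        # finite-sample conformal quantile within the leaf (clipped to the max score)
        k = min(n - 1, int(np.ceil((n + 1) * (1 - alpha))) - 1)
        cutoffs[leaf] = s[k]
    return part, cutoffs

def predict_interval(mu_hat, part, cutoffs, X_new):
    """Route each new point to its leaf and use that leaf's cutoff for the interval."""
    centers = mu_hat(X_new)
    q = np.array([cutoffs[leaf] for leaf in part.apply(X_new)])
    return centers - q, centers + q
```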

5. Extensions: Piecewise Linear, Embedded Random Effects, and Probabilistic Trees

Several elaborations of regression trees further enhance local calibration:

  • Linear Aggregation in Leaves: Instead of fitting constants at leaves, regression trees or BART ensembles can fit local linear models (with ridge regularization), reducing tree depth and improving smoothness and interpretability in locally linear regions (Künzel et al., 2019, Prado et al., 2020). Simulation studies and real data confirm that this approach can achieve lower RMSE and more parsimonious models; a minimal sketch follows this list.
  • Embedded Random Effects: In hierarchical or correlated data contexts, trees can be augmented with random effects per terminal node (e.g., HE-BART), enabling group-specific local calibration within each region. This allows random effect variance components to be estimated in a fully Bayesian tree ensemble, yielding sharper uncertainty quantification and improved performance compared to global mixed models (Wundervald et al., 2022).
  • Probabilistic Regression Trees and Ensembles: Instead of a hard assignment of inputs to regions, probabilistic regression trees assign soft weights to regions using a kernel or density function. Tree predictions then become a smooth mixture, providing local adaptivity to both the partition and the degree of region overlap. Ensemble methods (e.g., bagged, boosted, or Bayesian additive models) combine multiple such trees for consistency and optimal bias-variance tradeoff (Seiller et al., 20 Jun 2024).
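
Below is a minimal sketch of the linear-aggregation idea: a shallow tree defines the partition, and a ridge model replaces the constant prediction in each leaf. It is a simplified stand-in for the locally linear forests and BART variants cited above, with hyperparameters and function names chosen for illustration.

```python
# Sketch: replacing constant leaf predictions with per-leaf ridge fits.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Ridge

def fit_linear_leaf_tree(X, y, max_leaves=6, alpha=1.0):
    """Fit a shallow tree for the partition, then a Ridge model per leaf."""
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaves).fit(X, y)
    leaves = tree.apply(X)
    leaf_models = {leaf: Ridge(alpha=alpha).fit(X[leaves == leaf], y[leaves == leaf])
                   for leaf in np.unique(leaves)}
    return tree, leaf_models

def predict_linear_leaf_tree(tree, leaf_models, X_new):
    """Route each point to its leaf and apply that leaf's linear model."""
    leaves = tree.apply(X_new)
    return np.array([leaf_models[leaf].predict(x.reshape(1, -1))[0]
                     for leaf, x in zip(leaves, X_new)])
```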

6. Interpretability, Post-hoc Local Calibration, and Expanded Use Cases

The interpretability of regression trees is leveraged in both explanatory and post-hoc recalibration contexts:

  • Local Model Calibration and Explanations: Decision trees can be fit locally to approximate the predictions of a complex model (e.g., a support vector regressor), serving as interpretable and locally calibrated surrogates. Empirical evidence shows that trees outperform LIME in explanation fidelity (measured by RMSE) in local regions (Thombre, 10 Apr 2024). A sketch of this local-surrogate construction follows the list.
  • Post-hoc Local Recalibration with Trees: Local calibration error metrics (e.g., LCE) quantify region-specific calibration gaps. While kernel or feature-space neighborhoods are initially used for recalibration (e.g., LoRe), regression trees can serve as an alternative or complementary means of defining local recalibration regions, supporting more interpretable diagnosis and correction of local miscalibration (Luo et al., 2021).
  • Multivariate and Simulation-based Inference: Localized calibration methods—using trees to partition the space for computation of empirical probability integral transforms or posterior conformity scores—achieve region-specific correction of miscalibrated credible sets in high-dimensional or simulation-driven inference scenarios (Kock et al., 17 Sep 2024, Cabezas et al., 23 Aug 2025).
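
The local-surrogate construction referenced above can be sketched as follows: sample points in a neighborhood of an instance, label them with the black-box model, fit a shallow tree, and report its RMSE fidelity in that region. The Gaussian perturbation scheme and the scale parameter are assumptions made for illustration, not the cited paper's sampling design.

```python
# Sketch: a local surrogate tree for a black-box regressor.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def local_surrogate_tree(black_box, x0, scale=0.1, n_samples=500, max_depth=3, seed=0):
    """Fit an interpretable tree that mimics `black_box` in a neighborhood of x0 (1-D array)."""
    rng = np.random.default_rng(seed)
    X_local = x0 + scale * rng.normal(size=(n_samples, x0.shape[0]))  # perturb around x0
    y_local = black_box(X_local)                                      # black-box predictions
    surrogate = DecisionTreeRegressor(max_depth=max_depth).fit(X_local, y_local)
    rmse = float(np.sqrt(np.mean((surrogate.predict(X_local) - y_local) ** 2)))
    return surrogate, rmse                                            # fidelity in the local region
```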

7. Practical Considerations, Scalability, and Theoretical Guarantees

Practical and theoretical issues in local calibration via regression trees include:

  • Computational Efficiency: Recent algorithms exploit advanced data structures (min-max heaps, Fenwick trees, weight-balanced binary trees) to enable the fast computation of distributional scoring rules (WIS, CRPS) and their updates during tree splitting (Quentin et al., 7 Feb 2025). A generic sketch of one such structure follows the list.
  • Sample Complexity and Finite-sample Guarantees: The use of VC-theory, shape regularity, and empirical process bounds ensures that pointwise and uniform calibration error decays at the optimal rate (up to logarithmic factors) for Lipschitz regression functions, even with data-dependent tree partitions (Bettinger et al., 30 Jan 2025, Chen et al., 2021).
  • Open-source Software and Benchmarks: Practical deployment is facilitated by accessible implementations, such as “Rforestry” for locally linear trees (Künzel et al., 2019) and “clover” for locally calibrated conformal prediction (Cabezas et al., 12 Feb 2024), both of which are compatible with ensemble methods, standard ML libraries, and hyperparameter tuning frameworks. Experimental results document improved coverage and reduced interval length versus standard baselines.
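
For the data-structure point, the sketch below shows a generic Fenwick (binary indexed) tree supporting point updates and prefix sums in O(log n). This is a textbook structure rather than the cited implementation, but it illustrates how rank- and quantile-based split statistics can be maintained incrementally as observations are moved between candidate child nodes during split search.

```python
# Sketch: a Fenwick (binary indexed) tree for incremental prefix sums.
class Fenwick:
    def __init__(self, n):
        self.n = n
        self.tree = [0.0] * (n + 1)          # 1-indexed internal storage

    def add(self, i, delta):
        """Add `delta` at 0-based index i in O(log n)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def prefix_sum(self, i):
        """Sum of entries at indices 0..i (inclusive) in O(log n)."""
        i += 1
        s = 0.0
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s
```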

Local calibration via regression trees encompasses a diverse set of methods with strong empirical performance and theoretical foundations. By leveraging data-driven, interpretable partitioning of the feature space, these methods support flexible, accurate, and region-specific calibration for probabilistic outputs, prediction intervals, credible sets, and explanatory modeling. Trees can act as the primary means of partitioning, as local surrogates, as vehicles for post-hoc recalibration, or as the building blocks of advanced ensemble models, making them central to current developments in local calibration theory and practice.
