Tomographic Quantile Forests (TQF)
- Tomographic Quantile Forests are a nonparametric, tree-based regression method that estimates multivariate conditional distributions via directional quantile projections.
- It leverages the Cramér–Wold theorem and minimizes the sliced Wasserstein distance to capture arbitrary, nonconvex, and multimodal support shapes.
- The framework integrates an augmented QRF++ model with alternating convex optimization to deliver efficient, uncertainty-quantified predictions in multivariate regression.
Tomographic Quantile Forests (TQF) are a nonparametric, tree-based regression approach for uncertainty-quantified prediction in multivariate response problems, designed to learn and reconstruct the full conditional distribution of a vector-valued target using quantile estimation along arbitrary directions. TQF leverages all one-dimensional projections of the response, invoking the Cramér–Wold theorem to uniquely determine multivariate conditional laws, and reconstructs these distributions via sliced Wasserstein distance minimization. The framework integrates an augmented quantile forest model (“QRF++”) for efficient directional quantile regression with an alternating convex optimization procedure for distributional reconstruction, enabling flexible, nonconvex, and multimodal uncertainty representation without separately training models for each direction (Kanazawa, 18 Dec 2025).
1. Multivariate Regression and Problem Setting
Given covariates $X \in \mathbb{R}^p$ and multivariate responses $Y \in \mathbb{R}^d$, the objective is to recover the conditional law $P(Y \mid X = x)$, not merely its mean or marginal characteristics. TQF exploits the mathematical property that the laws of all projected variables $\langle u, Y \rangle$ for $u \in S^{d-1}$, the unit sphere in $\mathbb{R}^d$, determine the law of $Y$ through the Cramér–Wold device. This setup translates the multivariate distributional estimation problem into a continuum of one-dimensional quantile regression tasks, establishing the foundation for conditional distribution learning in arbitrary directions.
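For reference, the Cramér–Wold device underlying this reduction can be stated in its standard form (paraphrased, not quoted from the paper):

$$\langle u, Y_1 \rangle \overset{d}{=} \langle u, Y_2 \rangle \ \text{ for all } u \in S^{d-1} \quad \Longleftrightarrow \quad Y_1 \overset{d}{=} Y_2.$$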
2. Directional Quantile Estimation in TQF
The key modeling step is learning the conditional $\tau$-quantile function of the projected response for every direction $u \in S^{d-1}$ and quantile level $\tau \in (0,1)$, defined as:

$$q_\tau(x, u) = \inf\{\, t \in \mathbb{R} : P(\langle u, Y \rangle \le t \mid X = x) \ge \tau \,\}.$$

Training aims to minimize the pinball (check) loss,

$$\rho_\tau(r) = r\,\bigl(\tau - \mathbf{1}\{r < 0\}\bigr), \qquad r = \langle u, Y \rangle - q_\tau(x, u),$$

across augmented input data, where directional and Fourier features are incorporated to capture all-projection dependence and higher-order distributional features.
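As a concrete illustration, here is a minimal NumPy sketch of the pinball loss for a single direction and quantile level (function name and interface are illustrative, not from the paper):

```python
import numpy as np

def pinball_loss(y_proj, q_pred, tau):
    """Average check loss between projected responses y_proj = <u, y_i> and a
    predicted tau-quantile q_pred; under-predictions are weighted by tau and
    over-predictions by (1 - tau)."""
    r = np.asarray(y_proj) - q_pred
    return np.mean(np.maximum(tau * r, (tau - 1.0) * r))
```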
TQF adopts the QRF++ backbone, an extension of Quantile Regression Forests, embedding $(x, u)$ as the input and using output targets comprising multiple quantiles and random Fourier features of $\langle u, Y \rangle$. In training, several independent random projection directions are sampled for each data pair, and orthogonal rotations further augment the feature set, producing $A$ augmented records per sample. This supports tree-based partitioning that is sensitive to both the input and the projection configuration.
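A schematic of this record augmentation might look as follows. This is a sketch under assumed interfaces: the function name, the Fourier encoding, and the omission of the rotation augmentation are all illustrative choices, not the paper's exact pipeline.

```python
import numpy as np

def augment_records(X, Y, n_dirs=10, n_freqs=4, rng=None):
    """For each sample (x_i, y_i), draw random unit directions u and emit one
    record per direction: input (x_i, u), targets consisting of the projection
    <u, y_i> and its random Fourier features. The paper's orthogonal-rotation
    augmentation is omitted here for brevity."""
    rng = rng or np.random.default_rng(0)
    d = Y.shape[1]
    omegas = rng.standard_normal(n_freqs)              # random frequencies
    inputs, targets = [], []
    for x, y in zip(X, Y):
        U = rng.standard_normal((n_dirs, d))
        U /= np.linalg.norm(U, axis=1, keepdims=True)  # directions on S^{d-1}
        for u in U:
            p = float(u @ y)                           # 1-D projection of y
            rff = np.concatenate([np.cos(omegas * p), np.sin(omegas * p)])
            inputs.append(np.concatenate([x, u]))
            targets.append(np.concatenate([[p], rff]))
    return np.array(inputs), np.array(targets)
```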
Model symmetrization ensures the quantile function satisfies

$$q_\tau(x, u) = -\, q_{1-\tau}(x, -u),$$

by defining the symmetrized model

$$\hat q_\tau(x, u) = \tfrac{1}{2}\left( \tilde q_\tau(x, u) - \tilde q_{1-\tau}(x, -u) \right),$$

where $\tilde q$ is the raw quantile prediction forest.
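In code, the symmetrization is a thin wrapper around the raw forest (interface hypothetical):

```python
def symmetrized_quantile(raw_predict, x, u, tau):
    """Combine the raw prediction with its reflected counterpart so that
    q_tau(x, u) = -q_{1-tau}(x, -u) holds by construction; u is a NumPy
    array so that -u is the reflected direction."""
    return 0.5 * (raw_predict(x, u, tau) - raw_predict(x, -u, 1.0 - tau))
```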
3. Multivariate Distribution Reconstruction via Sliced Wasserstein Matching
After the forest is trained, for a test covariate $x^*$, quantiles are predicted for each of $K$ randomly chosen directions $u_1, \dots, u_K$ at $M$ quantile levels $\tau_1, \dots, \tau_M$:

$$Q_{k,m} = \hat q_{\tau_m}(x^*, u_k), \qquad k = 1, \dots, K, \; m = 1, \dots, M.$$
The reconstruction task is then cast as finding a discrete empirical measure $\hat\mu = \sum_j w_j \, \delta_{z_j}$ approximating $P(Y \mid X = x^*)$ by minimizing the (discrete) sliced 1-Wasserstein loss:

$$L(\hat\mu, D_{\text{slice}}) = \frac{1}{KM} \sum_{k=1}^{K} \sum_{m=1}^{M} \left| \hat Q_{k,m}(\hat\mu) - Q_{k,m} \right|,$$

where $\hat Q_{k,m}(\hat\mu)$ denotes the $\tau_m$-quantile of the projection $\langle u_k, \cdot \rangle$ under $\hat\mu$. The optimization alternates:
- Weight step: solve the convex optimization of the weights $\{w_j\}$ for fixed supports $\{z_j\}$;
- Support step: fit a KDE to the current weighted cloud and resample the supports;
- Ensemble merging: run $E$ parallel alternations, combine the resulting supports, and prune low-weight or redundant points via loss minimization.
The process is summarized in the following pseudocode:
```
Input:
  D_slice = { (u_k, tau_m, Q_{k,m}) } for k=1..K, m=1..M
  N0 = initial support size, N1 = regular support size, E = ensemble size
Output:
  Weighted point cloud Z_merged
1. Initialize supports z_j (j = 1..N0)
2. Set uniform weights w_j <- 1/N0
3. Optimize {w_j} to minimize L({(w_j, z_j)}, D_slice)
4. Repeat until convergence:
   a. Fit KDE to current cloud Z
   b. Sample N1 new supports {z_j} from the KDE
   c. Optimize weights {w_j} on the new support
   d. Update loss; check for decrease
5. For e = 1..E in parallel:
   a. Sample N1 supports from the final KDE -> {z_j^(e)}
   b. Optimize weights {w_j^(e)} on {z_j^(e)}
   c. Collect all (w_j^(e), z_j^(e)) into Z_*
6. Prune Z_*:
   For each candidate size l, keep the top-l points by weight, renormalize, and compute L_l
   Choose l* minimizing L_l
   Return the top l* points as Z_merged
```
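The following Python sketch mirrors steps 1–4 under explicit assumptions: the weight step is posed as a linear program that matches the weighted CDF of each projection to $\tau_m$ at the predicted quantile $Q_{k,m}$ (an equivalent convex surrogate of the discrete sliced-$W_1$ loss at fixed supports), the support step uses a weighted Gaussian KDE, and the convergence check, ensemble merging, and pruning of steps 5–6 are omitted. All function names are illustrative, not the paper's implementation.

```python
# Illustrative QMEM-style reconstruction (steps 1-4 of the pseudocode).
import numpy as np
from scipy.optimize import linprog
from scipy.stats import gaussian_kde

def weight_step(Z, dirs, taus, Q):
    """Convex weight step: find simplex weights w so that the weighted CDF of
    each projection <u_k, z> is close to tau_m at the predicted quantile
    Q[k, m]. Solved as an LP: min sum(s) s.t. |A w - tau| <= s, w in simplex."""
    n, (K, M) = len(Z), Q.shape
    A = np.zeros((K * M, n))
    t = np.zeros(K * M)
    for k, u in enumerate(dirs):
        proj = Z @ u
        for m, tau in enumerate(taus):
            A[k * M + m] = (proj <= Q[k, m]).astype(float)
            t[k * M + m] = tau
    c = np.concatenate([np.zeros(n), np.ones(K * M)])  # minimize sum of slacks
    I = np.eye(K * M)
    A_ub = np.block([[A, -I], [-A, -I]])
    b_ub = np.concatenate([t, -t])
    A_eq = np.concatenate([np.ones(n), np.zeros(K * M)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (n + K * M))
    return res.x[:n]

def qmem_reconstruct(dirs, taus, Q, d, n0=100, n1=100, iters=5, seed=0):
    """Alternate the convex weight step with KDE-based support resampling."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n0, d))            # step 1: initial supports
    w = weight_step(Z, dirs, taus, Q)           # steps 2-3
    for _ in range(iters):                      # step 4 (fixed iterations)
        w_s = 0.95 * w + 0.05 / len(w)          # slight smoothing keeps KDE well-posed
        kde = gaussian_kde(Z.T, weights=w_s)    # 4a: fit weighted KDE
        Z = kde.resample(n1, seed=rng).T        # 4b: resample supports
        w = weight_step(Z, dirs, taus, Q)       # 4c: re-optimize weights
    return w, Z
```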
4. Computational Complexity and Algorithmic Characteristics
The main training phase requires tree induction for $T$ trees across $N$ samples (augmented to $AN$ records), yielding complexity on the order of $O(T \cdot AN \log(AN))$ per coordinate due to ensemble-based splitting. Quantile predictions require roughly $O(T \log(AN))$ operations per direction, i.e. $O(K T \log(AN))$ for $K$ query directions.
QMEM reconstruction involves a convex optimization over the $N_1$ support weights, with per-solve cost polynomial in $N_1$ and the number of quantile constraints $KM$, repeated over alternations and ensembles. With practical settings of $N_1$, $K$, $M$, and $E$, this results in manageable per-query cost.
Typically, a few dozen directions ($K \approx 20$–$50$) suffice for low-dimensional responses; in higher dimensions, quasi-Monte Carlo sampling or spherical designs may be employed for improved projection coverage.
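For the plain Monte Carlo baseline, directions can be drawn exactly uniformly on the sphere by normalizing Gaussian vectors (a standard construction; the helper below is illustrative, and quasi-Monte Carlo points or a spherical design would only change how U is generated):

```python
import numpy as np

def sample_directions(K, d, rng=None):
    """Draw K uniform directions on S^{d-1} by normalizing i.i.d. standard
    Gaussian vectors (rotation invariance of the Gaussian gives uniformity)."""
    rng = rng or np.random.default_rng(0)
    U = rng.standard_normal((K, d))
    return U / np.linalg.norm(U, axis=1, keepdims=True)
```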
5. Theoretical Properties and Methodological Comparisons
TQF’s QMEM stage produces an empirical measure minimizing the sliced 1-Wasserstein distance to the model-predicted projected quantiles. For fixed supports, the sliced Wasserstein loss is convex with respect to the weights, yielding global convergence and stability under gradient-based optimization.
Under standard forest honesty conditions and sufficient data, with the number of directions $K$ and quantile levels $M$ growing appropriately, the estimated directional quantiles are consistent, and the QMEM reconstruction converges in sliced Wasserstein distance.
Classical Directional Quantile Regression (DQR) fits separate (typically linear) models for each direction, intersecting quantile-defining halfspaces to obtain only convex central regions, which cannot represent nonconvex or multimodal conditional supports. TQF overcomes these limitations by modeling all directions simultaneously via a nonparametric forest, imposing no convexity or unimodality restriction, and capturing arbitrary support shapes (e.g., two moons, annuli, regions with holes). TQF thus generalizes DQR approaches by enabling efficient joint estimation and reconstruction without restrictive assumptions.
6. Practical Aspects and Implementation Guidance
Quantile regions at level $\tau$ and direction $u$ are obtained as the halfspace $\{\, y : \langle u, y \rangle \ge q_\tau(x, u) \,\}$. Central conditional regions emerge as intersections of these halfspaces over multiple random directions.
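A minimal membership test for such an intersection, assuming the per-direction quantiles at the query point have already been predicted (interface hypothetical):

```python
import numpy as np

def in_central_region(y, dirs, q_low):
    """True iff y lies in the intersection of directional halfspaces
    {y : <u_k, y> >= q_tau(x, u_k)}; q_low[k] is the predicted tau-quantile
    for direction dirs[k] at the query covariate."""
    return all(float(u @ y) >= q for u, q in zip(dirs, q_low))
```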
To sample from the reconstructed conditional law $\hat P(Y \mid X = x^*)$, one simply draws a support point $z_j$ with probability $w_j$ from the final QMEM point cloud, optionally adding Gaussian noise for smoothness.
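Sampling is then a weighted categorical draw over the support points, e.g. (sketch; names illustrative):

```python
import numpy as np

def sample_reconstructed(w, Z, n_samples, noise=0.0, seed=0):
    """Draw from the reconstructed conditional law: pick support points z_j
    with probability w_j (w assumed normalized); optional Gaussian jitter
    smooths the discrete cloud."""
    rng = np.random.default_rng(seed)
    samples = Z[rng.choice(len(Z), size=n_samples, p=w)]
    if noise > 0.0:
        samples = samples + noise * rng.standard_normal(samples.shape)
    return samples
```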
Essential hyperparameters include:
- $T$ (trees in forest): 50–200
- $F$ (number of Fourier frequencies): 0–10
- $A$ (sample/feature augmentations): 5–20
- $M$, $K$ (quantile and projection discretizations): 20–50
- $N_0$, $N_1$ (support sizes for QMEM): problem-dependent
- $E$ (ensemble runs in QMEM): 10–20
Larger values of $A$ improve directional dependence capture but increase training cost. $M$ controls the granularity of quantile reconstruction, with higher values resolving finer distributional features. $K$ sets a tradeoff between reconstruction fidelity and speed. $N_1$ should balance expressivity for multimodal supports against the computational burden of the convex optimization of weights; a consolidated example follows.
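A hypothetical configuration object using these ranges (key names are illustrative; the source gives no ranges for $N_0$ and $N_1$, so they are left unset):

```python
# Hypothetical TQF configuration reflecting the recommended ranges above.
tqf_config = {
    "n_trees": 100,     # T: trees in the forest (50-200)
    "n_freqs": 5,       # F: random Fourier frequencies (0-10)
    "n_augment": 10,    # A: per-sample augmentations (5-20)
    "n_levels": 30,     # M: quantile levels (20-50)
    "n_dirs": 30,       # K: projection directions (20-50)
    "n_ensemble": 15,   # E: parallel QMEM runs (10-20)
    "n0": None,         # N0: initial QMEM support size (problem-dependent)
    "n1": None,         # N1: regular QMEM support size (problem-dependent)
}
```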
TQF presents a framework for nonparametric, parallelizable, and distribution-free multivariate conditional uncertainty estimation, suited primarily to tabular data settings and providing an unrestricted, data-adaptive characterization of predictive uncertainty (Kanazawa, 18 Dec 2025).